History of PEMS4HGT (No. 1)


Bioinformatics Laboratory

  • Application example of PEMS: Detection of HGT candidates

How to use PEMS for detecting HGT candidates in microbial genomes and predicting their origin by using large-scale BLSOM results.

Using our software PEMS, you can detect horizontal gene transfer (HGT) candidates in your own genome sequence.
PEMS requires Microsoft Windows 7 or later and the Microsoft .NET Framework 4.0 or later runtime environment.
Please check the README for the required PC performance and environment.

Please download PEMS for HGT here (PEMS4HGT.zip).

The analysis procedure is as follows.

1. Segment your own genome sequence using a 5-kb window with a 1-kb step.
You can fragment the genome sequence using software such as EMBOSS splitter (e.g., http://bioinfo.nhri.org.tw/cgi-bin/emboss/splitter) (Fig. 1).

	Figure 1. Input form of EMBOSS splitter

For example, to obtain segments from a 5-kb window with a 1-kb step, enter the following options and then run the program. The overlap is the window size minus the step (5000 - 1000 = 4000).

5000 in “Size to split at” in Fig. 1.
4000 in “Overlap between split sequences” in Fig. 1.

After execution, save the output to a file.
Set the file extension to “.fa”, “.fna”, or “.fas”.
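
If EMBOSS splitter is not available, the same segmentation can be approximated with a short script. The following is a minimal sketch in Python, not part of PEMS or EMBOSS: the function names, the `name_start-end` header style, and the 60-column line width are assumptions made for illustration, and the handling of the final (shorter) segment may differ from what splitter produces.

```python
WINDOW = 5000  # corresponds to "Size to split at"
STEP = 1000    # window minus overlap: 5000 - 4000


def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else "unnamed"), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def sliding_windows(seq, window=WINDOW, step=STEP):
    """Yield (start, end, fragment) using 1-based inclusive coordinates.

    The last fragment is shorter than `window` when the sequence length
    does not line up with the step.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    start = 0
    while start < len(seq):
        end = min(start + window, len(seq))
        yield start + 1, end, seq[start:end]
        if end == len(seq):
            break
        start += step


def segment_lines(lines, window=WINDOW, step=STEP, width=60):
    """Yield multi-FASTA output lines for every window of every input record."""
    for name, seq in read_fasta(lines):
        for start, end, fragment in sliding_windows(seq, window, step):
            yield f">{name}_{start}-{end}\n"
            for i in range(0, len(fragment), width):
                yield fragment[i:i + width] + "\n"
```

To use it, open the genome file for reading and an output file with one of the extensions above for writing, then pass the input handle to `segment_lines` and write the result with `writelines`. The file names are up to you.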

For reference, see the file created from the E. coli K-12 strain (accession number: U00096) in the “sampledata” folder.

2. Run PEMS using the prepared genome sequence segments, in FASTA format, as input data.

Only the basic procedure is described here. Please check the README for the detailed procedure.

1. Click “Multi Fasta” and select the prepared genome sequence segment file (“1” in Fig. 2).
2. Click “Threshold” to set the threshold value.
3. Change the threshold value from “40” to “0”.
4. Click “start”.

	At that time, create a folder for the output files and specify a prefix for the output file names.

3. Output files
The two main output files are described below.

  “PREFIX_Top.txt”: Taxonomic assignment (Kingdom/Phylum/Genus) of each sequence segment.
  “PREFIX_Hist.txt”: The number of segments assigned to each category at each of the Kingdom/Phylum/Genus levels.
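
The relationship between the two files can be pictured as a simple tally: per-segment assignments are counted per category at each rank. The sketch below is in Python and is only an illustration of that idea. The column layout of the PEMS output files is not described on this page, so the code works on hypothetical in-memory records instead of parsing the files. The field names and the example labels are made up for the demonstration and are not PEMS results.

```python
from collections import Counter

RANKS = ("kingdom", "phylum", "genus")


def tally_assignments(assignments):
    """Count, for each rank, how many segments fall into each category."""
    counts = {rank: Counter() for rank in RANKS}
    for record in assignments:
        for rank in RANKS:
            counts[rank][record[rank]] += 1
    return counts


# Hypothetical records, one per sequence segment (illustrative only).
example = [
    {"segment": "seg_1-5000", "kingdom": "Bacteria",
     "phylum": "Proteobacteria", "genus": "Escherichia"},
    {"segment": "seg_1001-6000", "kingdom": "Bacteria",
     "phylum": "Proteobacteria", "genus": "Escherichia"},
    {"segment": "seg_2001-7000", "kingdom": "Bacteria",
     "phylum": "Proteobacteria", "genus": "Salmonella"},
]

for rank, counter in tally_assignments(example).items():
    for category, n in counter.most_common():
        print(rank, category, n, sep="\t")
```

Segments whose category differs from the one that dominates the genome stand out in such a tally, which is why the counts are a convenient first look before inspecting individual segments.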

Detailed results for each taxonomic rank are output to "PREFIX_Alphaproteobacteria_All.txt" in the case of Alphaproteobacteria.
If you want to refine the results further, you can change the assignment criteria using these files.