PEMS4HGT history diff (No. 2)


[[Bioinformatics Laboratory]]
[[Bioinformatics Lab. English]]
[[About PEMS>PEMS_Soft_e]]

*Application example of PEMS: Detection of HGT candidates [#lebea126]

This page explains how to use PEMS to detect HGT (horizontal gene transfer) candidates in microbial genomes and to predict their origin using large-scale BLSOM results.


By applying our software PEMS, you can detect HGT candidates in your own genome sequence.

PEMS requires Microsoft Windows 7 or later and the Microsoft .NET Framework 4.0 or later runtime environment.
Please check the PEMS user guide for the PC performance and environment requirements.

-Please download PEMS for HGT here ([[PEMS4HGT.zip, ~4.6 GB>http://bioinfo.ie.niigata-u.ac.jp/PEMS4HGT.zip]]).
-[[PEMS user guide>http://bioinfo.ie.niigata-u.ac.jp/PEMS_UserGuide-e.pdf]]

**Analysis procedure [#zc6c62e7]

**1. Segmentation of your genome sequence with a 5-kb window and a 1-kb step [#k2c266da]
You can fragment the genome sequence using software such as EMBOSS splitter (e.g., http://bioinfo.nhri.org.tw/cgi-bin/emboss/splitter) (Fig. 1).

 
#ref(Fig1.PNG,left,nowrap,Fig1,70%) 
Figure 1. Input form of EMBOSS splitter

For example, to obtain segments with a 5-kb window and a 1-kb step, enter the following options and run the program. The overlap is the window size minus the step size (5000 - 1000 = 4000).

--Enter 5000 in “Size to split at” (Fig. 1).
--Enter 4000 in “Overlap between split sequences” (Fig. 1).
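As a rough illustration of what a 5-kb window with a 1-kb step means, the following Python sketch computes the window coordinates and cuts a sequence accordingly. It is not part of PEMS or EMBOSS; the function names `window_coordinates` and `segment_sequence` are hypothetical, and dropping a trailing window shorter than 5 kb is an assumption made here, so the output may differ from what EMBOSS splitter produces at the end of the sequence.

```python
def window_coordinates(seq_len, window=5000, step=1000):
    """Return 0-based, end-exclusive (start, end) pairs for each full window.

    Assumption: a trailing window shorter than `window` is dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    coords = []
    start = 0
    while start + window <= seq_len:
        coords.append((start, start + window))
        start += step
    return coords


def segment_sequence(seq, window=5000, step=1000):
    """Cut `seq` into overlapping segments of length `window`."""
    return [seq[s:e] for s, e in window_coordinates(len(seq), window, step)]


# The overlap between consecutive windows is window - step.
overlap = 5000 - 1000
```

For an 8-kb sequence, `window_coordinates(8000)` yields four windows starting at positions 0, 1000, 2000 and 3000, each overlapping the next by 4000 bases.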

After execution, save the output results to a file. 
When saving, please set the file extension to “.fa”, “.fna” or “.fas”.
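If you prepare the segments with your own script instead of saving the splitter output by hand, a minimal sketch for writing them as a multi-FASTA file could look like the following. The function name `write_multi_fasta`, the record names, and the 60-column line width are assumptions for illustration only; the extension check accepts exactly the three extensions listed above.

```python
from pathlib import Path

# Extensions listed on this page as accepted input file extensions.
ALLOWED_EXTENSIONS = {".fa", ".fna", ".fas"}


def write_multi_fasta(records, path, line_width=60):
    """Write (name, sequence) pairs to `path` in multi-FASTA format.

    Raises ValueError if the file extension is not one of the listed ones.
    """
    path = Path(path)
    if path.suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {path.suffix!r}")
    with path.open("w", encoding="ascii", newline="\n") as handle:
        for name, seq in records:
            handle.write(f">{name}\n")
            # Wrap the sequence at a fixed line width.
            for i in range(0, len(seq), line_width):
                handle.write(seq[i:i + line_width] + "\n")
    return path
```

For example, `write_multi_fasta([("segment_1", "ACGT" * 20)], "my_genome.fna")` writes one header line followed by the wrapped sequence; the record name and file name here are placeholders.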

For an example, please refer to the file created from the E. coli K-12 strain (accession number: U00096) in the “sampledata” folder.

**2. Run PEMS using the prepared genome sequence segments in FASTA format as input data [#za1bcd80]

Only the basic execution method is described here. Please check the PEMS user guide for the detailed execution method.

#ref(Fig2.PNG,left,nowrap,Fig2,70%) 
Figure 2. PEMS input screen


+Click “Multi Fasta” and select the prepared genome sequence segment file (“1” in Fig. 2).
+Click “Threshold” to set the threshold value.
+Change the threshold value from “40” to “0”.
This threshold is the percentage relative to the most abundant taxonomic rank when microbial genomic segments are mapped into taxonomic territories.
+Click “start”.
At that point, create a folder to save the output files and specify a prefix for the output file names.

**3. Output files [#i608efb4]
The two main output files are described below.
+[PREFIX_Top.txt]: Taxonomic assignment results at the Kingdom/Phylum/Genus level for each sequence segment.
+[PREFIX_Hist.txt]: The number of segments assigned to each category at the Kingdom/Phylum/Genus level.
+Detailed results for each taxon are written to a separate file, for example "PREFIX_Alphaproteobacteria_All.txt" in the case of Alphaproteobacteria. If you want to refine the results further, you can change the assignment criteria using these files.

Please see the README for details of the output file.
*References [#t2e362a1]
+Takashi Abe, Shigehiko Kanaya, Makoto Kinouchi, Yuta Ichiba, Tokio Kozuki and Toshimichi Ikemura. Informatics for unveiling hidden genome signatures. Genome Research, 13, 693-702, 2003.
+Takashi Abe, Hideaki Sugawara, Makoto Kinouchi, Shigehiko Kanaya and Toshimichi Ikemura. Novel Phylogenetic Studies of Genomic Sequence Fragments Derived from Uncultured Microbe Mixtures in Environmental and Clinical Samples. DNA Research, 12, 281-290, 2005.
+Takashi Abe, Shigehiko Kanaya, Hiroshi Uehara and Toshimichi Ikemura. A novel bioinformatics strategy for function prediction of poorly-characterized protein genes obtained from metagenome analyses. DNA Research, 16, 287-298, 2009. 
+Hiroshi Uehara, Yuki Iwasaki, Chieko Wada, Kennosuke Wada, Toshimichi Ikemura and Takashi Abe. A novel bioinformatics strategy for searching industrially useful genome resources from metagenomic sequence libraries. Genes & Genetic Systems, 86, 53-66, 2011.
+Ryo Nakao, Takashi Abe, Ard M. Nijhof, Seigo Yamamoto, Frans Jongejan, Toshimichi Ikemura, Chihiro Sugimoto. A novel approach, based on BLSOMs (Batch Learning Self-Organizing Maps), to the microbiome analysis of ticks. ISME Journal, 7, 1003-1015, 2013.
#br
#counter