Proteinortho output: help required
4.0 years ago
majeedaasim ▴ 60

I installed and ran Proteinortho successfully. Next, I used grab_proteins.pl to extract the sequences of the identified orthologous groups. On the test data provided with the software I get 32 FASTA files. Does that mean the software detected 32 orthologous groups? I also need to find the sequences common to all species in order to construct a phylogenetic tree. Which tool can I use to find orthologs shared among the different species?

orthology proteinortho • 2.1k views
4.0 years ago
Jon ▴ 340

You need to parse the tab-delimited output file named .proteinortho or .poff (if you used synteny). Per the manual, the first column is the number of species in which the orthologous group is found. So let's say you had 4 species (i.e. you ran with 4 proteome FASTA files); you could then filter the tab-delimited .proteinortho or .poff output as follows:

#this will show orthologous groups containing proteins from all 4 species
#(the whole first field is compared, so values such as 40 do not match)
awk -F'\t' '$1 == 4' output.proteinortho


However, since it seems you are looking for single-copy orthologs, you could filter for those by also requiring the second column (the number of genes in the group) to equal 4:

#this will get groups containing all 4 species with exactly one gene per species (4 genes in total)
awk -F'\t' '$1 == 4 && $2 == 4' output.proteinortho
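
If you don't want to hard-code the number of species, the same filter can be generalized. The following is only a sketch: `single_copy_groups` is a made-up helper name, and it assumes the header line starts with `#` and has three leading statistics columns (species count, gene count, connectivity) followed by one column per input FASTA file. Check that against the header of your own output before relying on it.

```shell
# Hypothetical helper: print groups that contain every species exactly once.
# Assumption: the first line starting with '#' is the header, with three
# statistics columns followed by one column per species.
single_copy_groups() {
    awk -F'\t' '
        # Derive the species count from the first header line, then skip it.
        /^#/ { if (!nspecies) nspecies = NF - 3; next }
        # Keep rows where species count and gene count both equal nspecies.
        $1 == nspecies && $2 == nspecies
    ' "$1"
}
```

Calling `single_copy_groups output.proteinortho > single_copy.tsv` would write the matching rows (without the header line) to a new file. The rows keep their original column layout, so the protein IDs for each species are still in the species columns.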


I have the same problem: I cannot extract the orthologs that are present in all species with no duplications in any genome, which I need for the phylogeny. Do you know of a tool that can extract the sequences of these orthologs?