How to find single-copy gene families from InParanoid output
4 weeks ago

I have tried three different methods to find single-copy gene families. The first two were OrthoFinder and OrthoMCL; the third was InParanoid. I downloaded InParanoid 4.2 from https://bitbucket.org/sonnhammergroup/inparanoid4/src/master/ and ran the program successfully.

200 sequences in file EC
200 sequences in file SC
120 sequences EC have homologs in dataset SC
79 sequences SC have homologs in dataset EC
221 EC-EC matches
135 SC-SC matches
###################################
25 groups of orthologs
62 in-paralogs from EC
47 in-paralogs from SC
Grey zone 0 bits
Score cutoff 40 bits
In-paralogs with confidence less than 0.05 not shown
Sequence overlap cutoff 0.5
Group merging cutoff 0.5
Scoring matrix BLOSUM62
###################################
___________________________________________________________________________________
Group of orthologs #1. Best score 590 bits
Score difference with first non-orthologous sequence - EC:590   SC:435
LEU2_ECOLI              100.00%         LEU2_YEAST              100.00%
___________________________________________________________________________________
Group of orthologs #2. Best score 467 bits
Score difference with first non-orthologous sequence - EC:365   SC:396
___________________________________________________________________________________
Group of orthologs #3. Best score 463 bits
Score difference with first non-orthologous sequence - EC:463   SC:463
6PGD_ECOLI              100.00%         6PG1_YEAST              100.00%
6PG9_ECOLI              90.28%          6PG2_YEAST              81.59%
___________________________________________________________________________________
Group of orthologs #4. Best score 446 bits
Score difference with first non-orthologous sequence - EC:446   SC:13
FTSH_ECOLI              100.00%         YME1_YEAST              100.00%
___________________________________________________________________________________
Group of orthologs #5. Best score 411 bits
Score difference with first non-orthologous sequence - EC:145   SC:411
___________________________________________________________________________________
Group of orthologs #6. Best score 339 bits
Score difference with first non-orthologous sequence - EC:74   SC:65
LYSP_ECOLI              100.00%         CAN1_YEAST              100.00%
ALP1_YEAST              52.69%
LYP1_YEAST              48.95%
DIP5_YEAST              5.39%
___________________________________________________________________________________


Now I have the ortholog information, but I don't know how to find the single-copy gene families. I guess the first group of orthologs (LEU2_ECOLI and LEU2_YEAST) is a single-copy gene family, but I don't know whether that is right.

Could anybody please suggest how to find single-copy gene families from InParanoid output, especially in an analysis of many species?

Best wishes!

I'd just use OrthoFinder. It outputs a separate directory of single-copy orthologs if any such (ortho)groups exist in your data set.
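
That said, for a two-species InParanoid table like the one in the question, a rough check is possible: every group has one 100% seed ortholog per species, so a group with exactly two members has no in-paralogs on either side. Below is a minimal sketch of that check, not an official InParanoid utility. It assumes the text layout pasted in the question (gene ID followed by a percentage); the function names `parse_groups` and `single_copy_groups` are made up for illustration. Groups whose member lines are missing from the paste are skipped rather than guessed.

```python
import re

# Patterns assume the InParanoid text layout pasted in the question.
GROUP_RE = re.compile(r"^Group of orthologs #(\d+)")
MEMBER_RE = re.compile(r"(\S+)\s+(\d+(?:\.\d+)?)%")


def parse_groups(text):
    """Return {group_number: [gene_id, ...]} in order of appearance."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = GROUP_RE.match(line)
        if header:
            current = int(header.group(1))
            groups[current] = []
            continue
        # Skip the run summary, score lines and separator rules.
        if current is None or line.startswith(("Score difference", "_")):
            continue
        groups[current].extend(gene for gene, _pct in MEMBER_RE.findall(line))
    return groups


def single_copy_groups(groups):
    """Keep groups with exactly two members: one seed ortholog per species."""
    return {num: genes for num, genes in groups.items() if len(genes) == 2}


# Excerpt of the output from the question (groups #1, #3, #4 and #6).
SAMPLE = """\
Group of orthologs #1. Best score 590 bits
Score difference with first non-orthologous sequence - EC:590   SC:435
LEU2_ECOLI              100.00%         LEU2_YEAST              100.00%
________________________________________
Group of orthologs #3. Best score 463 bits
Score difference with first non-orthologous sequence - EC:463   SC:463
6PGD_ECOLI              100.00%         6PG1_YEAST              100.00%
6PG9_ECOLI              90.28%          6PG2_YEAST              81.59%
________________________________________
Group of orthologs #4. Best score 446 bits
Score difference with first non-orthologous sequence - EC:446   SC:13
FTSH_ECOLI              100.00%         YME1_YEAST              100.00%
________________________________________
Group of orthologs #6. Best score 339 bits
Score difference with first non-orthologous sequence - EC:74   SC:65
LYSP_ECOLI              100.00%         CAN1_YEAST              100.00%
ALP1_YEAST              52.69%
LYP1_YEAST              48.95%
DIP5_YEAST              5.39%
"""

if __name__ == "__main__":
    kept = single_copy_groups(parse_groups(SAMPLE))
    for num, genes in sorted(kept.items()):
        print(num, *genes)
```

On this excerpt the sketch keeps groups #1 and #4 and drops #3 and #6, which agrees with the guess about LEU2_ECOLI and LEU2_YEAST. InParanoid compares two proteomes at a time, so with many species the same check would have to be run per pair and the pairwise results combined, which is not shown here.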

I also used OrthoFinder, but I'm worried that relying on OrthoFinder alone may give inaccurate results. So I want to find the consensus list across several tools (OrthoFinder, OrthoMCL, TreeFam, InParanoid ...).

In fact, I wanted to use the TreeFam method at first. However, I didn't know the TreeFam pipeline at all, so I chose InParanoid, which looks easier.
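
The consensus list described above could be sketched as a set intersection. This is only an illustration under stated assumptions: each tool's single-copy families have already been converted to lists of gene IDs using the same identifiers, and a family counts as shared only when every tool reports exactly the same members. The function name `consensus_families` and the gene IDs in the example are hypothetical toy data, not results from any of these tools.

```python
def consensus_families(results_by_tool):
    """Return the families reported with identical membership by every tool.

    results_by_tool maps a tool name to an iterable of families,
    where each family is an iterable of gene IDs.
    """
    family_sets = [
        {frozenset(family) for family in families}
        for families in results_by_tool.values()
    ]
    if not family_sets:
        return set()
    # A family survives only if every tool produced the same member set.
    return set.intersection(*family_sets)


if __name__ == "__main__":
    # Toy input with made-up gene IDs, purely to show the call.
    toy_results = {
        "OrthoFinder": [["a1", "b1"], ["a2", "b2"]],
        "OrthoMCL": [["b1", "a1"], ["a3", "b3"]],
        "InParanoid": [["a1", "b1"], ["a2", "b2"]],
    }
    for family in consensus_families(toy_results):
        print(sorted(family))
```

Exact-match intersection is the strictest possible definition of consensus; a family that differs by a single gene between tools is dropped entirely, so the resulting list can be much shorter than any single tool's output.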

I don't think you need to pool the results of multiple methods here. If you're using the latest version of OrthoFinder you're getting the best results from among any of the tools you've mentioned. The only thing you'll need to worry about is making sure that you're providing OrthoFinder the right species tree (if the trees it has estimated on its own appear to be incorrect).