Question: Find orthologs using EggNOG and local HMMER search for multiple transcripts
4.3 years ago by
Farbod3.3k
Toronto
Farbod3.3k wrote:

Dear friends, hi (I'm not a native English speaker!)

I want to search for orthologs locally using EggNOG in my fish transcriptome samples.

First, I downloaded fiNOG, the fish-specific set of HMMs from EggNOG,

then concatenated the profiles into a single file (i.e. cat fiNOG_hmm/*.hmm > fishDB.hmmer) and ran hmmpress fishDB.hmmer to build the HMMER database.

Next, I intend to run a command like this: hmmscan --cpu 24 '/home/fiNOG_hmm/fishDB.hmmer' '/home/Transcriptome.fasta'
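The three steps described above (concatenate the profiles, press the database, scan the sequences) could be scripted roughly as follows. This is only a sketch: the function names, the `hits.tbl` output name, and the use of `-f` and `--tblout` are choices made for illustration, and the query file is assumed to already contain protein sequences.

```python
import glob
import shutil
import subprocess


def concatenate_hmms(hmm_dir, db_path):
    """Equivalent of `cat hmm_dir/*.hmm > db_path`; returns the number of files merged."""
    files = sorted(glob.glob(f"{hmm_dir}/*.hmm"))
    with open(db_path, "wb") as out:
        for path in files:
            with open(path, "rb") as handle:
                shutil.copyfileobj(handle, out)
    return len(files)


def build_commands(db_path, protein_fasta, cpus=24, tblout="hits.tbl"):
    """Return the hmmpress and hmmscan command lines as argument lists."""
    # -f lets hmmpress overwrite index files left over from an earlier run.
    press = ["hmmpress", "-f", db_path]
    # --tblout writes a per-target table that is easier to parse than stdout.
    scan = ["hmmscan", "--cpu", str(cpus), "--tblout", tblout, db_path, protein_fasta]
    return press, scan


def run_pipeline(hmm_dir, db_path, protein_fasta, cpus=24):
    """Run all three steps; requires HMMER to be installed and on PATH."""
    concatenate_hmms(hmm_dir, db_path)
    for command in build_commands(db_path, protein_fasta, cpus=cpus):
        subprocess.run(command, check=True)
```

A call such as `run_pipeline("/home/fiNOG_hmm", "/home/fiNOG_hmm/fishDB.hmmer", "proteins.fasta")` would mirror the commands in the post, with `proteins.fasta` standing in for a translated version of the transcriptome.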

My question: can I use my transcriptome assembly .fasta file directly in this command, or must I translate it into protein sequences first? If so, which tool is better for this job (TransDecoder?)?

Thank you

hmmer gene rna-seq eggnog orthologs • 2.9k views
ADD COMMENTlink modified 4.3 years ago by jhc2.9k • written 4.3 years ago by Farbod3.3k
4.3 years ago by
jhc2.9k
Spain
jhc2.9k wrote:

There is now a better resource for functional annotation using eggNOG orthology. Check these links:

Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper

ADD COMMENTlink written 4.3 years ago by jhc2.9k

Dear jhc, Hi and thank you for your guidance.

It seems that the online tool (http://eggnog-mapper.embl.de) is not suitable for nucleotide sequences, is that right?

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by Farbod3.3k

I just added this option. Nucleotide sequences should now be accepted and automatically translated.

ADD REPLYlink written 4.3 years ago by jhc2.9k

Thank you, Jaime,

Yesterday I used the online tool, fed it some transcript sequences, and selected fiNOG, but unfortunately the result file was empty. I guess the online tool did not convert my nucleotide sequences to protein.

ADD REPLYlink written 4.3 years ago by Farbod3.3k

No, that option was not enabled yesterday. But even now, upload protein sequences if possible. Translation is automatic, and if the CDS sequence is not correctly formatted you will miss annotations.

ADD REPLYlink written 4.3 years ago by jhc2.9k

Hi again,

This time I used about 1353 fish transcripts and the annotation result contains 144 entries (so I think it works, but I had hoped for more than 144 annotations!).

So is your suggestion that I use the command line with the --transcript option?

This is the head of my results:

TRINITY_DN246193_c0_g1_i1   7955.ENSDARP00000098851 1.2e-27 85.1    FAM59B          fiNOG[1]    0A18E@biNOG,0DR6J@chorNOG,0II0U@euNOG,0N2VQ@fiNOG,0VC44@meNOG,0XSPW@NOG,12VKE@opiNOG,1CU2B@veNOG    0N2VQ|1.2e-26|93.4  S   family with sequence similarity 59, member B
TRINITY_DN96276_c0_g3_i5    8083.ENSXMAP00000019246 1.2e-177    582.0   ERCC5       map03420    fiNOG[1]    09TAU@biNOG,0DEN5@chorNOG,0N0I4@fiNOG,0V49Z@meNOG,12N9G@opiNOG,1CPC1@veNOG,COG0258@NOG,KOG2520@euNOG    0N0I4|1.3e-226|755.8    L   Excision repair cross-complementing rodent repair deficiency, complementation group 5
TRINITY_DN97247_c0_g1_i10   7955.ENSDARP00000071650 1e-57   184.2   PREB        map04141    fiNOG[1]    09USQ@biNOG,0DG2A@chorNOG,0MX1Q@fiNOG,0V5S6@meNOG,0XRQK@NOG,12SXV@opiNOG,1CQRT@veNOG,KOG0771@euNOG  0MX1Q|1.7e-59|202.2 U   prolactin regulatory element binding
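For counting or filtering rows like the ones above, a small parser can help. The sketch below assumes that the first four whitespace-separated columns are the query name, the seed ortholog, its e-value, and its score; the remaining columns are ignored because empty fields make their positions ambiguous in this paste. The names `Hit`, `parse_hit`, and `filter_hits`, and the e-value cutoff, are illustrative only.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    seed_ortholog: str
    evalue: float
    score: float

    @property
    def taxid(self):
        # Seed orthologs look like "7955.ENSDARP00000098851": taxid, dot, protein ID.
        return self.seed_ortholog.split(".", 1)[0]


def parse_hit(line):
    """Parse the first four columns of one annotation row."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 columns, got {len(fields)}")
    return Hit(fields[0], fields[1], float(fields[2]), float(fields[3]))


def filter_hits(lines, max_evalue=1e-10):
    """Keep hits whose seed-ortholog e-value is at or below max_evalue."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        hit = parse_hit(line)
        if hit.evalue <= max_evalue:
            hits.append(hit)
    return hits
```

Applied to the first row above, `parse_hit` gives the query `TRINITY_DN246193_c0_g1_i1` and a seed ortholog from taxid 7955 (zebrafish).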
ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by Farbod3.3k

Most probably, many of your nucleotide seqs are not in the correct frame or are not the complete CDS. My suggestion is: translate the set of nucleotide sequences using a dedicated tool, review that the protein products make sense, and upload the protein set to eggnog mapper.
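To illustrate why frame matters, here is a minimal sketch of a six-frame translation that keeps the longest methionine-initiated stretch without a stop codon. It is not a substitute for a dedicated ORF predictor: it assumes the standard genetic code, does no scoring, and may return a fragment that lacks a stop codon when the transcript is incomplete. The function names and the `min_length` default are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(nucleotides, min_length=30):
    """Return the longest M-initiated, stop-free peptide over all six frames.

    Returns an empty string if nothing reaches min_length residues.
    """
    seq = nucleotides.upper().replace("U", "T")
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best if len(best) >= min_length else ""
```

A transcript whose coding region starts in the third frame, or sits on the reverse strand, still yields the same peptide here, whereas a naive translation from the first base would not.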

ADD REPLYlink written 4.3 years ago by jhc2.9k

OK, thank you for your help.

Do you have any suggestions for a good translation tool? Something that takes all my transcripts as input and gives the protein translations as output (similar to TransDecoder).

~ Best

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by Farbod3.3k

Hi,

Which methods would you use for finding orthologs of a particular protein?

Single protein (e.g. P01112) - find orthologs?

I wish to do this offline, with local databases - which resources are best to use here?

Thanks, U.

ADD REPLYlink written 2.8 years ago by urema0