Question

Find orthologs using EggNOG and local HMMER search for multiple transcripts

2

7.5 years ago

Farbod ★ 3.4k

Dear friends, hi (I'm not a native English speaker!)

I want to search for orthologs locally using EggNOG in my fish transcriptome samples.

First, I downloaded fiNOG, the fish-specific set, from EggNOG.

Then I built a HMMER database by concatenating the profiles (i.e. cat fiNOG_hmm/*.hmm > fishDB.hmmer) and running hmmpress fishDB.hmmer.

Next, I intend to run a command like this: hmmscan --cpu 24 '/home/fiNOG_hmm/fishDB.hmmer' '/home/Transcriptome.fasta'

My question: can I use my transcriptome assembly .fasta file directly in this command, or must I translate it into protein first? If so, which tool is best for the job (TransDecoder?)?

Thank you

RNA-Seq EggNOG HMMER orthologs gene • 4.9k views

ADD COMMENT • link updated 7.5 years ago by jhc ★ 3.0k • written 7.5 years ago by Farbod ★ 3.4k

score 3 · Accepted Answer · 2016-10-17

3

7.5 years ago

jhc ★ 3.0k

There is now a better resource for functional annotation using eggNOG orthology. Check these links:

Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper

http://eggnog-mapper.embl.de (online tool)
http://biorxiv.org/content/early/2016/09/22/076331 (method description and benchmark)
https://github.com/jhcepas/eggnog-mapper (eggnog-mapper tool. Use the option --translate if using nucleotide seqs)

ADD COMMENT • link 7.5 years ago by jhc ★ 3.0k

0

Dear jhc, Hi and thank you for your guidance.

It seems that the online tool at http://eggnog-mapper.embl.de does not accept nucleotide sequences, is that right?

ADD REPLY • link 7.5 years ago by Farbod ★ 3.4k

1

I just added this option. Nucleotide sequences should now be accepted and automatically translated.

ADD REPLY • link 7.5 years ago by jhc ★ 3.0k

0

Thank you, Jaime.

Yesterday I used the online tool, fed it some transcript sequences, and selected fiNOG, but unfortunately the result file was empty. I guess the online tool did not translate my nucleotide sequences to protein.

ADD REPLY • link 7.5 years ago by Farbod ★ 3.4k

1

No, that option was not enabled yesterday. But even now, upload protein sequences if possible. Translation is automatic, and if a CDS sequence is not correctly formatted you will miss annotations.

ADD REPLY • link 7.5 years ago by jhc ★ 3.0k

0

Hi again,

I have now submitted about 1353 fish transcripts, and this time the annotation result contains 144 entries (so I think it works, but I had hoped for more than 144 annotations!).

So is your suggestion that I use the command-line tool with the --translate option?

This is the head of my results:

TRINITY_DN246193_c0_g1_i1   7955.ENSDARP00000098851 1.2e-27 85.1    FAM59B          fiNOG[1]    0A18E@biNOG,0DR6J@chorNOG,0II0U@euNOG,0N2VQ@fiNOG,0VC44@meNOG,0XSPW@NOG,12VKE@opiNOG,1CU2B@veNOG    0N2VQ|1.2e-26|93.4  S   family with sequence similarity 59, member B
TRINITY_DN96276_c0_g3_i5    8083.ENSXMAP00000019246 1.2e-177    582.0   ERCC5       map03420    fiNOG[1]    09TAU@biNOG,0DEN5@chorNOG,0N0I4@fiNOG,0V49Z@meNOG,12N9G@opiNOG,1CPC1@veNOG,COG0258@NOG,KOG2520@euNOG    0N0I4|1.3e-226|755.8    L   Excision repair cross-complementing rodent repair deficiency, complementation group 5
TRINITY_DN97247_c0_g1_i10   7955.ENSDARP00000071650 1e-57   184.2   PREB        map04141    fiNOG[1]    09USQ@biNOG,0DG2A@chorNOG,0MX1Q@fiNOG,0V5S6@meNOG,0XRQK@NOG,12SXV@opiNOG,1CQRT@veNOG,KOG0771@euNOG  0MX1Q|1.7e-59|202.2 U   prolactin regulatory element binding
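
For reference, 144 annotated entries out of about 1353 submitted transcripts is roughly 10.6%. To see exactly which transcripts received no annotation, a minimal Python sketch along these lines could help. It assumes only what the head above shows, namely that the first column of each result line is the query ID, plus the assumption that any header or comment lines start with '#'. The function names are illustrative and not part of eggNOG-mapper.

```python
def fasta_ids(fasta_text):
    """Return sequence IDs: the first whitespace-delimited token of each '>' header."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.append(tokens[0])
    return ids


def annotated_ids(annotation_text):
    """Return the set of query IDs found in the first column of the result table.

    Blank lines and lines starting with '#' are skipped (assumed to be comments).
    """
    ids = set()
    for line in annotation_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        ids.add(line.split()[0])
    return ids


def annotation_summary(fasta_text, annotation_text):
    """Return (number annotated, number submitted, list of unannotated IDs)."""
    queries = fasta_ids(fasta_text)
    hits = annotated_ids(annotation_text)
    missing = [q for q in queries if q not in hits]
    return len(queries) - len(missing), len(queries), missing


# Usage (file names are placeholders):
#   with open("Transcriptome.fasta") as fa, open("results.annotations") as an:
#       n_hit, n_all, missing = annotation_summary(fa.read(), an.read())
```

The list of unannotated IDs is a reasonable starting point for checking whether those transcripts are partial or in the wrong frame.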

ADD REPLY • link 7.5 years ago by Farbod ★ 3.4k

1

Most probably, many of your nucleotide seqs are not in the correct frame or are not the complete CDS. My suggestion is: translate the set of nucleotide sequences using a dedicated tool, review that the protein products make sense, and upload the protein set to eggnog mapper.
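
To illustrate why the frame matters, here is a minimal Python sketch of a six-frame translation that keeps the longest stop-free stretch of each transcript. It is only an illustration under stated assumptions: it uses the standard genetic code, does not require a start codon (assembled transcripts are often partial), and the function names are made up for this example. It is not how eggNOG-mapper translates sequences, and it is not a substitute for a dedicated ORF predictor.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an upper-case DNA string (other symbols pass through)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons with unknown bases (e.g. N) become X."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf(nucleotides):
    """Return (frame, peptide) for the longest stop-free stretch in six frames.

    Frames are 1..3 on the given strand and -1..-3 on the reverse complement.
    On ties, the first frame examined wins.
    """
    seq = nucleotides.upper().replace("U", "T")
    best = (0, "")
    for strand, s in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            for segment in translate(s[offset:]).split("*"):
                if len(segment) > len(best[1]):
                    best = (strand * (offset + 1), segment)
    return best
```

A transcript whose best peptide comes from a frame other than 1, or from the reverse strand, is one that a naive translation from the first base would get wrong.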

ADD REPLY • link 7.5 years ago by jhc ★ 3.0k

0

OK, thank you for your help.

Do you have any suggestions for a good translation tool? I am looking for something that takes all my transcripts as input and gives the protein translations as output (something like TransDecoder).

Best

ADD REPLY • link 7.5 years ago by Farbod ★ 3.4k

0

Hi,

Which methods would you use to find orthologs of a particular protein?

For example, given a single protein (e.g. P01112), how would you find its orthologs?

I wish to do this offline, with local databases - which resources are best to use here?

Thanks, U.

ADD REPLY • link 6.0 years ago by urema • 0