Question

Mapping Gene Positions onto a Genome

5.1 years ago

Morgan S. ▴ 80

Hello,

I am having a hard time understanding how to turn my gene predictions and annotations into a visual. In R, using the circlize package, I created this figure for my genome.

Now I want to add gene predictions from eggNOG/COGs to my genome figure, as in these examples: Orsi et al., 2015

and Ku et al., 2013

The annotation output does not contain any gene start and end coordinates, so I cannot map the genes onto the genome. How are these authors doing it?

Thanks for your help!

genome annotation R • 1.3k views

ADD COMMENT • link 5.1 years ago by Morgan S. ▴ 80

Did you not do gene predictions as an independent step before trying COG analysis? In the paper you linked, the authors did:

The complete genome sequences were processed using RNAmmer (Lagesen et al. 2007), tRNAscan-SE (Lowe and Eddy 1997), and PRODIGAL (Hyatt et al. 2010) for gene prediction. The gene name and description for the protein-coding genes were assigned based on the orthologous genes identified by OrthoMCL

ADD REPLY • link 5.1 years ago by GenoMax 141k

I did, but my approach was different from that paper, which was based on metatranscriptomics. Mine is based on whole genomes. I used MAKER for gene predictions, then uploaded the protein FASTA file from the MAKER output to eggNOG-mapper.

ADD REPLY • link 5.1 years ago by Morgan S. ▴ 80

What kind of genome are you working with? Prokaryotic or eukaryotic?

ADD REPLY • link 5.1 years ago by GenoMax 141k

Here is what the output looks like

eggNOG

ADD REPLY • link 5.1 years ago by Morgan S. ▴ 80

Is the query file multi-fasta? Individual fasta? You still need to tell us if your genome is Pro- or Eukaryotic. For prokaryotic genomes you should use Prokka instead of MAKER.

ADD REPLY • link 5.1 years ago by GenoMax 141k

Sorry, I thought I replied, but it is eukaryotic, fungal to be exact. The query file is a multi-fasta file. Like this:

 >1371E_00011828-RA protein AED:0.07 eAED:0.07 QI:0|0|0|0.75|1|1|4|0|184
MLIYTDIVSGDEIVADTFNLVPNKDFDILWECDCRKYLKRSNEDFQLEGANPSAEDAEDD
GGEGEATMVHDIEDQFRLVWLKVEDGAKPSKENYKGHIKSYLKKLHKNASPKFAEATDPA
EAEKVWKTKAAGAMKKILANWDNYDVLMGQSMDGDAMHVLIDFREDGVTPYATVWADGLK
EIKV
 >1371E_00011814-RA protein AED:0.05 eAED:0.08 QI:0|0|0|1|1|1|4|0|339
MSASLPGSRDLPPSQYDLKTYWGRVRHAADISDPRTLFVSSTGLESAKSLIASYKQNRIP
GITPELWSAKKVVDATLHPDTGTPVFLPFRMSCYVLTNLVVTAGMLTPGLQTTGTLLWQI
GNQSLNVAVNNANANKSTPLSLSQIGKSYLMAVSAS....etc.

ADD REPLY • link updated 5.1 years ago by GenoMax 141k • written 5.1 years ago by Morgan S. ▴ 80
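
One possible way to recover coordinates, offered as a sketch and not as what the cited authors did: the protein FASTA sent to eggNOG-mapper carries only sequence IDs, but the gene prediction step (MAKER here) also writes a GFF3 file with the start and end of each feature. If the mRNA `ID` attribute in that GFF3 matches the FASTA header ID (for example `1371E_00011828-RA`), which is an assumption to check against your own files, the two can be joined on that ID. The Python below uses made-up coordinates, a hypothetical contig name `contig_1`, placeholder COG letters, and a simplified two-column annotation table; the column that actually holds the COG category in an eggNOG-mapper output depends on the version, so `cog_col` must be set by hand.

```python
import csv
import io


def parse_gff3_mrna(gff_text):
    """Return {mRNA ID: (seqid, start, end, strand)} from GFF3 text."""
    coords = {}
    for line in gff_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        if "ID" in attrs:
            coords[attrs["ID"]] = (cols[0], int(cols[3]), int(cols[4]), cols[6])
    return coords


def parse_annotations(tsv_text, id_col=0, cog_col=1):
    """Return {query ID: COG category} from a tab-separated table.

    id_col and cog_col are assumptions; adjust them to your annotation file.
    """
    cogs = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        cogs[row[id_col]] = row[cog_col]
    return cogs


def join_coords(coords, cogs):
    """Keep annotated queries that have coordinates; sort by position."""
    rows = []
    for query_id, cog in cogs.items():
        if query_id in coords:
            seqid, start, end, strand = coords[query_id]
            rows.append((seqid, start, end, query_id, cog, strand))
    return sorted(rows)


# Illustrative data only: coordinates and COG letters are invented.
GFF_EXAMPLE = (
    "##gff-version 3\n"
    "contig_1\tmaker\tgene\t1000\t2500\t.\t+\t.\tID=1371E_00011814\n"
    "contig_1\tmaker\tmRNA\t1000\t2500\t.\t+\t.\t"
    "ID=1371E_00011814-RA;Parent=1371E_00011814\n"
    "contig_1\tmaker\tmRNA\t4000\t4900\t.\t-\t.\t"
    "ID=1371E_00011828-RA;Parent=1371E_00011828\n"
)
ANN_EXAMPLE = (
    "#query\tCOG_category\n"
    "1371E_00011828-RA\tJ\n"
    "1371E_00011814-RA\tS\n"
    "1371E_00099999-RA\tK\n"
)

table = join_coords(parse_gff3_mrna(GFF_EXAMPLE), parse_annotations(ANN_EXAMPLE))
for row in table:
    print("\t".join(str(x) for x in row))
```

The resulting table starts with chromosome, start, and end columns, so it could be written to a file and read into R as a data frame for a circlize genomic track. Queries with no matching mRNA ID (the third annotation row here) are silently dropped, which is worth counting on real data to catch an ID mismatch.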