Question

How to generate gene-predictions from a published genome?

0

5.8 years ago

niklasjoshua.ebner • 0

Hi Biostars people,

Some information:

I am in the middle of a proteomics experiment. I have my .raw LC-MS/MS files, my software is set up, and everything is ready to go. The genome of the organism I am working with has not yet been sequenced, but the genome of a closely related species has.

My idea is to build a protein database from gene predictions on the published genome and search the MS/MS data against it (e.g. with Mascot). In addition, redundant entries (sequences sharing 90% similarity or more) would be removed and a common contaminants database would be added (https://maxquant.org/contaminants.zip). I could then compare the identified protein sequences against the NCBI nr database using the NCBI Basic Local Alignment Search Tool, e.g. via the R package Bio3d.
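To make the database-assembly step concrete, here is a minimal Python sketch of merging predicted proteins with a contaminants set. It rests on several assumptions that are not from the post: the function names are hypothetical, the `CON__` prefix is just one way to flag contaminant entries, and only exact duplicate sequences are dropped. Collapsing entries at 90% similarity needs a real clustering tool, which this sketch does not attempt.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_search_db(predicted: Iterable[Record],
                    contaminants: Iterable[Record],
                    prefix: str = "CON__") -> List[Record]:
    """Merge both sets, keeping the first entry for each exact sequence.

    Contaminants go first so that a predicted protein identical to a
    contaminant stays flagged as a contaminant. The prefix is an
    illustrative choice, not a requirement of any search engine.
    """
    tagged = [(h if h.startswith(prefix) else prefix + h, s)
              for h, s in contaminants]
    tagged.extend(predicted)

    seen = set()
    merged: List[Record] = []
    for header, seq in tagged:
        key = seq.upper().rstrip("*")  # ignore case and trailing stop symbol
        if key and key not in seen:
            seen.add(key)
            merged.append((header, key))
    return merged


def format_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Render records as FASTA text with wrapped sequence lines."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

In practice the two inputs would be read with `open(...)` from the predicted-protein FASTA and the unzipped contaminants FASTA, and the result of `format_fasta` written to the file the search engine is pointed at.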

My question (very broad, and hopefully very easy to answer): How do I generate a gene-prediction database from publicly available genomic data on NCBI?

Have mercy on me; I am new to most of this, but I have a rough grasp of the jargon and the concepts.

Thanks for your time

genome gene prediction proteomics • 1.3k views

ADD COMMENT • link 5.8 years ago by niklasjoshua.ebner • 0

1

First step is to predict the genes: https://en.wikipedia.org/wiki/List_of_gene_prediction_software

After that you would probably post a new question about the output. I used AUGUSTUS once a while ago, and it was easy to use. There is also an online version: http://bioinf.uni-greifswald.de/augustus/submission.php
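As a rough idea of what handling that output could look like, here is a small Python sketch that pulls protein sequences out of AUGUSTUS-style text output. It assumes the layout in which each gene is bracketed by `# start gene <id>` and `# end gene <id>` comment lines and the translation appears as `# protein sequence = [...]`, possibly continued over several `#` lines. Check this against your own output before relying on it. The function name is hypothetical, and the sketch keeps only one protein per gene ID.

```python
import re
from typing import Dict, Iterable

START_GENE = re.compile(r"#\s*start gene\s+(\S+)")
PROTEIN_START = re.compile(r"#\s*protein sequence\s*=\s*\[(.*)")


def augustus_proteins(lines: Iterable[str]) -> Dict[str, str]:
    """Map gene IDs to protein sequences found in AUGUSTUS-style comments.

    Assumption: one protein per gene. If a gene has several transcripts,
    later ones overwrite earlier ones in this sketch.
    """
    proteins: Dict[str, str] = {}
    gene_id = None
    buffer = None  # None while we are outside a protein block

    for raw in lines:
        line = raw.strip()

        match = START_GENE.match(line)
        if match:
            gene_id = match.group(1)
            continue

        if buffer is None:
            match = PROTEIN_START.match(line)
            if not match:
                continue  # ordinary GFF line or unrelated comment
            buffer = ""
            fragment = match.group(1)
        else:
            # Continuation line: drop the leading '#' and whitespace.
            fragment = line.lstrip("#").strip()

        if fragment.endswith("]"):
            buffer += fragment[:-1]
            key = gene_id or "protein_%d" % (len(proteins) + 1)
            proteins[key] = buffer
            buffer = None
        else:
            buffer += fragment

    return proteins
```

Called as `augustus_proteins(open("augustus_output.gff"))` (the file name is a placeholder), it returns a dictionary that can be written out as FASTA for the search engine.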

ADD REPLY • link 5.8 years ago by gb ★ 2.2k

0

Which genome? NCBI genomes have generally already been annotated with protein predictions.

ADD REPLY • link 5.8 years ago by h.mon 35k

0

This is the genome I am talking about: https://www.ncbi.nlm.nih.gov/genome/17773

ADD REPLY • link 5.8 years ago by niklasjoshua.ebner • 0

1

There is an old annotation here: https://i5k.nal.usda.gov/data/Arthropoda/limlun-%28Limnephilus_lunatus%29/Current%20Genome%20Assembly/

The NCBI genome, more recent and with more data, has not been annotated, though.

ADD REPLY • link 5.8 years ago by h.mon 35k