Question: How to generate gene-predictions from a published genome?
2.3 years ago by
Switzerland/Basel/Department of Environmental Sciences
niklasjoshua.ebner0 wrote:

Hi Biostars people,

Some information:

I am in the middle of a proteomics experiment. I got my .raw LC-MS/MS files, my software set-up and everything is ready to go. The genome of the organism I am experimenting with is not yet sequenced, but the genome of a closely-related species is.

My idea is to search the MS/MS data (e.g. with Mascot) against a database of gene predictions from the published genome. I would also remove redundant entries at 90% sequence similarity and add a common-contaminants database (https://maxquant.org/contaminants.zip). I could then compare the protein sequences against the NCBI nr database using the NCBI Basic Local Alignment Search Tool (BLAST), e.g. via the R package bio3d.
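To make the database-assembly step concrete, here is a minimal sketch in Python, assuming the predicted proteins and the contaminants are both available as FASTA. The function names and the `CON__` header prefix are illustrative assumptions, not part of any particular tool. It only removes exact duplicate sequences; clustering at 90% similarity needs a dedicated tool (CD-HIT is a common choice) and is not attempted here.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_search_db(predicted: Iterable[Record],
                    contaminants: Iterable[Record],
                    contaminant_prefix: str = "CON__") -> List[Record]:
    """Merge contaminants and predicted proteins, dropping exact duplicates.

    Contaminants are added first, so a predicted protein that is identical
    to a contaminant is dropped and the contaminant entry is kept.
    """
    seen = set()
    merged: List[Record] = []
    sources = ((contaminant_prefix, contaminants), ("", predicted))
    for prefix, records in sources:
        for header, seq in records:
            seq = seq.upper().rstrip("*")  # strip a trailing stop symbol
            if not seq or seq in seen:
                continue
            seen.add(seq)
            merged.append((prefix + header, seq))
    return merged


def format_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Render records as FASTA text with wrapped sequence lines."""
    out: List[str] = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

With real files, one would pass open file handles to `parse_fasta` and write the result of `format_fasta` to the file handed to the search engine.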

My question (although very broad and hopefully easy to answer): How do I generate a gene-prediction database from publicly available genomic data on NCBI?

Please bear with me: I am new to most of this, but I have a rough grasp of the jargon and the concepts.

Thanks for your time

written 2.3 years ago by niklasjoshua.ebner

The first step is to predict the genes. A list of tools is here: https://en.wikipedia.org/wiki/List_of_gene_prediction_software

After that, you would probably post a new question about the output. I used AUGUSTUS once a while ago, and it was easy to use. There is also an online version: http://bioinf.uni-greifswald.de/augustus/submission.php
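For the step after prediction, here is a rough sketch of pulling protein sequences out of AUGUSTUS output so they can be written to FASTA. It assumes the default GFF-style output, where each predicted protein is embedded in comment lines of the form `# protein sequence = [...]` (possibly wrapped over several `#` lines) and the preceding `transcript` feature line carries the transcript ID in column 9. Check this against your own output before relying on it. The function name is made up for illustration.

```python
from typing import Dict, Iterable, List, Optional

PREFIX = "protein sequence = ["


def extract_augustus_proteins(lines: Iterable[str]) -> Dict[str, str]:
    """Map transcript IDs to protein sequences found in AUGUSTUS comments."""
    proteins: Dict[str, str] = {}
    transcript_id: Optional[str] = None
    collecting = False
    chunks: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.startswith("#"):
            # Feature line: remember the most recent transcript ID.
            fields = line.split("\t")
            if len(fields) >= 9 and fields[2] == "transcript":
                transcript_id = fields[8].strip()
            continue
        text = line.lstrip("#").strip()
        if text.startswith(PREFIX):
            collecting = True
            chunks = []
            text = text[len(PREFIX):]
        if collecting:
            if text.endswith("]"):
                # Closing bracket marks the end of the protein block.
                chunks.append(text[:-1])
                if transcript_id is not None:
                    proteins[transcript_id] = "".join(chunks)
                collecting = False
            else:
                chunks.append(text)
    return proteins
```

The resulting dictionary can be written out as FASTA and used as the search database. AUGUSTUS also ships helper scripts for this purpose, which may be the more robust route.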

written 2.3 years ago by gb

Which genome? NCBI genomes have, in general, already been annotated with protein predictions.

written 2.3 years ago by h.mon

This is the genome I am talking about: https://www.ncbi.nlm.nih.gov/genome/17773

written 2.3 years ago by niklasjoshua.ebner

There is an old annotation here: https://i5k.nal.usda.gov/data/Arthropoda/limlun-%28Limnephilus_lunatus%29/Current%20Genome%20Assembly/

The NCBI genome, more recent and with more data, has not been annotated, though.

written 2.3 years ago by h.mon