Question: Gene Annotation Using Known Proteins or EST Sequences
8.0 years ago, Plantae (380) wrote:

Hi, we have sequenced a new genome, and I now want to annotate it using known proteins and ESTs from other species. We BLASTed the proteins and ESTs against our genome, but some of them got too many hits. My question: should I filter these BLAST hits by e-value? If so, how should I set a reasonable threshold?

gene annotation genome • 1.9k views
8.0 years ago, Darked89 (4.2k; Barcelona, Spain) wrote:

You could start by filtering out plant protein entries that contain repetitive sequences. Search the Pfam database for "plant transposon" to find such families.

GMAP may be better than BLAST for mapping ESTs (it is splice-site aware).

For same-species ESTs used for gene mapping, try PASA.

8.0 years ago, Travis (2.8k) wrote:

As mentioned above, you are probably better off using a program that specifically accounts for the presence of introns. Exonerate is a good one and is freely available.

When using ESTs I would recommend getting rid of anything with less than 90% identity.

Also, divide the length of your alignment by the length of the original sequence to get a % aligned value. Then remove anything with less than e.g. 90% of its length aligned.

Lower the 90% cutoffs if you seem to get too few hits.
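
The two filters above could be sketched roughly as follows. This is only an illustration, not the output format of any particular aligner: the `Hit` record, its field names, and the `filter_hits` helper are hypothetical, and you would fill them from whatever your alignment program reports (percent identity, aligned length, and the full length of the EST or protein).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment of a query (EST or protein) to the genome.

    Hypothetical record: populate it from your aligner's own output.
    """
    query_id: str
    percent_identity: float  # e.g. 97.5 means 97.5% identical
    aligned_length: int      # number of query positions in the alignment
    query_length: int        # full length of the original query sequence


def fraction_aligned(hit):
    """Aligned length divided by the length of the original sequence."""
    if hit.query_length <= 0:
        raise ValueError(f"query_length must be positive for {hit.query_id}")
    return hit.aligned_length / hit.query_length


def filter_hits(hits, min_identity=90.0, min_aligned=0.90):
    """Keep hits that pass both the identity and the aligned-fraction cutoff.

    Both defaults are the 90% starting points suggested above; loosen them
    if too few hits survive.
    """
    return [
        h for h in hits
        if h.percent_identity >= min_identity
        and fraction_aligned(h) >= min_aligned
    ]


# Made-up example values, for illustration only.
example_hits = [
    Hit("est1", 97.5, 480, 500),  # passes both cutoffs
    Hit("est2", 85.0, 490, 500),  # fails identity
    Hit("est3", 99.0, 200, 500),  # fails aligned fraction (only 40% aligned)
    Hit("est4", 92.0, 460, 500),  # passes both cutoffs
]

kept = filter_hits(example_hits)
print([h.query_id for h in kept])
```

Calling `filter_hits(example_hits, min_identity=80.0, min_aligned=0.90)` would additionally keep `est2`, which is the kind of adjustment to try if the default cutoffs leave too few hits.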
