Question

What Steps Do You Take To Re-Annotate Sequences?

12.6 years ago

Matt ▴ 70

What steps do you take to re-annotate sequences to find out the gene or marker?

Given an input file of 1 to N rows, each containing just a unique identifier and a sequence (As, Cs, Gs, Ts), what are the best steps to annotate?

BLAST? Which BLAST?
Filter results into unique matches, multiple matches, and no matches? How do you handle multiple matches?
Find coordinates and "match" to NCBI, Ensembl, others?

Any tools or high level workflow that you could share would be GREATLY appreciated.
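To make the filtering step concrete, here is a minimal sketch of sorting queries into unique, multiple, and no-match groups. It assumes BLAST+ was run with tabular output (`-outfmt 6`, default columns). The function name `classify_hits`, the e-value and percent-identity cutoffs, and the example rows are all made up for illustration; "unique" here simply means exactly one hit location passes the cutoffs, which may not be the right definition for every data set.

```python
import csv
import io
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def classify_hits(query_ids, blast_tsv, max_evalue=1e-10, min_pident=95.0):
    """Group query IDs by how many hit locations pass the cutoffs.

    The cutoffs are illustrative assumptions, not recommendations.
    """
    locations = defaultdict(set)
    reader = csv.DictReader(blast_tsv, fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        passes = (float(row["evalue"]) <= max_evalue
                  and float(row["pident"]) >= min_pident)
        if passes:
            # A hit location is the subject sequence plus its coordinates.
            locations[row["qseqid"]].add(
                (row["sseqid"], row["sstart"], row["send"]))

    groups = {"unique": [], "multiple": [], "none": []}
    for qid in query_ids:
        n = len(locations.get(qid, ()))
        key = "none" if n == 0 else "unique" if n == 1 else "multiple"
        groups[key].append(qid)
    return groups


# Made-up rows for illustration only; not real BLAST results.
example = io.StringIO(
    "probe1\tchr1\t100.0\t60\t0\t0\t1\t60\t1000\t1059\t1e-25\t111\n"
    "probe2\tchr2\t98.3\t60\t1\t0\t1\t60\t500\t559\t1e-22\t105\n"
    "probe2\tchr7\t96.7\t60\t2\t0\t1\t60\t9000\t9059\t1e-20\t99\n"
    "probe3\tchr3\t80.0\t30\t6\t0\t1\t30\t42\t71\t0.5\t30\n"
)
groups = classify_hits(["probe1", "probe2", "probe3", "probe4"], example)
```

Passing the full list of input identifiers separately matters because queries with no hits at all never appear in the BLAST output, so they can only be found by comparing against the input file.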

sequence annotation gene gene function • 2.0k views

updated 12.6 years ago by Larry_Parnell 16k • written 12.6 years ago by Matt ▴ 70

If these are something like microarray probes, you might consider answers to this question:

A: Pipeline To Map 60-Mers To Genes

updated 4.4 years ago by Ram 43k • written 12.6 years ago by Sean Davis 26k

score 1 · Answer 1 · 2011-09-20

What I will do in this space is give you the types of data that sit at the top level of the hierarchy in my human genome database. These data types will likely apply to most eukaryotic gene annotation efforts. I do not have much experience with prokaryote gene annotation, so I should not offer suggestions there, other than to say that I think the environment in which the organism normally lives, and from which it was isolated, is important. I have a lot of gene expression and protein expression (proteomics) data that I use to ascertain function or candidacy for further experiments in our lab, but I rely on all types of data to make that call.

Because I work with human data, I am not so concerned with mapping sequences by BLAST. I simply note the genome build and the associated gene coordinates, ignoring exon coordinates (for now).

[?]

Lastly, I have a free-text "knowledge" field where I enter info from lab meetings, abstracts, etc. (typically with a reference).

In addition, I have a metabolite database where I link small molecules to genes.

That list should give you some good ideas on what to collect in order to more confidently describe the potential function of a gene and its encoded protein.
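As a rough illustration of how the pieces mentioned above (genome build, gene coordinates, a free-text knowledge field, and metabolite-to-gene links) could sit together, here is a minimal sketch using Python's built-in `sqlite3`. This is not the actual schema described in the answer: every table name, column name, and value below is a hypothetical placeholder.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene (
    gene_id      INTEGER PRIMARY KEY,
    symbol       TEXT NOT NULL,
    genome_build TEXT NOT NULL,     -- which assembly the coordinates refer to
    chrom        TEXT NOT NULL,
    start_pos    INTEGER NOT NULL,  -- gene-level coordinates only, no exons
    end_pos      INTEGER NOT NULL,
    knowledge    TEXT               -- free-text notes, ideally with a reference
);
CREATE TABLE metabolite (
    metabolite_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
-- Many-to-many link between small molecules and genes.
CREATE TABLE gene_metabolite (
    gene_id       INTEGER REFERENCES gene(gene_id),
    metabolite_id INTEGER REFERENCES metabolite(metabolite_id),
    PRIMARY KEY (gene_id, metabolite_id)
);
""")

# Placeholder values only; these are not real genes, builds, or coordinates.
conn.execute(
    "INSERT INTO gene VALUES (?, ?, ?, ?, ?, ?, ?)",
    (1, "GENE_A", "BUILD_X", "chr1", 1000, 5000,
     "Placeholder note from a lab meeting (reference to be added)"),
)
conn.execute("INSERT INTO metabolite VALUES (?, ?)", (1, "metabolite_1"))
conn.execute("INSERT INTO gene_metabolite VALUES (?, ?)", (1, 1))

# Look up which metabolites are linked to each gene.
rows = conn.execute("""
    SELECT g.symbol, g.genome_build, m.name
    FROM gene AS g
    JOIN gene_metabolite AS gm ON gm.gene_id = g.gene_id
    JOIN metabolite AS m ON m.metabolite_id = gm.metabolite_id
""").fetchall()
```

Storing the genome build next to the coordinates keeps them interpretable later, since the same gene can have different coordinates in different assemblies.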