Assembled Human ESTs for Annotation
12.4 years ago

I am annotating a region of the human genome and am looking for a set of assembled ESTs I can use as evidence for constructing gene models. A quick search on the SRA shows 4,318 human RNA-seq data sets, so I understand there is no shortage of data. I am hoping, however, to save the time that would be required to search through this massive amount of data, select an appropriate subset, and assemble the ESTs myself. I'm sure this has been done many times before, and for this particular task it doesn't make sense to repeat this process.

Is there any sort of general (i.e. not tissue-specific) reference EST/transcriptome assembly available for Homo sapiens?

UPDATE: I expanded my search to the UCSC Genome Browser and found this page. Three files caught my eye immediately: est.fa.gz, mrna.fa.gz, and refMrna.fa.gz. I'm guessing that:

  • est.fa.gz represents raw, unassembled EST sequences
  • mrna.fa.gz represents a comprehensive, redundant set of assembled transcripts
  • refMrna.fa.gz represents a non-redundant set of assembled transcripts

Is this correct?

12.4 years ago

Daniel,

You may find the UniGene data at NCBI useful for assembled ESTs. I am not certain that the two mRNA datasets you listed are redundant and non-redundant, respectively, but I don't think it matters for your purpose: you could use both as evidence when annotating your genomic region.
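
If you want to test the redundancy guess yourself, here is a rough sketch in Python that counts the records in a gzipped FASTA file and how many of them are exact duplicates of another record. This is only an assumption about how one might check it: exact sequence identity is a crude proxy for redundancy (it will not catch overlapping or near-identical transcripts), and the function names `read_fasta`, `summarize`, and `summarize_gz` are made up for illustration.

```python
import gzip
import hashlib
from collections import Counter


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open text-mode FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(handle):
    """Count records and exact-duplicate sequences (case-insensitive)."""
    # Store a digest of each sequence rather than the sequence itself,
    # so memory use stays modest on large files.
    counts = Counter(
        hashlib.sha1(seq.upper().encode("ascii")).digest()
        for _, seq in read_fasta(handle)
    )
    total = sum(counts.values())
    return {
        "records": total,
        "unique_sequences": len(counts),
        "exact_duplicates": total - len(counts),
    }


def summarize_gz(path):
    """Convenience wrapper for a gzipped FASTA file such as mrna.fa.gz."""
    with gzip.open(path, "rt") as handle:
        return summarize(handle)
```

Calling `summarize_gz("mrna.fa.gz")` and `summarize_gz("refMrna.fa.gz")` on the downloaded files would give a first impression of how much exact duplication each one contains. A proper redundancy check would need clustering or alignment to the genome.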
