Question: What exactly is included in the mRNA reference set (GRCh38/hg38)?
3.2 years ago by
bjarki.sigurjons10 wrote:

Hi all,

I mapped my reads to the human transcriptome that I downloaded from UCSC: Downloads > Genome Data > Human > Full data set > mrna.fa.gz

This is the description of the file on the website: “mrna.fa.gz - Human mRNA from GenBank. This sequence data is updated once a week via automatic GenBank updates.”

As I understand it, the mRNA is extracted, reverse transcribed to cDNA, and sequenced, and the successfully mapped sequences are then stored in this mrna.ref file.

What I would like to know is the following: Is ribosomal RNA in this ref? Is long non-coding RNA in this ref? What kinds of RNA does mrna.ref not cover? Is mrna.ref not the transcriptome reference? Ribosomal RNA is part of the transcriptome but is not mRNA, right? If mrna.ref is not the most accurate reference to use, do you guys know what is?

Thank you, -Bjarki

rna-seq next-gen • 1.4k views
3.2 years ago by
WouterDeCoster39k wrote:

Do you have a good reason to map to the transcriptome? More commonly RNA-seq is mapped to the entire genome using a spliced read aligner.

Yes. I'm interested in the reads that do not get mapped to the transcriptome. So I extract the unmapped reads from the resulting bam file after mapping to the transcriptome.
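To make the extraction step concrete, here is a minimal sketch of what "extract the unmapped reads" means at the record level. It assumes the BAM has already been converted to SAM text (for example with `samtools view -h`), and it relies only on the SAM convention that bit 0x4 of the FLAG field marks an unmapped segment. The function name `unmapped_records` and the example records are hypothetical, made up purely for illustration; in practice this is commonly done directly on the BAM with `samtools view -f 4`.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_records(sam_lines):
    """Yield SAM alignment lines whose FLAG has the 0x4 (unmapped) bit set.

    Hypothetical helper for illustration; header lines (starting with '@')
    and lines with fewer than the 11 mandatory SAM fields are skipped.
    """
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # header or empty line
        fields = line.split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        if int(fields[1]) & UNMAPPED_FLAG:
            yield line


# Made-up records: read1 and read3 are mapped, read2 and read4 are not.
EXAMPLE_SAM = "\n".join([
    "@SQ\tSN:AB000001\tLN:1000",
    "read1\t0\tAB000001\t10\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGGCCAA\tIIIIIIII",
    "read3\t16\tAB000001\t50\t60\t8M\t*\t0\t0\tGGGGCCCC\tIIIIIIII",
    "read4\t77\t*\t0\t0\t*\t*\t0\t0\tAAAATTTT\tIIIIIIII",
])

if __name__ == "__main__":
    for record in unmapped_records(EXAMPLE_SAM.splitlines()):
        print(record.split("\t")[0])
```

Testing the bit with `&` rather than comparing the FLAG to 4 matters for paired-end data, where an unmapped read usually carries other bits as well (77 in the example is 1 + 4 + 8 + 64).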

Reply written 3.2 years ago by bjarki.sigurjons10