What exactly is included in the mRNA reference set (GRCh38/hg38)?
8.0 years ago

Hi all,

I mapped my reads to the human transcriptome that I downloaded from UCSC: https://genome.ucsc.edu/ > Downloads > Genome Data > Human > Full data set > mrna.fa.gz

This is the description of the file on the website: “mrna.fa.gz - Human mRNA from GenBank. This sequence data is updated once a week via automatic GenBank updates.”

As I understand it, the mRNA is extracted, reverse complemented to cDNA, sequenced, and the successfully mapped sequences are then stored in this mrna.ref file.

What I would like to know is the following: Is ribosomal RNA in this ref? Is long non-coding RNA in this ref? What kinds of RNA does mrna.ref not cover? Is mrna.ref not the transcriptome reference? Ribosomal RNA is part of the transcriptome but is not mRNA, right? If mrna.ref is not the most accurate reference to use, do you guys know what is?

Thank you, -Bjarki

RNA-Seq next-gen • 2.9k views
8.0 years ago

Do you have a good reason to map to the transcriptome? More commonly, RNA-seq reads are mapped to the entire genome using a spliced read aligner.


Yes. I'm interested in the reads that do not map to the transcriptome, so I extract the unmapped reads from the BAM file that results from mapping to the transcriptome.
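
For readers following along, here is a minimal sketch of what "extracting the unmapped reads" amounts to. It is not the poster's actual pipeline, which the thread does not show. It assumes the alignments are available as SAM text and relies only on the SAM convention that bit 0x4 of the FLAG field marks an unmapped segment. The function name `unmapped_read_names` and the toy records are made up for illustration; on a real BAM file this kind of filter is commonly done with `samtools view -f 4` instead.

```python
import io

UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_read_names(sam_lines):
    """Yield the names of reads whose FLAG has the unmapped bit (0x4) set.

    `sam_lines` is any iterable of SAM text lines (hypothetical helper,
    written for this illustration only).
    """
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is the FLAG
        if flag & UNMAPPED:
            yield fields[0]    # column 1 is the read name (QNAME)


# Toy SAM content (invented records, not real data).
toy_sam = "\n".join([
    "@HD\tVN:1.6\tSO:unsorted",
    "@SQ\tSN:tx1\tLN:1000",
    "read1\t0\ttx1\t10\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTGGGG\tIIIIIIII",
    "read3\t16\ttx1\t50\t60\t8M\t*\t0\t0\tGGGGCCCC\tIIIIIIII",
    "read4\t77\t*\t0\t0\t*\t*\t0\t0\tAAAACCCC\tIIIIIIII",
])

names = list(unmapped_read_names(io.StringIO(toy_sam)))
print(names)  # ['read2', 'read4']
```

Note that a read counts as "unmapped" here only relative to whatever reference was used, so the result depends directly on which RNA classes the reference file contains, which is the point of the original question.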
