non-exonic reads in RNAseq
7.9 years ago
kanwarjag ★ 1.2k

I understand that in RNA-seq we will always get some reads mapping to non-exonic areas. However, I am seeing almost 85% of reads aligning outside coding regions. I was wondering what may be the cause of this. Does that mean contamination has overpowered the actual data?

RNA-Seq intergenic • 2.2k views

Are you using the correct reference? ;-)

It sounds like a wet lab problem, are you sure you have removed DNA from your samples?


Yes, I used the correct genome for aligning. I am looking at the distribution of reads in the BAM file.


Almost every time this happens it's because someone's using the wrong genome in IGV :)


Do you define as "intergenic" everything that couldn't be assigned to a gene? What about intronic regions? This could also be caused by looking at the wrong strand, or by using different genome versions for the annotation and the mapping. Did you already check these cases?
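A minimal sketch of the distinction asked about here, assuming reads are reduced to a single position and features are given as half-open intervals. The names `build_index` and `classify` and the toy coordinates are made up for illustration; a real check would read the BAM and the annotation with a proper parser. The toy reads show how a wrong strand or a chromosome naming mismatch ("1" vs "chr1") silently turns exonic reads into "intergenic" ones.

```python
from bisect import bisect_right
from collections import Counter

def merge(intervals):
    """Merge overlapping half-open (start, end) intervals into a sorted list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def build_index(features):
    """Group (chrom, strand, start, end) features by (chrom, strand)."""
    grouped = {}
    for chrom, strand, start, end in features:
        grouped.setdefault((chrom, strand), []).append((start, end))
    return {key: merge(ivs) for key, ivs in grouped.items()}

def contains(merged, pos):
    """True if pos lies inside one of the merged, sorted intervals."""
    i = bisect_right(merged, (pos, float("inf"))) - 1
    return i >= 0 and merged[i][0] <= pos < merged[i][1]

def classify(read, exon_index, gene_index):
    """Label a (chrom, strand, pos) read as exonic, intronic or intergenic."""
    chrom, strand, pos = read
    key = (chrom, strand)
    if contains(exon_index.get(key, []), pos):
        return "exonic"
    if contains(gene_index.get(key, []), pos):
        return "intronic"
    return "intergenic"

# Toy annotation: one gene on the + strand with two exons (hypothetical values).
genes = [("chr1", "+", 100, 1000)]
exons = [("chr1", "+", 100, 200), ("chr1", "+", 800, 1000)]
reads = [
    ("chr1", "+", 150),   # inside an exon
    ("chr1", "+", 500),   # inside the gene, between exons
    ("chr1", "-", 150),   # same position, wrong strand
    ("1", "+", 150),      # same position, mismatched chromosome name
    ("chr1", "+", 2000),  # outside any gene
]

gene_index = build_index(genes)
exon_index = build_index(exons)
counts = Counter(classify(r, exon_index, gene_index) for r in reads)
print(counts)
```

In this toy case three of five reads come out as intergenic, but only one of them truly lies outside the gene.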


Is your genome/species of interest well characterized?


mm9, which is well characterized.


Are these ribo-depleted samples or polyA-enriched? If the former, then perhaps you're seeing expressed repeat regions (there can be quite a few).


Ribo-depleted. I think it is sample preparation, but I want to make sure.


Try aligning against the Rn45S sequence and see how many reads you get. I routinely do that with our rRNA-depleted datasets to see how depleted they actually are. My guess is that Michael is right and you're getting a bunch of rRNA (and probably tRNA). Just look at some of the higher-coverage areas in the UCSC Genome Browser with the RepeatMasker track enabled. I suspect that'll be illuminating.
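One way to put a number on the rRNA check suggested above, as a rough sketch: align the reads against the Rn45S sequence alone with whatever aligner you use, then count how many primary records in the resulting SAM text are mapped. The function name `mapped_fraction` and the toy records are made up for illustration; only the SAM FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM specification.

```python
def mapped_fraction(sam_lines):
    """Fraction of primary SAM records that are mapped.

    Header lines (starting with '@') and blank lines are skipped.
    Secondary (0x100) and supplementary (0x800) records are ignored so
    that each read (or mate) is counted once.
    """
    total = mapped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x900:
            continue
        total += 1
        if not flag & 0x4:
            mapped += 1
    return mapped / total if total else 0.0

# Toy SAM records aligned to a hypothetical rRNA-only reference.
toy_sam = [
    "@SQ\tSN:Rn45S\tLN:13400",
    "r1\t0\tRn45S\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
    "r3\t16\tRn45S\t50\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r3\t256\tRn45S\t90\t0\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII",
]
print(mapped_fraction(toy_sam))  # 0.5
```

A high fraction here would point to incomplete rRNA depletion rather than truly intergenic transcription.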


Does anyone know why this was deleted?


Nope, can we restore it?


I opened it, but I guess the OP deleted this.

7.9 years ago
Michael 54k

Hi Kanwarjag,

"However I am seeing almost 85% of reads align to intergenic region."

In my opinion this is extremely unlikely, and I have never seen it, even with the imperfect annotation of our model organism, the salmon louse. I did a small randomization experiment on our data by placing random gene models in intergenic regions, and found that for most samples the background read count to expect in intergenic regions at the 99% confidence level is 1. I don't have exact figures for the fraction of reads overlapping intergenic regions, but 85% is very high. I would check the following:

  • correct annotation version
  • ribosomal genes missing from the annotation combined with a high level of rRNA
  • draft annotation with a large number of truncated gene models and missing or truncated UTRs (add a few kb of flanking sequence to the genes and check again)
  • genomic DNA leaking into the sample
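
The flank check from the list above can be sketched roughly as follows, assuming a single chromosome, reads reduced to one position each, and genes as half-open (start, end) intervals. The function name `intergenic_fraction` and all coordinates are made up for illustration, and the linear scan is only meant for small toy inputs.

```python
def intergenic_fraction(read_positions, genes, flank=0):
    """Fraction of read positions outside every gene extended by `flank` bases."""
    if not read_positions:
        return 0.0
    outside = 0
    for pos in read_positions:
        # A read is "genic" if it falls inside any gene plus its flanks.
        in_gene = any(start - flank <= pos < end + flank for start, end in genes)
        if not in_gene:
            outside += 1
    return outside / len(read_positions)

# Toy data (hypothetical): two genes and four read positions.
genes = [(1000, 2000), (5000, 6000)]
reads = [1500, 2500, 4800, 9000]

print(intergenic_fraction(reads, genes, flank=0))     # 0.75
print(intergenic_fraction(reads, genes, flank=1000))  # 0.25
```

If the intergenic fraction drops sharply once flanks are added, truncated gene models or missing UTRs are a more likely explanation than contamination.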
7.9 years ago
Michele Busby ★ 2.2k

Try running it through RNA-SeQC. It takes a while to set up, but once that is done you have it. It will tell you whether the reads are rRNA, and whether they are intergenic or intronic.
