Question: RNA-seq GC content Bimodal distribution
4.9 years ago by
ruansun198330
United States
ruansun198330 wrote:

Is it normal for human RNA-seq data to show a bimodal distribution in GC content?

For DNA-seq, a bimodal GC content distribution is a sign of contamination.

What about RNA-seq?

http://postimg.org/image/nrlw7wgdf/

rna-seq • 3.5k views
modified 19 months ago by from the mountains40 • written 4.9 years ago by ruansun198330

Could you please post the figure?

written 4.9 years ago by Biomonika (Noolean)3.0k

I don't know how to post an image, but I added a link to the picture.

written 4.9 years ago by ruansun198330
4.8 years ago by
David Fredman980
University of Bergen, Norway
David Fredman980 wrote:

That's rather extreme GC content (around 90%), and such a secondary peak is not expected in an RNA-seq run for a specific species. What is expected is some noise at the start of the per-base nucleotide distribution due to not-so-random hexamers.

A very specific contaminant or PCR amplicon would be visible as a sharp peak, and would also be found as an overrepresented sequence. A peak that is quite sharp and close to the main distribution could represent read-through into adapters. Neither of these scenarios fits here.

In your case, if you want to find out what caused this, check whether some of your sequences are runs of GpC or a triplet repeat. Are any such sequences reported as overrepresented sequences?
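One way to run that check by hand is sketched below. It is only an illustration, not a standard tool: the function names `tandem_period` and `tally_periods`, the 0.9 identity cutoff, and the restriction to periods 1 to 3 are all assumptions made here. A read is flagged when almost every base equals the base `k` positions before it, so a GpC run comes out as period 2 and a triplet repeat as period 3.

```python
from collections import Counter


def tandem_period(seq, max_period=3, min_identity=0.9):
    """Return the smallest period k (1..max_period) for which the read looks
    like a tandem repeat, or None if no such period is found.

    A read counts as a repeat of period k when at least `min_identity` of its
    bases equal the base k positions earlier. The 0.9 cutoff is an arbitrary
    illustrative choice.
    """
    seq = seq.upper()
    for k in range(1, max_period + 1):
        if len(seq) <= k:
            break
        matches = sum(1 for i in range(k, len(seq)) if seq[i] == seq[i - k])
        if matches / (len(seq) - k) >= min_identity:
            return k
    return None


def tally_periods(seqs, **kwargs):
    """Count how many reads fall into each period class (None = no repeat)."""
    return Counter(tandem_period(s, **kwargs) for s in seqs)
```

If most of the high-GC reads land in the period 2 or period 3 class, simple repeats are a likely explanation for the second peak; if they mostly return `None`, the reads are more complex and worth identifying by other means.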

It might also be contamination from a different species, although I can't think of anything with an overall GC content that high. You could try fastq_screen to check for other species you work with regularly (or e.g. a food source). Alternatively, fish out some of the high-GC sequences and BLAST them to see what they are.
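Fishing out the high-GC reads can be done with a few lines of plain Python. The sketch below assumes a standard four-line-per-record FASTQ; the helper names, the 0.8 GC cutoff, and the limit of 100 reads are illustrative choices, not anything prescribed in this thread. It collects reads whose GC fraction is at or above the cutoff and formats them as FASTA, which can then be pasted into BLAST.

```python
def gc_fraction(seq):
    """Fraction of G/C among the A/C/G/T bases of a read (N is ignored)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def read_fastq(lines):
    """Yield (read_name, sequence) from an iterable of FASTQ lines.

    Assumes the common four-line record layout: header, sequence, '+', quality.
    """
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between or after records
        seq = next(it, "").strip()
        next(it, None)  # '+' separator line
        next(it, None)  # quality line
        yield header.strip()[1:], seq


def high_gc_reads(lines, min_gc=0.8, limit=100):
    """Collect up to `limit` reads with GC fraction >= min_gc."""
    picked = []
    for name, seq in read_fastq(lines):
        if seq and gc_fraction(seq) >= min_gc:
            picked.append((name, seq))
            if len(picked) >= limit:
                break
    return picked


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

For a real run, pass an open text handle as `lines`, for example one returned by `gzip.open(path, "rt")` for a gzipped file, and write the result of `to_fasta` to a file. A cutoff around 0.8 matches the position of the secondary peak discussed here, but it should be adjusted to wherever the peak sits in your own plot.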

 

 

modified 4.8 years ago • written 4.8 years ago by David Fredman980
19 months ago by
United States
from the mountains40 wrote:

I see that second peak all the time with rRNA-depleted libraries. rRNA depletion is never complete, and the leftover rRNA shows up as the lower-density peak at ~85% GC. You want that peak to be as low as possible.

written 19 months ago by from the mountains40