Question: Insert size in RNA-Seq library
4.1 years ago, elsoja (110) wrote:

I'm aligning some mouse RNA-Seq libraries against the genome using STAR. After that, I started to analyse the insert size with Picard. I noticed two types of histogram.

The type shown in Histogram A has a single peak, around 180 bp. The type shown in Histogram B, however, has multiple peaks.

As the alignment was done against the genome, I expected all libraries to display a pattern like the one in Histogram B, since read pairs that span splice junctions would produce "artificial" insert sizes that include the introns. So why did I get these two kinds of histograms?
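To make that expectation concrete, here is a minimal Python sketch of how one fragment can look much longer when measured on the genome than on the transcript. The exon coordinates and the helper name `genomic_span` are hypothetical and chosen only for illustration; they do not come from the libraries discussed here.

```python
from bisect import bisect_right


def genomic_span(exons, t_start, t_end):
    """Genomic distance covered by a fragment given in transcript coordinates.

    exons: sorted list of (genome_start, genome_end) half-open intervals
           (hypothetical gene model).
    t_start, t_end: half-open fragment interval on the spliced transcript.
    The returned span includes any introns between the two fragment ends.
    """
    # Transcript offset at which each exon begins (plus the total length).
    offsets = [0]
    for g_start, g_end in exons:
        offsets.append(offsets[-1] + (g_end - g_start))
    if not 0 <= t_start < t_end <= offsets[-1]:
        raise ValueError("fragment lies outside the transcript")

    def to_genome(t_pos):
        # Index of the exon that contains this transcript position.
        i = bisect_right(offsets, t_pos) - 1
        return exons[i][0] + (t_pos - offsets[i])

    return to_genome(t_end - 1) + 1 - to_genome(t_start)


# Two exons of 100 bp and 200 bp separated by a 3,900 bp intron.
exons = [(1000, 1100), (5000, 5200)]

# A 180 bp fragment that crosses the junction.
spliced = genomic_span(exons, 20, 200)

# A 180 bp fragment that lies entirely inside the second exon.
unspliced = genomic_span(exons, 110, 290)

print(spliced, unspliced)
```

In this toy case both fragments are 180 bp on the transcript, but the junction-spanning one covers 4,080 bp on the genome. A genome-based insert-size histogram mixes both kinds of pairs, which is one way secondary peaks or a long tail can arise.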

Thanks a lot!

[Figure: Histogram A]

[Figure: Histogram B]

rna-seq • 2.1k views • modified 2.7 years ago by SMILE (120) • written 4.1 years ago by elsoja (110)

Are all of the samples that produce a histogram like the one in (A) also total RNA-seq rather than mRNA-seq? The GEO entry doesn't mention any sort of ribo-depletion.

written 4.1 years ago by Devon Ryan (95k)

Two libraries produce histograms like B; both were produced by the same team. The remaining three libraries have histograms like A.

I suppose that they're all poly-A positive, but this information is not explicit in all libraries.

modified 4.1 years ago • written 4.1 years ago by elsoja (110)

Did you find the reason for Histogram B? How did you solve the problem? I have a similar problem...

written 2.7 years ago by SMILE (120)

Mapping against the transcriptome should give a better representation of the actual insert-size distribution. Also, if your reads are paired and mostly overlapping, you can get an alignment-free insert-size distribution with BBMerge:

    in1=read1.fq in2=read2.fq reads=1m ihist=ihist.txt

Or, in fact, you can get that even if they are NOT mostly overlapping, using kmer frequencies to bridge the gap in the middle (this is slower and uses more memory, and requires sufficient depth):

    in1=read1.fq in2=read2.fq ihist=ihist.txt k=62 extend2=200 rem ecct
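Once an `ihist.txt` file exists, a short script can check whether the distribution has one peak (like Histogram A) or several (like Histogram B). The Python sketch below assumes the file has `#`-prefixed header and summary lines followed by whitespace-separated `size count` rows; that layout is an assumption and should be checked against your own output. The function names, the `window` and `min_fraction` parameters, and their defaults are all hypothetical.

```python
def parse_ihist(lines):
    """Parse an insert-size histogram into {insert_size: count}.

    Assumed layout: lines starting with '#' are headers or summary
    statistics; every other non-empty line is 'size count'.
    """
    hist = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        size, count = line.split()[:2]
        hist[int(size)] = int(count)
    return hist


def find_peaks(hist, window=25, min_fraction=0.2):
    """Return insert sizes that are strict local maxima.

    A size counts as a peak when its count is strictly larger than every
    other count within +/- window bp and is at least min_fraction of the
    tallest bin. Flat plateaus are deliberately not reported.
    """
    if not hist:
        return []
    top = max(hist.values())
    peaks = []
    for size in sorted(hist):
        count = hist[size]
        if count < min_fraction * top:
            continue
        neighbours = [
            hist.get(s, 0)
            for s in range(size - window, size + window + 1)
            if s != size
        ]
        if all(count > n for n in neighbours):
            peaks.append(size)
    return peaks


if __name__ == "__main__":
    # 'ihist.txt' is the file name used in the commands above.
    with open("ihist.txt") as handle:
        print(find_peaks(parse_ihist(handle)))
```

This is a deliberately simple detector with no smoothing, so noisy histograms from shallow libraries may need a larger `window` or some binning before the result means much.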

modified 2.7 years ago • written 2.7 years ago by Brian Bushnell (17k)