Question: How to calculate the Average Insert Size after mapping the reads to the reference genome using BWA
0
2.7 years ago by
felipe_o_torquato0 wrote:

Hi,

Having mapped the reads to the reference genome using BWA, I am trying to calculate their Average Insert Size.

I then converted the BAM file to a SAM file to check the ISIZE values. Because these values are both positive and negative, the average I calculate (using Summary Statistics) is always zero.

Does anyone know any way to deal with these negative values?

Many thanks in advance.

alignment • 4.1k views
ADD COMMENTlink modified 2.7 years ago by Brian Bushnell16k • written 2.7 years ago by felipe_o_torquato0

What kind of library do you have? Do you expect negative values?

ADD REPLYlink written 2.7 years ago by h.mon25k
3
2.7 years ago by
igor7.6k
United States
igor7.6k wrote:

What about using Picard CollectInsertSizeMetrics? https://broadinstitute.github.io/picard/command-line-overview.html#CollectInsertSizeMetrics

Or do you just want to understand the fragment length (TLEN) values in the SAM file? In that case, you should expect to see a positive value for one read and a negative value for its mate, which is why the plain average comes out as zero. See the SAM format specs: http://samtools.github.io/hts-specs/SAMv1.pdf
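If you want to do the calculation by hand, one option is to count each pair once by keeping only the positive TLEN values. The following is a minimal sketch, not a replacement for Picard: it assumes an uncompressed SAM file, keeps only reads flagged as properly paired, and skips secondary and supplementary alignments. The function names insert_sizes and summarize are made up for this example.

```python
from statistics import mean, stdev

def insert_sizes(sam_lines):
    """Yield one positive TLEN per properly paired read pair."""
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # a SAM record has 11 mandatory columns
            continue
        flag = int(fields[1])
        tlen = int(fields[8])             # column 9: TLEN (ISIZE in older specs)
        # Assumption: keep only properly paired reads (0x2) and drop
        # secondary (0x100) and supplementary (0x800) alignments.
        if not flag & 0x2 or flag & 0x900:
            continue
        if tlen > 0:                      # only one mate of each pair is positive
            yield tlen

def summarize(sam_lines):
    """Return (mean, sample standard deviation) of the insert sizes."""
    sizes = list(insert_sizes(sam_lines))
    if not sizes:
        raise ValueError("no properly paired reads with positive TLEN found")
    return mean(sizes), (stdev(sizes) if len(sizes) > 1 else 0.0)
```

To use it, pass an open file handle, for example summarize(open("mapped.sam")). Taking the absolute value of every TLEN would give the same mean but counts every pair twice.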

ADD COMMENTlink modified 2.7 years ago • written 2.7 years ago by igor7.6k
1
2.7 years ago by
Walnut Creek, USA
Brian Bushnell16k wrote:

Assuming your reads are in the original order, you can calculate the insert size distribution with the BBMap package like this:

reformat.sh in=mapped.sam ihist=ihist.txt

That will also print the average and standard deviation in the header.
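If you want to recompute the summary yourself from a histogram file, something like the sketch below could work. It assumes the file is a two-column, tab-separated table of insert size and count, with header lines starting with "#". That layout is an assumption, so check your own ihist.txt first. The function name histogram_stats is made up for this example, and it reports a population standard deviation, which may differ slightly from the value the tool prints.

```python
import math

def histogram_stats(lines):
    """Return (mean, population standard deviation) of a size/count histogram."""
    total = 0
    weighted = 0
    weighted_sq = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):   # assumed: '#' marks header lines
            continue
        size_text, count_text = line.split("\t")[:2]
        size, count = int(size_text), int(count_text)
        total += count
        weighted += size * count
        weighted_sq += size * size * count
    if total == 0:
        raise ValueError("histogram contains no counts")
    avg = weighted / total
    variance = weighted_sq / total - avg * avg
    return avg, math.sqrt(max(variance, 0.0))   # clamp tiny negative rounding error
```

Call it with an open file handle, for example histogram_stats(open("ihist.txt")).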

ADD COMMENTlink written 2.7 years ago by Brian Bushnell16k