Question: bam file calculate sliding window insert sizes
1
gravatar for 14134125465346445
2.6 years ago by
United Kingdom
141341254653464453.5k wrote:

Hi all,

I have a bam file with paired-end reads of relatively uneven coverage, and I want to produce a wiggle plot with the sliding window of insert sizes. Is there a tool for this?

bam insert size • 943 views
ADD COMMENTlink modified 2.6 years ago by Devon Ryan93k • written 2.6 years ago by 141341254653464453.5k

In the somewhat likely event that there's nothing prewritten to do this, I expect something can be thrown together with the deepTools python API (I can try to throw something together if you'd like).

ADD REPLYlink written 2.6 years ago by Devon Ryan93k

That would be great!

ADD REPLYlink written 2.6 years ago by 141341254653464453.5k
2
gravatar for Pierre Lindenbaum
2.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum124k wrote:

generate a bed of insert size from your bam

samtools view -f 66 in.bam | awk -F '\t' '{printf("%s\t%d\t%s\t%s\n",$3,(int($4)<int($8)?int($4):int($8))-1,int($4)<int($8)?$8:$4,int($9)<0?-$9:$9);}' | LC_ALL=C sort -t$'\t' -k1,1 -k2,2n > f1.bed

create sliding windows from your reference

$  bedtools  makewindows -g ref.fasta.fai  -w 10000000 -s 5000000  | LC_ALL=C sort -t$'\t' -k1,1 -k2,2n > f2.bed

compute the intersection between the two bed file, cut window-chr/window-start/window-end/segment-length and group by the mean of the segment-length:

bedtools intersect -wa -wb -a f2.bed -b f1.bed | cut -f 1,2,3,7 |   bedtools groupby -i - -g 1,2,3 -c 4  -o mean > f3.bed

f3 is a bedgraph that you can convert to a wiggle file using the UCSC tools.

ADD COMMENTlink written 2.6 years ago by Pierre Lindenbaum124k
2
gravatar for Devon Ryan
2.6 years ago by
Devon Ryan93k
Freiburg, Germany
Devon Ryan93k wrote:

Pierre's answer may end up being easier, but since I was a bit curious how easy/hard this would be to do with the deepTools API (it was very non-trivial, but it'll be quicker than with samtools):

Change the input and output names or add in argparse if you really want to reuse it. The code contains a lot of copy/paste from other deepTools functions, which is why it's a bit of a mess.

ADD COMMENTlink modified 2.6 years ago • written 2.6 years ago by Devon Ryan93k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1665 users visited in the last hour