bam-readcount does not accept Varscan native format for -l (feature request)

7.7 years ago

Stephane Plaisance ▴ 460

Dear,

I painfully discovered that the bam-readcount tutorial in the Curr Protoc Bioinformatics Varscan 2013 paper is not correct: one cannot pass Varscan results directly but must instead provide a 1-based pseudo-BED file as -l input.

I find this not very productive: for indels, generating such a file is not straightforward because the length of the inserted or deleted sequence varies (I read the tip in #93418, which produces an arbitrary file padded by +/- 2 bps, but this is not very clean).

Would it be possible to add an extra argument alongside -l that loads Varscan native data (not VCF) directly and adapts the coordinates on the fly inside bam-readcount to produce the correct footprint?
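Until such an option exists, the conversion could be sketched outside bam-readcount with awk. This is only a rough sketch under stated assumptions, not a tested recipe: it assumes tab-separated Varscan native output with a header line starting with "chrom", the chromosome in column 1, a 1-based position in column 2, and the variant allele in column 4, with indels written as +SEQ (insertion) or -SEQ (deletion) and anchored at the base preceding the event. The function name varscan_to_regions is hypothetical.

```shell
# Hypothetical helper: convert Varscan native calls (file given as $1) into
# a 1-based "chrom start end" region list suitable for bam-readcount -l.
# Assumed columns: 1 = chrom, 2 = 1-based position, 4 = variant allele.
varscan_to_regions() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
         $1 == "chrom" { next }                 # skip the header line
         {
             s = $2; e = $2                     # SNV: a single position
             first = substr($4, 1, 1)
             if (first == "-") {
                 e = $2 + length($4) - 1        # deletion: cover the deleted bases
             } else if (first == "+") {
                 e = $2 + 1                     # insertion: anchor base and the next base
             }
             print $1, s, e
         }' "$1"
}
```

The output can be redirected to a file and passed to bam-readcount with -l. Whether the anchor base alone is enough for your version of bam-readcount to report the indel should be checked on a few known calls.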

Thanks in advance, Stephane

varscan2 • 1.6k views

# create files with a window around the variant positions in column 2 (-1/+2 bps for SNPs, -9/+10 bps for indels)
gawk 'BEGIN{FS="\t"; OFS="\t"}{print $1, $2-1, $2+2}' varscan_somatic_mpileup_normal-tumor.snp.bed > varscan_somatic_mpileup_normal-tumor.snp.pm2.bed
gawk 'BEGIN{FS="\t"; OFS="\t"}{print $1, $2-9, $2+10}' varscan_somatic_mpileup_normal-tumor.indel.bed > varscan_somatic_mpileup_normal-tumor.indel.pm10.bed
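The same padding idea can be written once with the pad size as a parameter and with the start clamped at 1, so that variants near the beginning of a contig do not yield zero or negative coordinates. This is a minimal sketch, assuming tab-separated input with the chromosome in column 1 and a 1-based position in column 2; pad_positions is a hypothetical name, and the padding here is symmetric, unlike the windows above.

```shell
# Hypothetical helper: pad_positions FILE PAD
# Prints "chrom start end" with PAD bases on each side of the column-2 position.
pad_positions() {
    awk -v pad="$2" 'BEGIN { FS = "\t"; OFS = "\t" }
         {
             s = $2 - pad
             if (s < 1) s = 1          # do not run off the start of the contig
             print $1, s, $2 + pad
         }' "$1"
}
```

For example, pad_positions varscan_somatic_mpileup_normal-tumor.indel.bed 10 would write the padded regions to standard output. The end coordinate is not clamped, since that would need the contig lengths.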

ADD REPLY • link 7.7 years ago by Stephane Plaisance ▴ 460