Question: Calculating effective coverage/callable genome for variant calling from a bam file
0
gravatar for Rubal
19 months ago by
Rubal340
Germany
Rubal340 wrote:

We have mapped whole-genome sequence data in bam format. We are testing a variety of variant callers to identify mutant sites. The variant callers we are using have various different filters that results in sites being excluded from variant calling. For example some variant callers require >=3 reads to call a variant. Another excluded reads with >3 mismatches to the reference genome.

We would like to know what is the effective coverage that a variant caller sees for variant calling, given that many sites and reads are excluded by such filters. This would enable us to calculate our sensitivity to call variants across the genome and identify regions where no variants can be called.

We think the best way to do this would be a script or tool that reads a bam file and outputs a bigWig file with the effective coverage across the genome based on applying these filters. This could be adapted with new filters depending on the variant caller. Is there an existing tool that can take a bam file and output such information based on the two filters of minimum read depth >X and number of mismatches in a read < X?

Thanks in advance for any help and suggestions.

ADD COMMENTlink modified 19 months ago by Pierre Lindenbaum134k • written 19 months ago by Rubal340
1
gravatar for Pierre Lindenbaum
19 months ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum134k wrote:

Is there an existing tool that can take a bam file and output such information based on the two filters of minimum read depth >X and number of mismatches in a read < X?

create a first bed file with the coverage depth >X

samtools depth in.bam | awk -F '\t' '{if($3>10) printf("%s\t%d\t%s\n",$1,int($2)-1,$2);}' | bedtools merge > 1.bed

create a second bed file with NM < X

 java -jar dist/samjdk.jar --samoutputformat BAM  -e 'return !record.getReadUnmappedFlag() && (record.getIntegerAttribute("NM")==null || record.getIntegerAttribute("NM") < 10);' src/test/resources/S1.bam | bedtools  bamtobed > 2.bed

combine both bed files with 'bedtools intersect'

ADD COMMENTlink written 19 months ago by Pierre Lindenbaum134k

Thanks very much Pierre, I'll give this a go! For clarificaiton will this only result in a list of regions or will it also give the effective coverage at each site?

ADD REPLYlink modified 19 months ago • written 19 months ago by Rubal340
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2078 users visited in the last hour
_