Question: Calculating effective coverage/callable genome for variant calling from a bam file
0
gravatar for Rubal
7 weeks ago by
Rubal250
Germany
Rubal250 wrote:

We have mapped whole-genome sequence data in bam format. We are testing a variety of variant callers to identify mutant sites. The variant callers we are using have various different filters that results in sites being excluded from variant calling. For example some variant callers require >=3 reads to call a variant. Another excluded reads with >3 mismatches to the reference genome.

We would like to know what is the effective coverage that a variant caller sees for variant calling, given that many sites and reads are excluded by such filters. This would enable us to calculate our sensitivity to call variants across the genome and identify regions where no variants can be called.

We think the best way to do this would be a script or tool that reads a bam file and outputs a bigWig file with the effective coverage across the genome based on applying these filters. This could be adapted with new filters depending on the variant caller. Is there an existing tool that can take a bam file and output such information based on the two filters of minimum read depth >X and number of mismatches in a read < X?

Thanks in advance for any help and suggestions.

ADD COMMENTlink modified 7 weeks ago by Pierre Lindenbaum123k • written 7 weeks ago by Rubal250
1
gravatar for Pierre Lindenbaum
7 weeks ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum123k wrote:

Is there an existing tool that can take a bam file and output such information based on the two filters of minimum read depth >X and number of mismatches in a read < X?

create a first bed file with the coverage depth >X

samtools depth in.bam | awk -F '\t' '{if($3>10) printf("%s\t%d\t%s\n",$1,int($2)-1,$2);}' | bedtools merge > 1.bed

create a second bed file with NM < X

 java -jar dist/samjdk.jar --samoutputformat BAM  -e 'return !record.getReadUnmappedFlag() && (record.getIntegerAttribute("NM")==null || record.getIntegerAttribute("NM") < 10);' src/test/resources/S1.bam | bedtools  bamtobed > 2.bed

combine both bed files with 'bedtools intersect'

ADD COMMENTlink written 7 weeks ago by Pierre Lindenbaum123k

Thanks very much Pierre, I'll give this a go! For clarificaiton will this only result in a list of regions or will it also give the effective coverage at each site?

ADD REPLYlink modified 7 weeks ago • written 7 weeks ago by Rubal250
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1175 users visited in the last hour