For bam and mutation signature I need coverage and number of observations
2
1
Entering edit mode
8.1 years ago
kulvait ▴ 270

Hi, I have variant calling pipeline for somatic mutations. I am trying to uncover mutations on low allele balance frequencies.

When I find mutation in particular sample, I would like to see if it is present in other samples. Unfortunately just comparing vcf files does not do the trick because of many possible filtering steps or lack of some calls. Thus for given signature of the mutation from vcf file I would like to obtain info regarding this particular mutation in number of bam files. In particular I am interested the most what is the coverage at the position of the mutation and what is the number of reads supporting the mutation.

For example I have 1.bam, 2.bam, 3.bam. Based on particular criteria (allele balance > 0.1, coverage > 500, allele observations > 10) i produce 1.vcf, 2.vcf, 3.vcf. Now when there is some variant in 1.bam based on these criteria, then in 2.bam there can be this mutation with allele balance 0.09, coverage 1000 and 9 allele observations (and it does not make it to 2.vcf) while in 3.bam there is 0 coverage and no allele observation. I would like to somehow extract this info about signatures of the mutations in 1.vcf in files 2.bam and 3.bam. What tool should I use and is there one?

Thanks, Vojtěch.

next-gen vcf bam • 2.6k views
ADD COMMENT
0
Entering edit mode

Thanks Pierre, at least for SNP detection this is a good solution. Unfortunately it does work only for my unprocessed bam files. After processing them further by picard/GATK there is error

Nov 10, 2016 11:54:30 AM com.github.lindenb.jvarkit.tools.misc.FindAllCoverageAtPosition scan

SEVERE: htsjdk.samtools.SAMFormatException: Invalid GZIP header
htsjdk.samtools.SAMFormatException: Invalid GZIP header
    at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:88)
    at htsjdk.samtools.util.BlockGunzipper.unzipBlock(BlockGunzipper.java:63)
    at htsjdk.samtools.util.BlockCompressedInputStream.inflateBlock(BlockCompressedInputStream.java:410)
    at htsjdk.samtools.util.BlockCompressedInputStream.readBlock(BlockCompressedInputStream.java:392)
    at htsjdk.samtools.util.BlockCompressedInputStream.seek(BlockCompressedInputStream.java:300)
    at htsjdk.samtools.BAMFileReader$BAMFileIndexIterator.getNextRecord(BAMFileReader.java:798)
    at htsjdk.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:656)
    at htsjdk.samtools.BAMFileReader$BAMFileIndexIterator.<init>(BAMFileReader.java:785)
    at htsjdk.samtools.BAMFileReader.createIndexIterator(BAMFileReader.java:762)
    at htsjdk.samtools.BAMFileReader.query(BAMFileReader.java:434)
    at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:499)
    at htsjdk.samtools.SamReader$PrimitiveSamReaderToSamReaderAdapter.query(SamReader.java:504)
    at com.github.lindenb.jvarkit.tools.misc.FindAllCoverageAtPosition.scan(FindAllCoverageAtPosition.java:225)
    at com.github.lindenb.jvarkit.tools.misc.FindAllCoverageAtPosition.call(FindAllCoverageAtPosition.java:441)
    at com.github.lindenb.jvarkit.tools.misc.FindAllCoverageAtPosition.call(FindAllCoverageAtPosition.java:62)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMainWithExceptions(Command.java:551)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMain(Command.java:608)
    at com.github.lindenb.jvarkit.util.command.Command.instanceMainWithExit(Command.java:614)
    at com.github.lindenb.jvarkit.tools.misc.FindAllCoverageAtPosition.main(FindAllCoverageAtPosition.java:474)

I can send you my bam file since I have no clue what is wrong. When i use samtools view it normally shows the reads.

Kind regards, Vojtech.

ADD REPLY
0
Entering edit mode

Please use `ADD COMMENT to reply to earlier answers, as such this thread remains logically structured and easy to follow.

ADD REPLY
0
Entering edit mode

Well, I have plenty of errors of the type

ERROR: Read name 700834F:16:HAYDCADXX:1:1206:13536:58893, Mate not found for paired read

Problem is that GATK tools did some kind of filtering and removed some reads without their mates. So I probably need to do that manually. Or is there some tool to do this?

ADD REPLY
0
Entering edit mode

my tool would ignore this kind of problem, please run CheckTerminatorBlock too.

ADD REPLY
1
Entering edit mode

Well, thank you I found the problem. There was files together with file.bam in directory

file.bam 
file.bai
file.bam.bai

Index file file.bam.bai contained up to date index while the file.bai contained outdated index. I needed to remove incorrect file.bai and then the program works with correct index file.bam.bai,

thank you Pierre.

ADD REPLY
1
Entering edit mode
8.1 years ago

I've writtern a progam "FindAllCoverageAtPosition" https://github.com/lindenb/jvarkit/wiki/FindAllCoverageAtPosition

It scans a list of bam and produces a summary for a given position.

$ find ./  -type f -name "*.bam" |\
   java -jar dist/findallcoverageatposition.jar -p "chr2:1234"
ADD COMMENT
0
Entering edit mode
7.3 years ago
igvtools count -w 1 --bases --query chr1:100-1000 input.bam output.wig hg38
ADD COMMENT

Login before adding your answer.

Traffic: 1965 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6