Extract Regions Of Genome That Have 0 Coverage
1
2
Entering edit mode
10.7 years ago
stolarek.ir ▴ 700

Hi all

I am validating indels, that I call myself against 1000 genome vcf.

I am interested by using my bam file and reference sequence in extracting regions that have 0 coverage.

Is there an easy way to do it?

bam samtools coverage • 4.4k views
ADD COMMENT
1
Entering edit mode
10.7 years ago

i just a wrote a picard/java based tool to answer that question. It produces a bed file:

See

https://github.com/lindenb/jvarkit#biostar78285

https://github.com/lindenb/jvarkit/blob/master/src/main/java/com/github/lindenb/jvarkit/tools/biostar/Biostar78285.java

$ java -jar dist/biostar78285.jar I=input.bam VALIDATION_STRINGENCY=LENIENT | head
chr1    300    330
chr1    468    571
chr1    626    726
chr1    776    816
chr1    867    990
chr1    1158    1418
chr1    1521    1538
chr1    11241    11243
chr1    11456    11470
chr1    11621    11623
ADD COMMENT
0
Entering edit mode

if read contains deletions, does your tool falls in the same trap as bed tools, which reports deletion as 0 coverage region?

ADD REPLY
0
Entering edit mode

I've added an option USECIGAR to my code: if set, the program will scan the cigar string and detect the deletions in reference. (slower & requires more memory )

ADD REPLY
1
Entering edit mode

cool thanks. It's nice to be an inspiration for writing new tool. A bot of background behind why I am doing this. I am assessing new data type - moleculo in picking up bigger indels (currently I am at 2000 bp of good quality indels). To assess more or less how good moleculo data is I wanted to see how many of indels from 1000 genome vcf file it can pick. First result was moderate, so i went into finding why. Turned out pretty quickly that the culprit is 20% lack of coverage in my genome (plus heterozygous events, for which I need even more coverage). So I filtered out 0 coverage indels positions from 1000 genome vcf file, and then did the correct assessment

ADD REPLY

Login before adding your answer.

Traffic: 3182 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6