bcftools view -r seems to be getting sites outside of the region I designate?
3.9 years ago
curious ▴ 750

my_vcf.vcf:

##fileformat=VCFv4.1
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  SAMPLE
chr22   50482719        chr22:50482719:AACACTCGGGCCCCCGAAG:A    AACACTCGGGCCCCCGAAG     A       .       PASS    AF=7e-05     GT    0|0
chr22   50482725        chr22:50482725:CGGGCCCCCGAAGACA:C       CGGGCCCCCGAAGACA        C       .       PASS    AF=7e-05     GT    0|0
chr22   50482740        chr22:50482740:A:C      A       C       .       PASS    AF=7e-05     GT    0|0

bcftools view -r chr22:50482735-50487454 my_vcf.vcf returns all of these rows, not just the last one. Is this a bug, or am I missing something obvious? I seem to be using the -r switch correctly: -r, --regions chr|chr:pos|chr:beg-end|chr:beg-[,…]

The only other thing I can think of is that bcftools adjusts the position of indels internally. If that is the case, it really hurts my workflow :(

3.9 years ago
Ram 43k

You are correct - bcftools returns all records overlapping the region specified in -r. The REF alleles of the first two entries extend into your region (the 19-base REF at 50482719 ends at 50482737, and the 16-base REF at 50482725 ends at 50482740), so they're returned. What happens when you use -t instead of -r?
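
To make the overlap concrete, here is a small Python sketch of the arithmetic. It is not bcftools code: the helper names `ref_span` and `overlaps` are made up for illustration, and it assumes a record covers POS through POS + len(REF) - 1 in 1-based, inclusive coordinates.

```python
# Records from the question as (POS, REF). Coordinates are 1-based, inclusive.
records = [
    (50482719, "AACACTCGGGCCCCCGAAG"),
    (50482725, "CGGGCCCCCGAAGACA"),
    (50482740, "A"),
]
region_start, region_end = 50482735, 50487454


def ref_span(pos, ref):
    """Return the first and last reference base covered by a record."""
    return pos, pos + len(ref) - 1


def overlaps(pos, ref, start, end):
    """True if any REF base of the record falls inside [start, end]."""
    first, last = ref_span(pos, ref)
    return first <= end and last >= start


for pos, ref in records:
    first, last = ref_span(pos, ref)
    # Print the covered span and whether it touches the requested region.
    print(pos, f"{first}-{last}", overlaps(pos, ref, region_start, region_end))
```

All three records touch the region under this rule, which matches what the question reports for -r.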


bcftools view -r chr22:50482735-50487454 -t chr22:50482735-50487454 my_vcf.vcf

Looks like I can use both together to avoid this issue and still benefit from the tabix index.

Just noticed this in the docs:

Yet another difference between the two is that -r checks both start and end positions of indels, whereas -t checks start positions only.
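
For anyone comparing the two behaviours, here is a rough Python model of the quoted difference. It is only a sketch: `select_like_regions` and `select_like_targets` are hypothetical names, and it assumes -r keeps a record when its REF span overlaps the interval while -t keeps it only when POS itself falls inside.

```python
# Records from the question as (POS, REF). Coordinates are 1-based, inclusive.
records = [
    (50482719, "AACACTCGGGCCCCCGAAG"),
    (50482725, "CGGGCCCCCGAAGACA"),
    (50482740, "A"),
]
start, end = 50482735, 50487454


def select_like_regions(recs, start, end):
    """Model of -r: keep records whose REF span overlaps [start, end]."""
    return [pos for pos, ref in recs
            if pos <= end and pos + len(ref) - 1 >= start]


def select_like_targets(recs, start, end):
    """Model of -t: keep records whose POS lies inside [start, end]."""
    return [pos for pos, ref in recs if start <= pos <= end]


by_regions = select_like_regions(records, start, end)
by_targets = select_like_targets(records, start, end)
# Using -r and -t together corresponds to keeping records that pass both checks.
by_both = [pos for pos in by_regions if pos in by_targets]

print("regions-like:", by_regions)
print("targets-like:", by_targets)
print("both:        ", by_both)
```

In this model only the record at 50482740 survives the combined filter. Newer bcftools releases also document a --regions-overlap option for changing the -r behaviour, so it is worth checking the manual for your installed version.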
