Question: vcf files: counting number of variants in genomic windows of chosen size
2.6 years ago, spiral01 wrote:

Is there a tool to count the number of variants in each genomic window of user-designated size? Something that would work along the lines of vcftools --TajimaD which takes as argument the size of the window you would like and then calculates Tajima's D in each window. I would like to simply count the number of variants in each window.

2.6 years ago, harold.smith.tarheel (United States) wrote:

See the earlier thread "Plotting SNP density along a chromosome from VCF files".


Apologies for the repeat question. That is exactly what I was after. Thank you.

Reply written 2.6 years ago by spiral01
2.6 years ago, Alex Reynolds (Seattle, WA USA) wrote:

Via BEDOPS:

$ bedmap --echo --count genes.bed <(vcf2bed < variants.vcf) > answer.bed

If your genes are in another format, say GFF:

$ bedmap --echo --count <(gff2bed < genes.gff) <(vcf2bed < variants.vcf) > answer.bed

If you have generic windows, replace genes.bed with a windows.bed of your design.
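
As a rough sketch of one way to build such a windows.bed, the snippet below cuts each chromosome into fixed-size, non-overlapping windows with awk. The file chrom.sizes, its single chr21 entry of length 2500, and the window size of 1000 are made-up values for illustration only; substitute the real chromosome lengths for your assembly and the window size you want.

```shell
# Hypothetical chromosome sizes (name <TAB> length); replace with your own.
printf 'chr21\t2500\n' > chrom.sizes

# Window size in bases; an illustrative value, not a recommendation.
window=1000

# Emit 0-based, half-open BED intervals that tile each chromosome.
awk -v w="$window" 'BEGIN { OFS = "\t" }
{
    for (start = 0; start < $2; start += w) {
        end = start + w
        if (end > $2) end = $2   # clip the last window at the chromosome end
        print $1, start, end
    }
}' chrom.sizes > windows.bed
```

bedmap expects sorted input, so if chrom.sizes lists several chromosomes, run the result through sort-bed before passing it to bedmap.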


Hi, I am getting "Segmentation fault: 11" when I run the first bedmap command like this:

bedmap --echo --count windows.bed <(vcf2bed < chr21.vcf.gz) > chr21.coverage.txt

The output gives a count of 0 for every window, and I'm not sure how to interpret this.

Reply written 2.6 years ago by spiral01

The file chr21.vcf.gz is not a plain-text VCF file; it is gzip-compressed binary data. Decompress it and pipe the decompressed data to vcf2bed, e.g.:

$ bedmap --echo --count --delim '\t' windows.bed <(gunzip -c chr21.vcf.gz | vcf2bed -) > windows_with_counts_of_variants.bed

Interpretation: if all the variants in chr21.vcf.gz are on chr21, then any of your windows that are not on chr21 will have a count of zero.
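
One quick way to check for this, sketched with standard Unix tools only, is to list the chromosome names in each file and see which ones have windows but no variants. The files demo_windows.bed and demo.vcf.gz below are tiny made-up stand-ins so the sketch runs on its own; point the commands at your real windows.bed and chr21.vcf.gz instead.

```shell
# Made-up stand-in inputs (hypothetical names and contents); replace with your files.
printf 'chr20\t0\t1000\nchr21\t0\t1000\n' > demo_windows.bed
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\nchr21\t150\t.\tA\tG\t.\t.\t.\n' \
    | gzip -c > demo.vcf.gz

# Chromosome names that appear in the windows file.
cut -f1 demo_windows.bed | sort -u > window_chroms.txt

# Chromosome names that appear in the VCF records (header lines start with '#').
gunzip -c demo.vcf.gz | grep -v '^#' | cut -f1 | sort -u > variant_chroms.txt

# Chromosomes that have windows but no variants; their windows will count 0.
comm -23 window_chroms.txt variant_chroms.txt
```

With the stand-in inputs this prints chr20, the only chromosome that has a window but no variant records.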

Reply written 2.6 years ago by Alex Reynolds (modified 2.6 years ago)