SNP density with sliding windows
5.9 years ago
YocelynGG ▴ 70

Hello!!

I have been trying to figure out how to count the number of SNPs in sliding windows. I have a tab-separated list with three columns: Scaffold, Scaffold_Length, Number_SNPs (per scaffold)

Scaffold_28 70818817    894731
Scaffold_3  5123947 57985
Scaffold_13 4491039 67622
Scaffold_12 3793473 51663
Scaffold_23 3593776 31841
Scaffold_11 3547442 63973
Scaffold_26 2720936 36018
Scaffold_16 2719413 24318
Scaffold_27 1987753 53938
Scaffold_24 1647859 18408
Scaffold_9  1630703 15792
Scaffold_32 1545880 21094

. . .

Based on the second column, I want to use sliding windows (500 kbp) and count how many SNPs fall within each window. I ran bedtools makewindows, but I have not figured out how to count the SNPs per window to get the SNP density.

Thanks a lot for your help!!

snp genome assembly • 3.4k views
5.9 years ago

One way is to pipe BED-formatted, sliding windows into bedmap, using its --count operator to count the number of SNPs that fall within each window.

The following bedops statement would generate 500knt windows from scaffolds, spaced every 100knt. These windows are passed along to bedmap, which counts the number of SNPs that fall in each of those windows:

$ bedops --chop 500000 --stagger 100000 -x <(awk -vOFS="\t" '{ print $1, 0, $2; }' scaffolds.txt | sort-bed -) | bedmap --echo --count --delim '\t' - <(vcf2bed < snps.vcf) > answer.bed

The result is written to answer.bed. Each line contains a window followed by the number of SNPs that fall within that window.
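If you want to sanity-check the counts without BEDOPS, the same kind of tally can be sketched in plain awk. This is only an illustration, not what bedmap does internally. It assumes the SNPs are already in BED form (scaffold, 0-based start, end), that only full-length windows are reported, and the function name window_counts_awk is made up for this example:

```shell
# Hypothetical helper: count SNPs per sliding window with awk only.
#   $1  scaffold list (tab-separated: name, length, ...)
#   $2  SNPs in BED format (scaffold, 0-based start, end)
#   $3  window size (default 500000), $4  step (default 100000)
window_counts_awk() {
    awk -v OFS='\t' -v w="${3:-500000}" -v s="${4:-100000}" '
        # First file: remember each scaffold length and the input order
        NR == FNR { len[$1] = $2; order[++n] = $1; next }
        # Second file: a SNP at position p belongs to every window
        # starting at k*s with k*s <= p < k*s + w
        ($1 in len) {
            for (k = int($2 / s); k >= 0 && k * s + w > $2; k--)
                count[$1, k * s]++
        }
        # Print every full-length window, including those with zero SNPs
        END {
            for (i = 1; i <= n; i++) {
                c = order[i]
                for (start = 0; start + w <= len[c]; start += s)
                    print c, start, start + w, count[c, start] + 0
            }
        }' "$1" "$2"
}
```

For example, window_counts_awk scaffolds.txt snps.bed > check.bed writes one line per window (scaffold, start, end, count), which can be compared against a few lines of answer.bed.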

5.9 years ago

Use bedtools intersect with the -c option, with your windows as A and your SNPs as B:

-c  For each entry in A, report the number of overlaps with B.
    - Reports 0 for A entries that have no overlap with B.
    - Overlaps restricted by -f and -r.
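A minimal sketch of how this could fit together with bedtools makewindows. The file names (scaffolds.txt, snps.vcf, genome.txt, windows.bed), the function name and the 100 kb step are assumptions for illustration, so adjust them to your data:

```shell
# Sketch only: the function name and all file names are assumed.
snp_window_counts() {
    scaffolds="$1"   # tab-separated: scaffold name, length, ...
    snps="$2"        # SNP calls (VCF or BED)
    # makewindows -g expects a two-column "genome" file: name <TAB> length
    cut -f1,2 "$scaffolds" > genome.txt
    # 500 kb windows (-w) starting every 100 kb (-s);
    # drop -s for non-overlapping windows
    bedtools makewindows -g genome.txt -w 500000 -s 100000 > windows.bed
    # -c appends to each window the number of SNP records overlapping it
    bedtools intersect -a windows.bed -b "$snps" -c
}

# Example call:
# snp_window_counts scaffolds.txt snps.vcf > window_counts.bed
```

Each output line is a window (scaffold, start, end) followed by its SNP count. Windows at the end of a scaffold are clipped to the scaffold length, so they can be shorter than 500 kb.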