plink/GCTA: How to remove multiple SNPs and a range around them
17 months ago

Hi!

I have a list of around 200 SNPs that I want to remove from my plink file. I also want to remove about 1 MB around each of these SNPs. My problem is: the --exclude flag only removes the listed SNPs themselves, while --exclude-snp with --window lets me remove a SNP and the window around it, but only for a single SNP.

I would also be open to using GCTA and its --exclude-region-snp flag, if that can be made to remove more than one SNP region.

Thanks.

SNP plink genome GCTA
17 months ago

While it may be possible to do this with plink alone, that involves some obscure functionality (--make-set-border, --gene-all, ...). I recommend creating a BED file (https://genome.ucsc.edu/FAQ/FAQformat.html#format1) with the positions of your variants, using bedtools or a short script to add 1 MB windows, and then running plink --bfile ... --exclude range <windowed BED filename> --make-bed to perform the filtering operation. Each step of this workflow is more likely to come up again in the future than the obscure plink flags are.

Note that BED files are supposed to use 0-based interval coordinates, instead of the 1-based coordinates in VCF and .bim files. However, the difference doesn't really matter when we're talking about 1 MB windows.
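
For the "short script" step, here is a minimal Python sketch. It is an illustration under stated assumptions, not a tested pipeline: the function names and file names are hypothetical, the SNP list is assumed to have one variant ID per line, and "1 MB around each SNP" is read as 1,000,000 bp on each side (halve FLANK_BP if you mean 1 MB in total). It looks up each SNP's position in the .bim file and writes one chromosome/start/end/label line per SNP.

```python
FLANK_BP = 1_000_000  # assumption: 1 MB on EACH side of the SNP


def build_ranges(bim_rows, snp_ids, flank=FLANK_BP):
    """Return (chrom, start, end, label) tuples for the requested SNPs.

    bim_rows: iterable of .bim fields (chrom, variant ID, cM, bp, a1, a2).
    The start is clamped at 0; the 0-based vs 1-based offset is ignored
    because it is negligible next to a 1 MB window.
    """
    wanted = set(snp_ids)
    ranges = []
    for fields in bim_rows:
        chrom, snp_id, pos = fields[0], fields[1], int(fields[3])
        if snp_id in wanted:
            ranges.append((chrom, max(0, pos - flank), pos + flank, snp_id))
    return ranges


def write_ranges(bim_path, snp_list_path, out_path, flank=FLANK_BP):
    """Read a .bim file and a SNP list, write the windowed range file."""
    with open(snp_list_path) as fh:
        snp_ids = [line.strip() for line in fh if line.strip()]
    with open(bim_path) as fh:
        bim_rows = [line.split() for line in fh if line.strip()]
    with open(out_path, "w") as out:
        for chrom, start, end, label in build_ranges(bim_rows, snp_ids, flank):
            out.write(f"{chrom}\t{start}\t{end}\t{label}\n")
```

Something like write_ranges("mydata.bim", "snps.txt", "windows.bed") (all three file names are placeholders) would produce the file to pass to --exclude range. Overlapping windows are left unmerged here, which should not matter for exclusion.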