Masking sites in a vcf file
5 months ago
peter ▴ 10

I am new to bioinformatics, so I apologize if my question is too basic. I am following this article to filter my VCF file from the 1000 Genomes Project:

see method section 'Application to human chromosome 18 from the 1000 Genomes CEU sample'

In that section they say that they masked all sites flagged by the 1000 Genomes Project as unfit for population genetic analyses. The 1000 Genomes Project provides those sites in both BED and FASTA format. The link is as follows:

1000Genomes masked sites

My question is: how can I use the BED or FASTA files provided to mask those sites in a VCF file? Is there a tool that does this? Any insights will be appreciated.

VCF masker 1000Genomes SNP repeat • 243 views

I think you would normally do the opposite, i.e. mask a FASTA file using a VCF file. You mask a FASTA file (usually a reference genome) at the variant positions in a VCF file so that any downstream analysis software avoids those sites. This should help: https://bedtools.readthedocs.io/en/latest/content/tools/maskfasta.html


Thank you @prasundutta87 for your reply! I came across vcftools, which I think I can use. I'll try that first.
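
For anyone landing here with the same question: vcftools documents `--bed` and `--exclude-bed` options for keeping or dropping sites by BED interval, which is probably the simplest route. To make the underlying logic explicit, here is a minimal pure-Python sketch of filtering VCF records against a BED file. The function names (`load_bed`, `in_bed`, `filter_vcf`) and the `keep_inside` flag are made up for illustration, and whether the mask BED lists the regions to keep or the regions to drop is an assumption you should check against the mask's README. It also assumes chromosome names match between the two files (e.g. `18` vs `chr18`).

```python
import bisect
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into {chrom: (starts, ends)} with overlapping intervals merged.

    BED coordinates are 0-based and half-open: [start, end).
    """
    raw = defaultdict(list)
    for line in lines:
        # Skip blank lines and optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw[chrom].append((int(start), int(end)))

    merged = {}
    for chrom, ivs in raw.items():
        ivs.sort()
        out = []
        for s, e in ivs:
            if out and s <= out[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                out[-1][1] = max(out[-1][1], e)
            else:
                out.append([s, e])
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def in_bed(intervals, chrom, pos):
    """Return True if the 1-based VCF position `pos` lies inside a BED interval."""
    if chrom not in intervals:
        return False
    starts, ends = intervals[chrom]
    zero_based = pos - 1  # VCF POS is 1-based, BED is 0-based
    i = bisect.bisect_right(starts, zero_based) - 1
    return i >= 0 and zero_based < ends[i]


def filter_vcf(vcf_lines, intervals, keep_inside=True):
    """Yield header lines plus the records inside (or outside) the BED intervals.

    keep_inside=True  -> BED lists the regions to keep
    keep_inside=False -> BED lists the regions to mask out
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos = line.split("\t", 2)[:2]
        if in_bed(intervals, chrom, int(pos)) == keep_inside:
            yield line
```

To use it, open the BED file and pass the file object to `load_bed`, then iterate over `filter_vcf(vcf_file, intervals, keep_inside=...)` and write each yielded line to the output. For compressed inputs, `gzip.open(path, "rt")` works for gzip or bgzip files. This only looks at the POS column, so a long indel that starts outside an interval but extends into it is not handled.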
