Question: Masking Variable Sites in a Fasta File
4.0 years ago by
United States
Jautis270 wrote:

Hi, I have a FASTA file representing a reference genome, and I would like to mask its variable sites before mapping reads to it. I'm interested in doing this because I have bisulfite reads from several related species, but BSmap and Bismark don't offer an option to mask variable sites while mapping.

The initial genome is in a FASTA file. The sites I would like masked are in a VCF file.


Thank you!

4.0 years ago by
Matt Shirley9.0k
Cambridge, MA
Matt Shirley9.0k wrote:

If you can convert your VCF to BED format (vcf2bed, used in the command below, is one way), you can use the pyfaidx "faidx" command to mask your FASTA file with a special character ("-m"), or as lowercase letters ("-M"):

vcf2bed < variable_sites.vcf | faidx genome.fasta --bed - -m

Note that the "-m" and "-M" options will modify your FASTA file in-place, so you probably want to make a copy first.
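
If you would rather not touch the original file at all, the same idea can be sketched in plain Python. This is not how pyfaidx does it internally, just a minimal stand-alone illustration using only the standard library. It assumes a simple tab-delimited VCF, masks every base covered by the REF allele with "N", and writes to a new file. The function names and the output path are made up for this example.

```python
from collections import defaultdict


def read_vcf_sites(vcf_lines):
    """Collect 0-based positions to mask, per chromosome, from VCF lines."""
    sites = defaultdict(set)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        # VCF POS is 1-based; cover every reference base spanned by REF.
        for offset in range(len(ref)):
            sites[chrom].add(pos - 1 + offset)
    return sites


def mask_fasta(fasta_lines, sites, mask_char="N"):
    """Yield FASTA lines with the listed 0-based positions replaced."""
    chrom, offset = None, 0
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Sequence name is the first word of the header line.
            name = line[1:].split()
            chrom, offset = (name[0] if name else ""), 0
            yield line
            continue
        to_mask = sites.get(chrom, ())
        bases = list(line)
        for i in range(len(bases)):
            if offset + i in to_mask:
                bases[i] = mask_char
        offset += len(bases)  # track position across wrapped lines
        yield "".join(bases)


def mask_files(fasta_path, vcf_path, out_path, mask_char="N"):
    """Write a masked copy of fasta_path to out_path (hypothetical helper)."""
    with open(vcf_path) as vcf:
        sites = read_vcf_sites(vcf)
    with open(fasta_path) as src, open(out_path, "w") as dst:
        for out_line in mask_fasta(src, sites, mask_char):
            dst.write(out_line + "\n")
```

Calling mask_files("genome.fasta", "variable_sites.vcf", "genome.masked.fasta") would leave genome.fasta unchanged and keep the original line wrapping in the masked copy. Whether "N" is the right masking character for your bisulfite mapper is something to check before relying on it.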
