Question: Keeping only SNPs
safiq713 wrote:

Dear all,

I need help. I have filtered SNPs in VCF format, and I want to extract only the informative sites (SNPs) to FASTA format. How can I do that?

Note that I am working with ddRAD data. I want to extract only the informative sites to build a tree. Do you have any suggestions? Thanks a lot for your kind help.

Kind regards, Safi

modified 3 months ago • written 3 months ago by safiq713

Can you filter the SNPs you want? Extracting the FASTA can be done by building a BED file with the desired coordinates (you can use awk for that) and then using bedtools to extract the FASTA regions.

written 3 months ago by Asaf

I filtered my SNPs with vcftools. Do you have an example I could look at? Thanks a lot for your kind reply.

written 3 months ago by safiq713
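# VCF POS is 1-based and BED start is 0-based, so start = POS-101 (clamped at 0) and end = POS+100.
awk 'BEGIN{OFS="\t"} !/^#/ {s=$2-101; if (s<0) s=0; print $1, s, $2+100, $1"-"$2"-"$3"-"$4"-"$5}' "$vcf" > "${vcf%.vcf}.bed"
bedtools getfasta -name -fi "$genome" -bed "${vcf%.vcf}.bed" > "${vcf%.vcf}_seq.fa"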

This will give you 100 bp upstream and downstream of each SNP. The sequence name will include the chromosome and position, followed by the ID, REF and ALT fields of the VCF record.
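To see what the BED-building step produces before running bedtools, here is a minimal sketch on a made-up two-SNP VCF. The file name toy.vcf and its records are hypothetical, and the window arithmetic assumes you want 100 bp on each side of the SNP in BED coordinates (0-based start, exclusive end), with the start clamped at 0 for SNPs near the beginning of a contig.

```shell
# Toy VCF (hypothetical data): two SNPs, the first one close to the contig start.
vcf=toy.vcf
printf '##fileformat=VCFv4.2\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$vcf"
printf 'chr1\t50\trs1\tA\tG\t.\tPASS\t.\n' >> "$vcf"
printf 'chr1\t500\trs2\tC\tT\t.\tPASS\t.\n' >> "$vcf"

# Skip header lines, convert the 1-based POS to a 0-based window start,
# clamp negative starts to 0, and name each interval CHROM-POS-ID-REF-ALT.
awk 'BEGIN{OFS="\t"} !/^#/ {s=$2-101; if (s<0) s=0; print $1, s, $2+100, $1"-"$2"-"$3"-"$4"-"$5}' \
    "$vcf" > "${vcf%.vcf}.bed"

# Show the resulting BED file.
cat "${vcf%.vcf}.bed"
```

For the toy input this writes two tab-separated lines to toy.bed: `chr1 0 150 chr1-50-rs1-A-G` and `chr1 399 600 chr1-500-rs2-C-T`. Note that the window end is not clamped to the contig length here, so intervals that run past the end of a sequence may need separate handling before extraction.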

written 3 months ago by Asaf