Keeping only SNPs
3.9 years ago
safiq713 ▴ 10

Dear all,

I need help. I have filtered SNPs in VCF format and want to extract only the informative sites (SNPs) to FASTA format. How can I do that?

Note that I am working with ddRAD data and want to extract only the informative sites to build a tree. Do you have any suggestions? Thanks a lot for your kind help.

Kind regards, Safi


Can you filter the SNPs you want? You can then extract the FASTA by building a BED file with the desired coordinates (awk works for that) and using bedtools to pull out the corresponding regions.


I filtered my SNPs with vcftools. Do you have an example I could look at? Thanks a lot for your kind reply.

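# plain bash; $vcf and $genome hold the paths to the VCF and the reference FASTA
# BED starts are 0-based, so POS-101 gives a full 100 bp upstream (clamped at 0)
awk 'BEGIN{OFS="\t"} !/^#/ {s=$2-101; if (s<0) s=0; print $1,s,$2+100,$1"-"$2"-"$3"-"$4"-"$5}' "$vcf" > "${vcf%.vcf}.bed"
bedtools getfasta -name -fi "$genome" -bed "${vcf%.vcf}.bed" > "${vcf%.vcf}_seq.fa"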

This gives you 100 bp upstream and downstream of each SNP. The sequence name will also include the chromosome and position, followed by the ID, REF and ALT fields.
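To see what the BED step produces before running bedtools, here is a small sketch on a toy VCF. The file names (toy.vcf, toy.bed) and the two records are made up for illustration. It assumes the usual conventions: VCF positions are 1-based and BED intervals are 0-based and half-open, so the start is POS-101 (clamped at 0 for SNPs near the start of a contig) and the end is POS+100.

```shell
# Toy VCF with made-up records: one SNP far from the contig start, one close to it.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n' > toy.vcf
printf 'chr1\t500\trs1\tA\tG\n' >> toy.vcf
printf 'chr1\t40\t.\tC\tT\n' >> toy.vcf

# Skip header lines, then write chrom, start, end and a name built from
# CHROM-POS-ID-REF-ALT. A negative start is clamped to 0.
awk 'BEGIN{OFS="\t"} !/^#/ {
    s = $2 - 101; if (s < 0) s = 0
    print $1, s, $2 + 100, $1"-"$2"-"$3"-"$4"-"$5
}' toy.vcf > toy.bed

cat toy.bed
# chr1  399  600  chr1-500-rs1-A-G
# chr1  0    140  chr1-40-.-C-T
```

The first interval is 201 bp long (100 bp on each side plus the SNP itself). The second is shorter because the SNP sits only 39 bp from the start of the contig. The end coordinate is not checked against the contig length here.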
