Extracting specific SNPs from VCF
4.6 years ago
igor • 0

Hi,

How should I extract specific SNPs from a VCF file (30x)? What is the standard way of doing this? Which tool(s) should I use?

I have a file with the SNP IDs (e.g. rs964284, ...) that I would like to extract, and the 30x VCF, which is not annotated.

Right now I would impute the VCF (Beagle), annotate it (VariantAnnotator), and then extract the SNPs.

Is there a tool that does it all in one go?

Thanks.

vcf gatk SNP • 1.3k views
4.6 years ago

Is there a tool that does it all in one go?

Not really, but you could develop one yourself if you want. The closest thing is GATK, because it has a BeagleCodec, but you would still need to run Beagle first.

For imputation, there is also IMPUTE2 or SHAPEIT2+IMPUTE2. For annotation, you can use bcftools, SnpEff, or, indeed, GATK's VariantAnnotator. For extracting the SNPs, use bcftools filter, bcftools view, GATK, or just your own shell scripts (awk works well), or even Python.
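As a rough illustration of the "or even Python" route for the extraction step, here is a minimal sketch. It assumes the VCF has already been annotated so that the ID column (third column) holds rsIDs; in an unannotated VCF that column is usually "." and nothing would match. The function names (load_ids, filter_vcf_lines, open_vcf) and the file names in the usage note are made up for the example.

```python
import gzip
import sys


def load_ids(path):
    """Read one SNP ID per line (e.g. rs964284), skipping blank lines."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def filter_vcf_lines(lines, wanted_ids):
    """Yield header lines plus the records whose ID column matches wanted_ids."""
    for line in lines:
        if line.startswith("#"):
            # Keep meta-information and the column header untouched.
            yield line
            continue
        fields = line.split("\t", 3)
        if len(fields) < 3:
            continue  # malformed record, skip it
        # The ID column may hold several IDs separated by semicolons.
        record_ids = fields[2].strip().split(";")
        if wanted_ids.intersection(record_ids):
            yield line


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF for reading as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


if __name__ == "__main__" and len(sys.argv) == 3:
    wanted = load_ids(sys.argv[1])
    with open_vcf(sys.argv[2]) as vcf:
        sys.stdout.writelines(filter_vcf_lines(vcf, wanted))
```

Saved as, say, extract_snps.py, this would be run as: python extract_snps.py snp_ids.txt annotated.vcf.gz > subset.vcf. It streams the file line by line, so memory use depends on the size of the ID list rather than on the size of the VCF.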

Kevin


Kevin, thanks for the reply. I could write such an app. How should I think about imputation and coverage? Should I run imputation on the 30x VCF, or should I assume that whatever is not in the 30x VCF equals the reference? Thanks.
