Question: How to filter .vcf based on .gbk file to remove SNP calls in non-CDS regions?
13 months ago, goatsrunfaster wrote:

I have a VCF file with multiple individuals mapped to a reference. I would like to filter the VCF file so that it only includes SNPs from CDS regions. I have a GenBank (.gbk) file from NCBI for the reference, which includes the CDS regions. Is there a simple way to do this? I can't seem to find any resources related to this type of filtering.

Additionally, once this filtering is complete, I would like to filter out synonymous SNPs from the VCF, so that I am left with only non-synonymous SNPs in coding regions in my final VCF file.

Tags: snp, genome
13 months ago, Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote:

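Convert the GenBank file to a snpEff database: http://snpeff.sourceforge.net/SnpEff_manual.html#databases

Annotate the VCF with snpEff.

Filter the annotated VCF with SnpSift, keeping only the effects you want (for example, missense_variant for non-synonymous coding SNPs).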
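To make the last step concrete, here is a rough plain-Python sketch of what the filtering amounts to once snpEff has added its ANN field to the INFO column. It is only a stand-in for illustration, not SnpSift itself. The set of effect terms counted as non-synonymous, the function names, and the tiny demo records (with shortened ANN entries) are all assumptions made up for this example.

```python
import io

# Effect terms treated as non-synonymous here. This choice is an assumption;
# adjust the set to match what you consider non-synonymous.
NONSYN_EFFECTS = {"missense_variant", "stop_gained", "stop_lost", "start_lost"}


def ann_effects(info):
    """Return the set of effect terms found in the ANN entry of a VCF INFO string."""
    effects = set()
    for field in info.split(";"):
        if field.startswith("ANN="):
            # One comma-separated annotation per allele/transcript combination.
            for ann in field[4:].split(","):
                parts = ann.split("|")
                if len(parts) > 1:
                    # Second sub-field is the effect; combined effects are joined by '&'.
                    effects.update(parts[1].split("&"))
    return effects


def filter_nonsynonymous(lines):
    """Yield header lines plus records with at least one non-synonymous effect."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 8 and ann_effects(cols[7]) & NONSYN_EFFECTS:
            yield line


# Made-up demo records; real ANN entries carry many more sub-fields.
demo = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\t.\tA\tG\t50\tPASS\tANN=G|missense_variant|MODERATE|geneA\n"
    "chr1\t200\t.\tC\tT\t50\tPASS\tANN=T|synonymous_variant|LOW|geneA\n"
    "chr1\t300\t.\tG\tA\t50\tPASS\tANN=A|intergenic_region|MODIFIER|geneA-geneB\n"
)
kept = list(filter_nonsynonymous(io.StringIO(demo)))
print("".join(kept), end="")
```

Because intergenic and synonymous records never carry one of the listed effects, this single pass covers both parts of the question: restricting to coding regions and dropping synonymous SNPs.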


Fantastic, thank you!

Reply written 13 months ago by goatsrunfaster