Annotating the two flanking bases of SNPs in a VCF file (perhaps SnpEff?)
1
0
Entering edit mode
7.0 years ago
lm687 ▴ 50

Hi, I was wondering what tool I could use to annotate a vcf and include the two flanking bases, upstream and downstream of each SNP. Ideally I would like to get a count table such as the one from SnpEff when the csvStats option is used, but including three nucleotides instead of just one. At first I thought SnpEff's --upDownStreamLen argument was it, but as far as I'm concerned this is just to define the 'upstream/downstream effect' zone. Thanks in advance.

snp vcf snpeff annotation • 2.3k views
ADD COMMENT
1
Entering edit mode
7.0 years ago

I've quickly written something: http://lindenb.github.io/jvarkit/Biostar251649.html

$ java -jar dist/biostar251649.jar -n 10 -R tests/ref.fa tests/mutations.vcf
##INFO=<ID=SEQ3_10,Number=1,Type=String,Description="Sequence on the 3' of mutation">
##INFO=<ID=SEQ5_10,Number=1,Type=String,Description="Sequence on the 5' of mutation">
(...)
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  S1  S2  S3  S4
rotavirus   51  .   A   G   22.55   .   AC1=2;AF1=0.25;BQB=1;DP=944;DP4=849,0,93,0;FQ=23.7972;G3=0.75,0,0.25;HWE=0.033921;MQ=60;MQ0F=0;MQB=1;PV4=1,1,1,1;RPB=0.993129;SEQ3_10=GATGGTAAGC;SEQ5_10=TCTACTCAGC;SGB=-61.9012;VDB=3.53678e-05    GT:PL   0/0:0,255,134   0/0:0,255,127   0/0:0,255,137   1/1:70,255,0
rotavirus   91  .   A   T   5.45    .   AC1=1;AF1=0.124963;BQB=0.951201;DP=1359;DP4=1134,0,225,0;FQ=5.8713;MQ=60;MQ0F=0;MQB=1;PV4=1,4.80825e-05,1,1;RPB=0.0393173;SEQ3_10=GTTGTTGCTG;SEQ5_10=TTGAAGCTGC;SGB=-369.163;VDB=0.313337   GT:PL   0/0:0,255,133   0/1:40,0,31 0/0:0,255,134   0/0:0,255,82
ADD COMMENT
0
Entering edit mode

Thank you so much, I will use it now! My solution was to do

b=`grep -v '#'  SAMPLE.vcf | awk '{print $1":"($2 - 1)"-"($2+1)}'`; for i in $b; do samtools faidx genome.fa $i; done

(the potential problem of which being when two SNPs appear consecutively)

ADD REPLY

Login before adding your answer.

Traffic: 1852 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6