Question: Functional annotation of SNPs and genes containing nonsynonymous SNPs
3.0 years ago by
reza220 wrote:

Hi everyone,

First question: I have a VCF file produced by samtools for a mammal whose genome is assembled only to scaffold level. I annotated it and now want to extract the non-synonymous (ns) SNPs and find the genes that contain them. How can I extract the sequences of the genes that have ns SNPs, so that I can BLAST them and find the gene names?

Second question: how can I extract the indels located in genic regions (in the annotated VCF file) and get their lengths to plot them?

Thanks in advance.

snp ns snps next-gen indel gene • 1.8k views
modified 3.0 years ago • written 3.0 years ago by reza220

Try VEP and follow its tutorial. Once the variants are annotated, you can filter the nonsynonymous ones with their full annotation by following the VEP filtering tutorial. Note that VEP can be configured to annotate only variants in coding regions; see the VEP options for this.

written 3.0 years ago by cpad011213k

Thanks for your answer, but the animal I am studying is not in Ensembl. I used SnpEff for the annotation.

written 3.0 years ago by reza220

Try SnpSift on the SnpEff output.

Example command (modified from the manual) to filter missense variants:

java -jar SnpSift.jar filter "ANN[*].EFFECT has 'missense_variant'" snpeff_annotated.vcf > filtered_output.vcf

For indel filtering:

java -jar SnpSift.jar filter "( exists INDEL )" snpeff_annotated.vcf > filtered.vcf
modified 3.0 years ago • written 3.0 years ago by cpad011213k

Thanks, that is helpful. I tried it with "( EFF[*].EFFECT = 'NON_SYNONYMOUS_CODING' )" and it worked.

written 3.0 years ago by reza220

Your suggested approach worked for extracting one effect, but when I tried it for several effects (command below), it did not work.

java -jar SnpSift.jar filter "( EFF[*].EFFECT = 'NON_SYNONYMOUS_CODING' )" & "( EFF[*].EFFECT = 'STOP_GAINED' )" & "( EFF[*].EFFECT = 'STOP_LOST )" snpeff_annotated.vcf > filtered_output.vcf

How can I extract several effects simultaneously?

written 2.9 years ago by reza220
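
A likely reason the multi-effect command fails: each condition is quoted separately, so the `&` between them is read by the shell (it sends the command to the background) instead of by SnpSift, and the closing quote after `STOP_LOST` is missing. As far as I read the SnpSift manual, the whole filter should be one quoted expression, with `|` as the logical OR. A minimal sketch, assuming the old-style `EFF` annotation used above; `SNPSIFT_JAR` and the helper name `filter_effects` are placeholders for illustration, not part of SnpSift:

```shell
# Path to the SnpSift jar; adjust to your installation (assumption).
SNPSIFT_JAR="SnpSift.jar"

# Build ONE expression string. Inside it, | is the SnpSift logical OR,
# and [*] means "any of the annotations attached to this variant".
EXPR="(EFF[*].EFFECT = 'NON_SYNONYMOUS_CODING')"
EXPR="$EXPR | (EFF[*].EFFECT = 'STOP_GAINED')"
EXPR="$EXPR | (EFF[*].EFFECT = 'STOP_LOST')"

# filter_effects is a hypothetical helper name: it passes the whole
# expression to SnpSift as a single double-quoted argument.
filter_effects() {
    java -jar "$SNPSIFT_JAR" filter "$EXPR" "$1"
}
```

It would be called as `filter_effects snpeff_annotated.vcf > filtered_output.vcf`. Keeping the expression in one quoted string is the important part; the variable and function are only there to make the long expression readable.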

For your second question, bedtools intersect can extract the SNPs / indels that overlap your annotation, if you have it in BED or GFF format. You can easily get gene lengths from the BED / GFF as well.

written 3.0 years ago by h.mon29k
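
For the indel lengths themselves, one option is to compute them from the REF and ALT columns of the VCF. The following is a rough sketch, not a tested pipeline: `indel_lengths` is a hypothetical helper name, it expects an uncompressed VCF, and it reports the length as ALT minus REF (positive for insertions, negative for deletions).

```shell
# indel_lengths: print CHROM, POS and signed length (ALT minus REF) for
# every indel allele in the plain-text VCF given as the first argument.
# Hypothetical helper for illustration.
indel_lengths() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        {
            n = split($5, alt, ",")          # ALT may list several alleles
            for (i = 1; i <= n; i++) {
                # skip symbolic or missing alleles such as <DEL>, * or .
                if (alt[i] ~ /[^ACGTNacgtn]/) continue
                d = length(alt[i]) - length($4)
                if (d != 0) print $1 "\t" $2 "\t" d
            }
        }
    ' "$1"
}
```

For example, after restricting the variants to genic regions with `bedtools intersect -a snpeff_annotated.vcf -b genes.gff -u -header > genic.vcf` (where `genes.gff` is a placeholder for your own annotation file), `indel_lengths genic.vcf > indel_lengths.tsv` gives a three-column table whose last column can be plotted as a histogram.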