VEP output is only protein_coding
3.2 years ago
storm1907 ▴ 30

Hello, I am supposed to extract both protein-coding and synonymous variants from VCFs that were given to me. The only variant annotation I find in them is "protein_coding"; no strings such as "synonymous" are present. Is this an error with VEP?

Thank you!

VEP • 1.0k views
3.2 years ago

Hi, protein_coding is not a consequence; it is the biotype of the transcript that the variant overlaps, so it only tells you where the variant lies with respect to a given transcript isoform.

You may want to review the definition of 'synonymous':

  • synonymous base substitution (synonymous variant / synonymous mutation): a base change in a protein-coding region that does not alter the resulting amino acid sequence
  • non-synonymous base substitution (non-synonymous variant / non-synonymous mutation): a base change in a protein-coding region that does alter the resulting amino acid sequence
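
To make the two definitions concrete, here is a minimal Python sketch that translates a reference and an alternate codon with the standard genetic code and compares the amino acids. The names `CODON_TABLE` and `classify_substitution` are made up for this illustration and are not part of VEP; the sketch only handles single, complete codons.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Label a codon change by whether the encoded amino acid changes."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


# GAA and GAG both encode glutamate (E); GTA encodes valine (V).
print(classify_substitution("GAA", "GAG"))
print(classify_substitution("GAA", "GTA"))
```

Either way, the base change sits in a protein-coding transcript, which is why the biotype alone cannot distinguish the two cases.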

If you wish, please show the command that you used to annotate the variants, and also show a sample of the output that was produced.

Kevin

3.2 years ago
storm1907 ▴ 30
chr2    60553286    .   G   GGGC    46.6    PASS    CSQ=||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000335712|protein_coding|1/3||372-373||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000356842|protein_coding|1/5||285-286||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000358510|protein_coding|1/4||123-124||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000359629|protein_coding|1/5||348-349||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000409351|protein_coding|1/3||214-215||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000642384|protein_coding|1/4||368-369||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000642439|protein_coding|1/4||272-273||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000643004|protein_coding|1/3||175-176||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000643716|protein_coding|1/2||346-347||||,||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000646249|protein_coding|2/5||697-698|||| GT:GQ:DP:AD:VAF:PL  0/1:29:103:23,75:0.728155:46,0,29
chr2    60768981    .   A   T   67.5    PASS    CSQ=|FAIL|0.00|0.00|0.00|0.00|3|13|-26|15|||MODIFIER|PAPOLG|ENSG00000115421|ENST00000238714|protein_coding||5/21|||||   GT:GQ:DP:AD:VAF:PL  0/1:67:13:6,7:0.538462:65,0,99
chr2    60780647    .   A   G   59.5    PASS    CSQ=|FAIL|0.00|0.00|0.00|0.00|1|37|0|-25|||MODIFIER|PAPOLG|ENSG00000115421|ENST00000238714|protein_coding||9/21|||||    GT:GQ:DP:AD:VAF:PL  1/1:58:32:0,32:1:59,63,0
chr2    60780851    .   T   C   60.5    PASS    CSQ=|FAIL|0.00|0.00|0.00|0.00|8|47|-8|-3|||MODIFIER|PAPOLG|ENSG00000115421|ENST00000238714|protein_coding||10/21|||||   GT:GQ:DP:AD:VAF:PL  

and command line:

command_line
--plugin Mastermind,/opt/vep/.vep/source_1.gz --plugin SpliceAI,snv=/opt/vep/.vep/source_2.gz,indel=/opt/vep/.vep/source_3.gz,cutoff=0.4 --verbose --no_stats --force --allow_non_variant --gencode_basic --offline --dont_skip --distance 100 --vcf --compress_output gzip --fork 72 --fields 'Allele,SpliceAI_cutoff,SpliceAI_pred_DS_AG,SpliceAI_pred_DS_AL,SpliceAI_pred_DS_DG,SpliceAI_pred_DS_DL,SpliceAI_pred_DP_AG,SpliceAI_pred_DP_AL,SpliceAI_pred_DP_DG,SpliceAI_pred_DP_DL,Mastermind_MMID3,Mastermind_counts,IMPACT,SYMBOL,Gene,Feature,BIOTYPE,EXON,INTRON,cDNA_position,Protein_position,Amino_acids,Codons,STRAND'

When I analyze this kind of file in Illumina Variant Interpreter, that cloud service also reports synonymous variants. But I cannot find anything relating to synonymous variants in this VCF. I also don't understand why the CSQ field is repeated for some variants. I need to extract some columns, but their count is not the same from variant to variant.
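
A note on reading the output above: each comma-separated entry inside CSQ describes one transcript, and the pipe-separated values within an entry follow the order given to `--fields`. That list has no `Consequence` field, which would explain why terms such as synonymous_variant never appear, and a variant overlapping ten transcripts carries ten entries. The sketch below splits CSQ on that basis. `parse_csq` and `FIELDS` are hypothetical names for this illustration, and the sketch assumes the field order matches the command line shown.

```python
# Field order copied from the --fields option in the command line above.
FIELDS = (
    "Allele,SpliceAI_cutoff,SpliceAI_pred_DS_AG,SpliceAI_pred_DS_AL,"
    "SpliceAI_pred_DS_DG,SpliceAI_pred_DS_DL,SpliceAI_pred_DP_AG,"
    "SpliceAI_pred_DP_AL,SpliceAI_pred_DP_DG,SpliceAI_pred_DP_DL,"
    "Mastermind_MMID3,Mastermind_counts,IMPACT,SYMBOL,Gene,Feature,"
    "BIOTYPE,EXON,INTRON,cDNA_position,Protein_position,Amino_acids,"
    "Codons,STRAND"
).split(",")


def parse_csq(info, fields=FIELDS):
    """Return one dict per CSQ entry (one entry per transcript)."""
    csq = None
    for item in info.split(";"):
        if item.startswith("CSQ="):
            csq = item[len("CSQ="):]
            break
    if csq is None:
        return []
    records = []
    for entry in csq.split(","):
        values = entry.split("|")
        if len(values) != len(fields):
            raise ValueError(
                f"expected {len(fields)} values, found {len(values)}"
            )
        records.append(dict(zip(fields, values)))
    return records


# INFO value shortened from the first variant above
# (two of its ten transcript entries).
info = (
    "CSQ=||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000335712|"
    "protein_coding|1/3||372-373||||,"
    "||||||||||||MODIFIER|BCL11A|ENSG00000119866|ENST00000356842|"
    "protein_coding|1/5||285-286||||"
)
records = parse_csq(info)
for rec in records:
    print(rec["Feature"], rec["BIOTYPE"], rec["EXON"], rec["cDNA_position"])

# The consequence terms live in a field that was not requested.
print("Consequence" in FIELDS)
```

If this reading is right, re-running VEP with `Consequence` added to the `--fields` list should bring back terms such as synonymous_variant, and extracting columns per transcript entry rather than per VCF line should make the counts consistent.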
