I am working on some WGS annotation with VEP. I am outputting MutationAssessor and Polyphen2 values (among others) from the dbNSFP plugin, and I've noticed that it gives me more than one value for several alleles. As an example, here is a VEP output line:
chr1 1535637 . C T 534.27 PASS AC=1;AF=0.5;AN=2;BaseQRankSum=-1.089;ClippingRankSum=0.876;DP=27;ExcessHet=3.0103;FS=13.64;MQ=60;MQ0=0;MQRankSum=1.83;QD=19.79;ReadPosRankSum=0.451;SOR=0.61;VQSLOD=15.92;culprit=MQ;CSQ=T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000339113|protein_coding||||||||||rs187039783|952|1|cds_start_NF|SNV|HGNC|HGNC:25567||2|||ENSP00000339421||H0Y2W2|UPI000059CF52|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000378733|protein_coding|3/4||ENST00000378733.8:c.325G>A|ENSP00000368007.4:p.Gly109Ser|336|325|109|G/S|Ggc/Agc|rs187039783||-1||SNV|HGNC|HGNC:25186|YES|2|P1|CCDS44040.1|ENSP00000368007|Q5SV17||UPI0000418FB6|1|tolerated_low_confidence(0.11)|possibly_damaging(0.886)|hmmpanther:PTHR28666&Pfam_domain:PF15207||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||7|0.0013977635782747603||0.0|1|0.001440922190201729|2|0.001984126984126984|3|0.002982107355864811|1|0.0010224948875255625|24.4|0.001445086705202312|0.0021998742928975488|20|5.940e-04|1|5.453e-04|||14|7.533e-04|1|1.953e-03|||10|1.427e-03|2|2.555e-04|.&.|.&.|3.36||||||L&L|0.805&0.805|D&D|0.999951&0.999951|D&D|0.999&0.999|P&P|0.886&0.886|0.74867|0.443|||||||||||1.785837e-03||3.830671e-03|,T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000378755|protein_coding||||||||||rs187039783|952|1||SNV|HGNC|HGNC:25567|YES|2||CCDS31.1|ENSP00000368030|Q9NVI7||UPI000013D456|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|downst
ream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000378756|protein_coding||||||||||rs187039783|952|1||SNV|HGNC|HGNC:25567||1|P1|CCDS53259.1|ENSP00000368031|Q9NVI7||UPI000006CE90|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|missense_variant|MODERATE|TMEM240|ENSG00000205090|Transcript|ENST00000425828|protein_coding|4/5||ENST00000425828.1:c.325G>A|ENSP00000400311.1:p.Gly109Ser|368|325|109|G/S|Ggc/Agc|rs187039783||-1||SNV|HGNC|HGNC:25186||2|P1|CCDS44040.1|ENSP00000400311|Q5SV17||UPI0000418FB6|1|tolerated_low_confidence(0.11)|possibly_damaging(0.886)|Pfam_domain:PF15207&hmmpanther:PTHR28666||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||7|0.0013977635782747603||0.0|1|0.001440922190201729|2|0.001984126984126984|3|0.002982107355864811|1|0.0010224948875255625|24.4|0.001445086705202312|0.0021998742928975488|20|5.940e-04|1|5.453e-04|||14|7.533e-04|1|1.953e-03|||10|1.427e-03|2|2.555e-04|.&.|.&.|3.36||||||L&L|0.805&0.805|D&D|0.999951&0.999951|D&D|0.999&0.999|P&P|0.886&0.886|0.74867|0.443|||||||||||1.785837e-03||3.830671e-03|,T|downstream_gene_variant|MODIFIER|ATAD3A|ENSG00000197785|Transcript|ENST00000536055|protein_coding||||||||||rs187039783|950|1||SNV|HGNC|HGNC:25567||2||CCDS53260.1|ENSP00000439290|Q9NVI7||UPI000048B049|1|||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|downstream_gene_variant|MODIFIER|TMEM240|ENSG00000205090|Transcript|ENST00000624426|protein_coding||||||||||rs187039783|3419|-1||SNV|HGNC|HGNC:25186||3|||ENSP00000485135||A0A096LNN7|UPI0004F2365B|1|||||0.0014|0|0.0014|0.002|0.003|0.
001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||,T|regulatory_region_variant|MODIFIER|||RegulatoryFeature|ENSR00000000199|open_chromatin_region||||||||||rs187039783||||SNV||||||||||||||||0.0014|0|0.0014|0.002|0.003|0.001|0.001445|0.0022|0.0009889|0.0005315|0.0002926|0.001708|9.808e-05|0.0007297|0.001794|0.0005214|0.0003071|0.003|EUR||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| GT:AB:AD:DP:GQ:PL 0/1:0.3:8,19:27:99:568,0,197
As you can see, for the second CSQ entry MutationAssessor_pred=L&L and MutationAssessor_score=0.805&0.805. In the same way, Polyphen2_HDIV_pred=D&D, and the same holds for all the other MutationAssessor and Polyphen2 fields.
Does anybody know why?
Thank you very much in advance for any help!
Thank you so much @Ben_Ensembl for your answer. I don't follow which transcripts you're referring to: in the entry with multiple values for MutationAssessor, HumDiv, etc., shouldn't the transcript refer only to ENSP00000368007.4?
No problem, very happy to help. I'd need to see the options you included in your command-line query and the full VEP output for this variant to give a more detailed explanation, but this variant falls within the CDS of two different transcripts of the TMEM240 gene (TMEM240-201 and TMEM240-202): http://www.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000205090;r=1:1535207-1540453;t=ENST00000425828;tl=rdX3s6Fq5SiFCEcH-149700
I suspect that the two values for MutationAssessor_score and Polyphen2_HDIV_pred are predictions of the effect on the proteins encoded by these two transcripts.
Best wishes
Ben Ensembl Helpdesk
Thank you again, Ben. Here is the code:
Hi cocchi.e89,
The dbNSFP plugin only returns the data for a specified column in the dbNSFP file. For example, when you ask for MutationAssessor_score for a given variant, you get the value stored in the MutationAssessor_score column of the dbNSFP file.
The readme file describes which values are expected for each column. If you search for MutationAssessor_score in https://drive.google.com/file/d/1HPcIEE1USiwCPzRFFwZhTZagka1uyUV1/view
you get: (MutationAssessor functional impact combined score (MAori). The score ranges from -5.17 to 6.49 in dbNSFP. Multiple entries are separated by ";", corresponding to Uniprot_entry.)
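VEP replaces ";" with "&" because ";" is a special character in the VEP output (in VCF it separates the INFO fields). So L&L and 0.805&0.805 are the original ";"-separated dbNSFP entries, one per Uniprot_entry.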
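If you want to work with the individual values downstream, a minimal Python sketch along these lines might help. It assumes you already have one pipe-delimited CSQ entry and the list of field names taken from the "Format:" part of the CSQ header line in your VCF. The FIELDS list, the shortened example entry and the helper names below are hypothetical and only for illustration, they are not part of VEP or the dbNSFP plugin.

```python
from typing import Dict, List, Optional


def parse_csq_entry(entry: str, field_names: List[str]) -> Dict[str, str]:
    """Map one pipe-delimited CSQ entry onto its field names."""
    return dict(zip(field_names, entry.split("|")))


def split_multi(value: str) -> List[str]:
    """Split a dbNSFP value that VEP joined with '&' (originally ';')."""
    return value.split("&") if value else []


def to_float(value: str) -> Optional[float]:
    """Convert a score to float; dbNSFP uses '.' for a missing value."""
    return None if value in ("", ".") else float(value)


# Hypothetical, shortened field list. The real names and their order
# come from the CSQ "Format:" description in your own VCF header.
FIELDS = ["Allele", "Consequence", "Feature",
          "MutationAssessor_pred", "MutationAssessor_score"]

# Shortened example entry, built from values shown in the question.
entry = "T|missense_variant|ENST00000378733|L&L|0.805&0.805"

record = parse_csq_entry(entry, FIELDS)
preds = split_multi(record["MutationAssessor_pred"])
scores = [to_float(s) for s in split_multi(record["MutationAssessor_score"])]

for pred, score in zip(preds, scores):
    print(pred, score)
```

Each position in the split lists corresponds to one entry of the dbNSFP Uniprot_entry column, so requesting that column as well would let you tell the values apart.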
Best wishes
Ben Ensembl Helpdesk