Variant annotation: Multiple values in some fields like SIFT_Pred
5.9 years ago
mavershang ▴ 50

Hi there,

When annotating variants with SnpEff + dbNSFP, many variants are assigned more than one value in some fields. For example, in the record below, dbNSFP_SIFT_pred has five "T" values (but there is only one SIFT_Score), and dbNSFP_Polyphen2_HDIV_pred has two "B" values.

If I understand correctly, each variant should have only one SIFT_score, SIFT_Pred, etc. Am I missing something?

7    140301731    .    G    T    2918.77    PASS    AC=2;AF=1.00;AN=2;DP=95;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;MQ0=0;POSITIVE_TRAIN_SITE;QD=30.72;VQSLOD=6.96;culprit=FS;ANN=T|missense_variant|MODERATE|DENND2A|ENSG00000146966|transcript|ENST00000275884|protein_coding|2/19|c.467C>A|p.Pro156His|885/3735|467/3030|156/1009||;dbNSFP_rs_dbSNP141=rs269243;dbNSFP_CADD_phred=11.67;dbNSFP_ESP6500_AA_AF=0.989765;dbNSFP_ExAC_AF=9.549e-01;dbNSFP_UniSNP_ids=rs269243,rs1681775,rs52835101,rs58245328,rs269243;dbNSFP_SIFT_pred=T,T,T,T,T;dbNSFP_Polyphen2_HDIV_rankscore=0.02656;dbNSFP_Polyphen2_HDIV_pred=B,B;dbNSFP_ESP6500_EA_AF=0.936014;dbNSFP_SIFT_converted_rankscore=0.12818;dbNSFP_Polyphen2_HVAR_rankscore=0.01281;dbNSFP_1000Gp1_AF=0.9647435897435898;dbNSFP_Polyphen2_HVAR_pred=B,B    GT:AD:DP:GQ:PL    1/1:0,95:95:99:2947,286,0
5.9 years ago
Pablo ★ 1.9k

dbNSFP provides multiple values for those scores and SnpEff simply annotates them all.

The reason dbNSFP provides different values is related to how each score is calculated. For instance, the number of PolyPhen entries is usually the same as the number of Uniprot IDs; similarly, the number of SIFT entries relates to the number of "Ensembl_transcriptID" entries.
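
If it helps to see which fields carry more than one value, here is a minimal Python sketch that splits an INFO string and lists the multi-valued dbNSFP fields. The helper names `parse_info` and `multi_valued` are made up for illustration (they are not part of SnpEff or dbNSFP), and the INFO string is a shortened copy of the one in the question. It only counts the entries; it does not try to match each value to a particular transcript or Uniprot ID.

```python
def parse_info(info):
    """Parse a VCF INFO string into a dict; flag entries (no '=') map to True."""
    fields = {}
    for entry in info.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def multi_valued(fields, prefix="dbNSFP_"):
    """Return the fields with the given prefix that hold comma-separated values."""
    result = {}
    for key, value in fields.items():
        if key.startswith(prefix) and isinstance(value, str) and "," in value:
            result[key] = value.split(",")
    return result


# Shortened INFO column taken from the record in the question.
info = (
    "dbNSFP_SIFT_pred=T,T,T,T,T;dbNSFP_Polyphen2_HDIV_pred=B,B;"
    "dbNSFP_CADD_phred=11.67;POSITIVE_TRAIN_SITE"
)

multi = multi_valued(parse_info(info))
for key, values in multi.items():
    # Print the field name, how many values it has, and the distinct values.
    print(key, len(values), sorted(set(values)))
```

In this record all values within a field agree (all "T", all "B"), so the distinct set has a single element. When they disagree, how to summarize them is a choice you have to make for your own analysis.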