Question: SNPs based on cDNA sequence could be artifacts of RNA editing
I'm using dbSNP Build 151 and I would like to discard all SNPs derived from cDNA. I read on the website that there should be a flag indicating which SNPs were obtained from cDNA, but I could not identify it. Is the flag present in the VCF file, or should I download some other file?

Best regards, Salvatore

Dear Salvatore, I have edited my answer. The Molecule Type, not the weight, is actually the better indicator of these events.

University College London Cancer Institute
dbSNP / NCBI does not explicitly report potential RNA-editing artifacts. From the FAQ (Frequently Asked Questions):

I'm concerned that submitted SNPs based on cDNA sequence could be artifacts of RNA editing. Does dbSNP track potential RNA-edited artifacts?

We do not directly track SNPs that could be potential RNA-edited artifacts. However, we do flag SNPs derived from cDNA that DO NOT map to the genome contigs but DO map to RefSeq transcripts.



Instead, they appear to be identified by the 'Molecule Type' field in the dbSNP records, where they are marked as 'cDNA'. Looking at some of the published studies on this topic, namely:

All of the variants from these studies that I have checked do indeed have a Molecule Type of 'cDNA'. For example:


...look to the left, at Molecule Type.

Keep in mind that this is not absolute proof that these are RNA editing events. These are just variants that do not map to the genome reference but that do appear in cDNA.
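
If you end up with a list of rsIDs whose dbSNP record shows a Molecule Type of 'cDNA', removing them from the VCF is straightforward. Below is a minimal sketch in plain Python. It assumes you have already compiled that list yourself, one rsID per line; how you build the list is not covered here, and the function names are my own and not part of any dbSNP tool.

```python
import gzip
from typing import Iterable, Iterator, Set


def load_rsids(lines: Iterable[str]) -> Set[str]:
    """Collect rsIDs (one per line), skipping blank lines and '#' comments."""
    ids = set()
    for raw in lines:
        entry = raw.strip()
        if entry and not entry.startswith("#"):
            ids.add(entry)
    return ids


def drop_rsids(vcf_lines: Iterable[str], exclude: Set[str]) -> Iterator[str]:
    """Yield VCF lines, omitting records whose ID column lists an excluded rsID."""
    for line in vcf_lines:
        if line.startswith("#"):
            # Header and meta-information lines are passed through untouched.
            yield line
            continue
        fields = line.split("\t", 3)
        # Column 3 (index 2) is ID; it may hold several IDs separated by ';'.
        record_ids = fields[2].split(";") if len(fields) > 2 else []
        if not any(rsid in exclude for rsid in record_ids):
            yield line


def filter_vcf_file(vcf_path: str, rsid_path: str, out_path: str) -> None:
    """Apply drop_rsids to a gzipped VCF (all three paths are supplied by you)."""
    with open(rsid_path) as handle:
        exclude = load_rsids(handle)
    with gzip.open(vcf_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        dst.writelines(drop_rsids(src, exclude))
```

Calling `filter_vcf_file` with your dbSNP VCF, your rsID list, and an output path writes a copy of the VCF without the listed records. The same caveat applies as above: this only removes the variants you chose to list and says nothing about whether they are true RNA editing events.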


