I do not have a biological background, but I am working on an automatic extraction tool for different mutations. I have noticed that in some articles the authors append a letter to the end of a SNP identifier (rs#), e.g. rs688T. What does the letter represent, and is it better to extract only the identifier (rs688)?
I see it in a 2007 paper. Maybe the position had multiple ALT alleles and they used that notation to show which one they were talking about? Doesn't make sense to me though.
I see 5 variants merged into rs688 (your example), and all of their ALT alleles are shown as T. Could you provide another example?
These papers seem to be using the rs# as shorthand for the chromosomal position. Unfortunately, most journals do not verify nomenclature accuracy before accepting papers, so it is left to database curators to judge whether the notation is correct. This is non-standard practice, and you can treat the trailing characters as not being part of the identifier. Sometimes they refer to the REF allele (rs688C), sometimes to the ALT allele (rs688T), and sometimes to the genotype (rs4803457CC).
Ideally, they should be referred to as chr:posBase
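For the extraction tool, one option is to capture the bare rs# and keep any trailing letters as a separate, uninterpreted field. Below is a minimal sketch of that idea in Python. The names RS_PATTERN and extract_rs_mentions are hypothetical, and the pattern assumes the suffix is one or two uppercase bases directly attached to the identifier, as in the examples in this thread. Other spellings would need a different pattern.

```python
import re

# Assumption: an rs identifier is "rs" followed by digits, optionally followed
# by one or two uppercase bases (A/C/G/T) with no separator, e.g. rs688T.
RS_PATTERN = re.compile(r"\b(rs\d+)([ACGT]{1,2})?\b")

def extract_rs_mentions(text):
    """Return a list of (rs_id, suffix) tuples found in text.

    suffix is None when no base letters are attached. The suffix is not
    interpreted, since it may denote the REF allele, the ALT allele, or
    a genotype depending on the paper.
    """
    mentions = []
    for match in RS_PATTERN.finditer(text):
        mentions.append((match.group(1), match.group(2)))
    return mentions
```

For example, extract_rs_mentions("rs688T and rs4803457CC") would yield the identifiers rs688 and rs4803457 with the suffixes "T" and "CC" kept alongside, so the identifier can be looked up on its own while the authors' annotation is not lost.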
Well, I found several occurrences in PubMed articles. Here are a few:
rs6152A: https://www.ncbi.nlm.nih.gov/pubmed/20450840
rs4803457CC: https://www.ncbi.nlm.nih.gov/pubmed/27796807
rs6152GG: https://www.ncbi.nlm.nih.gov/pubmed/20450840
rs688C: https://www.ncbi.nlm.nih.gov/pubmed/25188588
rs688T: https://www.ncbi.nlm.nih.gov/pubmed/25188588
I am lost on the biological meaning of the C/G/T/A characters at the end of the identifier and on their importance.