7.4 years ago
Simo ▴ 50
Starting from microarray data, I retrieved the same positions in other populations from the 1000 Genomes data. In a few cases I found that the same locus has two or more different rs IDs.
Sometimes they are simply separated by a semicolon:
19 123 rs123; rs432
Other times they are reported as two different sites with different rs IDs:
19 123 rs879
19 123 rs123; rs432
How can I deal with them? And what do multiple rs IDs for one locus mean?
This might be because rs IDs are sometimes merged, but your question is not clear. Some real examples might help.
This is an example of what I got:
From the microarray data I have some positions with no rs ID (marked as ---), so I retrieved them from the 1000 Genomes data. Given the situation I've shown you, how can I know which rs ID should be kept for that position, and which should be removed?
Variation doesn't work that way. A variant isn't defined by its position alone, but also by the allele change. If several rs IDs are positioned at the same location, it means there are several variants occurring there, so you'll have to see which variant is being tested on the microarray (chromosome, position, reference and alternative allele) and then get the corresponding rs ID.
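To make the matching concrete, here is a minimal Python sketch of that lookup. It assumes you can get the reference and alternative allele both for the array probe and for each 1000 Genomes record. The alleles below are invented for illustration (the example in the question only gives chromosome, position and rs IDs), and the names `Variant`, `split_rsids` and `matching_rsids` are hypothetical helpers, not part of any library.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """A variant as tested on the array: position plus allele change."""
    chrom: str
    pos: int
    ref: str
    alt: str


def split_rsids(field):
    """Split an ID field such as 'rs123; rs432' into individual rs IDs."""
    return [rsid.strip() for rsid in field.split(";") if rsid.strip()]


def matching_rsids(target, records):
    """Return the rs IDs of records that match the target on
    chromosome, position, reference allele and alternative allele.

    records: iterable of (chrom, pos, rsid_field, ref, alt) tuples.
    """
    hits = []
    for chrom, pos, rsid_field, ref, alt in records:
        if (chrom, pos, ref, alt) == tuple(target):
            hits.extend(split_rsids(rsid_field))
    return hits


# Two records at the same position; the alleles are made up for this example.
records = [
    ("19", 123, "rs879", "A", "G"),
    ("19", 123, "rs123; rs432", "A", "T"),
]

# Hypothetical array probe testing A>T at 19:123.
probe = Variant("19", 123, "A", "T")
print(matching_rsids(probe, records))
```

If more than one rs ID is still left after matching on alleles (as with the semicolon-separated pair here), those IDs probably refer to the same variant, for example because they were merged as suggested in the comment above, and you would need to check in dbSNP which one is current.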