Question: Clinically-Associated SNPs
5.6 years ago by
Vova Naumov210
Russia, Moscow
Vova Naumov210 wrote:

Hi! We are trying to work out which Illumina chip is better for testing medical conditions. I used this MySQL query to get a list of clinically associated SNPs:

mysql --user=genome -A  -D hg19 -e '
  select s.name
  from snp132 s
  where s.bitfields LIKE "clin%"'

So now I have a list of about 22,000 rs IDs, and I would like to know which association the database means for each of them. There was a question on Biostar that could help me, but since 16 July the OMIM table is no longer in the genome database. So the question is: how can I get a list of diseases/conditions from this SNP list?

database disease snp • 4.2k views
ADD COMMENTlink modified 7 months ago by Biostar ♦♦ 10 • written 5.6 years ago by Vova Naumov210

hg18 does not have table snp132; I think you must have used hg19.

ADD REPLYlink written 5.6 years ago by Neilfws46k

Sure, sorry, I'll change it.

ADD REPLYlink written 5.6 years ago by Vova Naumov210
5.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum91k wrote:

1) Register for access to the OMIM FTP site and download mim2gene:

$ curl -s  "" | head
# Mim Number    Type    Gene IDs    Approved Gene Symbols
100050    phenotype    -    -
100070    phenotype    100329167    -
100100    phenotype    -    -
100200    phenotype    -    -
100300    phenotype    100188340    -
100500    moved/removed    -    -
100600    phenotype    -    -
100640    gene    216    ALDH1A1
100650    gene/phenotype    217    ALDH2

get a list of the gene symbols:

$ curl -s  "" |\
   grep -v '^#' | cut -f 4 | grep -v '^-$' |\
   sort -u > list1.txt

2) Get your list of SNPs associated with each gene symbol. Something like:

mysql -N --user=genome -A  -D hg19 -e 'select distinct G.geneSymbol, S.name
from snp132 as S,
kgXref as G,
knownGene as K where
    G.kgID=K.name and
    S.chrom=K.chrom and
    S.chromStart>=K.txStart and
    S.chromEnd<=K.txEnd
    /* AND something to restrict the result to YOUR list of SNPs or gene */
' | sort -t "$(printf '\t')" -k1,1 > list2.txt

3) Use Unix join to join the two lists:

join -1 1 -2 1 list1.txt list2.txt

you should get a list with two columns: the OMIM gene and your SNP.
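
For anyone who wants to try step 3 before touching the real files, here is a minimal sketch on toy data. The gene symbols and rs IDs are made up, and the helper name join_omim_snps is only for illustration; the sketch assumes list2.txt is tab-separated with the gene symbol in the first column, and that both files are sorted in the same locale, which join requires.

```shell
# Toy data standing in for the real files; all symbols and rs IDs are made up.
TAB=$(printf '\t')
workdir=$(mktemp -d)

# list1.txt: one OMIM gene symbol per line, sorted in the C locale.
printf '%s\n' GENE_B GENE_A GENE_C | LC_ALL=C sort -u > "$workdir/list1.txt"

# list2.txt: gene symbol <TAB> rs ID, sorted on the first field in the same locale.
printf '%s\t%s\n' GENE_C rs0003 GENE_A rs0001 GENE_X rs0009 GENE_A rs0002 |
  LC_ALL=C sort -t "$TAB" -k1,1 > "$workdir/list2.txt"

# Join on field 1 of both files; only symbols present in both files survive.
join_omim_snps() {
  LC_ALL=C join -t "$TAB" -1 1 -2 1 "$1" "$2"
}

join_omim_snps "$workdir/list1.txt" "$workdir/list2.txt"
```

Symbols that appear in only one of the two files (GENE_B and GENE_X here) are dropped, so the result keeps only SNPs that fall in genes listed by OMIM.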

ADD COMMENTlink modified 5.6 years ago • written 5.6 years ago by Pierre Lindenbaum91k

Thank you very much! I always knew that these Unix commands are very useful. I also tried to use the /OMIM/genemap file to get rs numbers from the 12th column, but there were only 209 rs numbers in common between the clinically associated list and this file.
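
An overlap count like this can be obtained with comm. The following is only a sketch: it assumes two files with one rs ID per line, and the function name and toy IDs are hypothetical.

```shell
# Count rs IDs shared by two files (one ID per line).
# comm needs both inputs sorted in the same locale, so sort them first.
count_shared_rs() {
  tmp_a=$(mktemp)
  tmp_b=$(mktemp)
  LC_ALL=C sort -u "$1" > "$tmp_a"
  LC_ALL=C sort -u "$2" > "$tmp_b"
  # -12 suppresses lines unique to either file, leaving only the shared ones.
  LC_ALL=C comm -12 "$tmp_a" "$tmp_b" | wc -l | tr -d ' '
  rm -f "$tmp_a" "$tmp_b"
}

# Toy input; the rs IDs are made up.
demo_dir=$(mktemp -d)
printf '%s\n' rs0003 rs0001 rs0002 > "$demo_dir/clinical.txt"
printf '%s\n' rs0002 rs0004 rs0003 rs0003 > "$demo_dir/genemap.txt"
count_shared_rs "$demo_dir/clinical.txt" "$demo_dir/genemap.txt"
```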

ADD REPLYlink written 5.6 years ago by Vova Naumov210
5.6 years ago by
Boston, MA USA
Larry_Parnell15k wrote:

dbSNP includes clinically significant variations, and you can now filter search results by clinical significance, allele origin, minor allele frequency, and suspected false SNPs.

From : Clinical significance: The significance of the indicated allele.

The supported values are:


In dbSNP build 132, there are 13105 such rs entries. While no good definition of "clinical significance" is given, the above examples of what NCBI classifies as such can help to form a picture of what is meant by this term.

Edit added 13 Oct 2011: I have just learned from following the International Congress of Human Genetics meeting on Twitter that Rong Chen is painstakingly manually curating 5,478 disease-SNP association papers and adding the info to a database of 67,678 SNPs associated with 1,563 diseases.

ADD COMMENTlink modified 5.5 years ago • written 5.6 years ago by Larry_Parnell15k
5.6 years ago by
Manhattan, NY
Khader Shameer17k wrote:

What do you mean by clinical association? What are your criteria?

Mendelian disease, complex disease, pharmacogenomic variants, or a combination of two or more?*

If you are interested in a combined dataset, you need to do some raw-data munging. OMIM is ideal for Mendelian variants; for complex disease variants you should check GWAS resources; for pharmacogenomic variants, check PharmGKB. To identify clinically associated variants from GWAS, see my discussion 1, 2 and 3. For pharmacogenomic variants, see the list of Annotated SNPs by Disease in PharmGKB here. A combination of the three resources will give you complete coverage of SNPs for your study.

*I recently integrated such a data-set for a manuscript using the approach discussed above.
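
As a rough illustration of the munging involved, the sketch below merges per-resource rs lists into one table that records which resources mention each SNP. Everything here is an assumption for illustration: the input files (one rs ID per line, already extracted from each resource), the labels, the function name merge_sources, and the toy IDs. Extracting the rs IDs from each resource is the hard part and is not shown.

```shell
# Tag each rs ID with its source, then collapse to one line per rs ID.
# Arguments come in pairs: LABEL FILE LABEL FILE ...
merge_sources() {
  while [ "$#" -ge 2 ]; do
    # Emit "rsID <TAB> label" for every non-empty line of the file.
    awk -v src="$1" 'NF { print $1 "\t" src }' "$2"
    shift 2
  done |
  LC_ALL=C sort -u |
  awk -F '\t' '
    $1 != prev { if (NR > 1) print prev "\t" srcs; prev = $1; srcs = $2; next }
    { srcs = srcs "," $2 }
    END { if (NR > 0) print prev "\t" srcs }'
}

# Toy input; the rs IDs are made up.
merge_dir=$(mktemp -d)
printf '%s\n' rs0001 rs0002 > "$merge_dir/omim.txt"
printf '%s\n' rs0002 rs0003 > "$merge_dir/gwas.txt"
printf '%s\n' rs0002 > "$merge_dir/pharmgkb.txt"
merge_sources OMIM "$merge_dir/omim.txt" GWAS "$merge_dir/gwas.txt" PharmGKB "$merge_dir/pharmgkb.txt"
```

Keeping the source label next to each rs ID makes it easy to filter afterwards, for example to SNPs supported by more than one resource.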

ADD COMMENTlink written 5.6 years ago by Khader Shameer17k

I'm also interested in what snp132 means by "clinically-associated".

ADD REPLYlink written 5.6 years ago by Vova Naumov210

@Vova: Please refer to Larry's answer!

ADD REPLYlink written 5.6 years ago by Khader Shameer17k
