Question: How to collect SNP dataset from databases for SNP analysis?
arr234 wrote (20 days ago):

This study (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5449402/) reports that the human APOE gene has 183 validated SNPs, of which 31 are missense, 21 are synonymous, 2 are nonsense, 98 are intronic, 7 are 5′ UTR, 6 are 3′ UTR, 7 are downstream, 8 are upstream, 1 is a splice donor and 2 are splice acceptor variants. These data were collected using dbSNP. I would like to know how to collect such a validated SNP dataset from dbSNP.


ExSNP database (http://www.exsnp.org/DZeQTL)

(comment by maryamtavasoli71, 20 days ago)
Kevin Blighe (Republic of Ireland) wrote (11 days ago):

The link provided by maryamtavasoli71 relates to eQTL studies, which is not what you want.

If you are not comfortable using the command line or working with the dbSNP data locally, you can use the Ensembl Genome Browser to look up all variants in a particular gene, such as APOE.

In the resulting variant table, click on the Excel® sheet icon (at right) to download the data as CSV.

The data contain scores from in silico predictors, such as SIFT, PolyPhen, MutationAssessor and CADD. I did my own quick filtering and identified roughly 200 'damaging' variants in the gene.
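For anyone who wants to repeat this kind of filtering on the downloaded CSV, here is a minimal Python sketch. It is not the filtering used for the ~200 figure above, which was not described in detail. The column names ("Variant ID", "Conseq. Type", "SIFT", "PolyPhen") and the cutoffs (SIFT at or below 0.05, PolyPhen at or above 0.908) are assumptions based on commonly used conventions, so check them against the header of your own export. The rows in the example are made up and are not real APOE data.

```python
import csv
import io


def to_float(text):
    """Return text as a float, or None for blanks and placeholders such as '-'."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def flag_damaging(rows, sift_col="SIFT", polyphen_col="PolyPhen",
                  sift_max=0.05, polyphen_min=0.908):
    """Yield rows that at least one predictor scores as damaging.

    Column names and cutoffs are assumptions; adjust them to your export.
    """
    for row in rows:
        sift = to_float(row.get(sift_col))
        polyphen = to_float(row.get(polyphen_col))
        sift_hit = sift is not None and sift <= sift_max
        polyphen_hit = polyphen is not None and polyphen >= polyphen_min
        if sift_hit or polyphen_hit:
            yield row


# Made-up rows standing in for the downloaded CSV (not real APOE data).
example_csv = """Variant ID,Conseq. Type,SIFT,PolyPhen
rsA,missense variant,0.01,0.999
rsB,missense variant,0.40,0.120
rsC,intron variant,-,-
rsD,missense variant,0.20,0.950
"""

rows = list(csv.DictReader(io.StringIO(example_csv)))
damaging = list(flag_damaging(rows))
print([r["Variant ID"] for r in damaging])  # prints ['rsA', 'rsD']
```

To run it on a real download, replace the io.StringIO(...) part with an open file handle, for example open("apoe_variants.csv", newline=""), where the file name is only a placeholder.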

Kevin

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote (11 days ago):

Using the UCSC public MySQL server:

$ mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -P 3306 -D hg38 -e 'select func,valid,count(*) from snp142 where chrom="chr19" and chromStart>=44905749 and chromEnd<=44909395 group by func,valid'
+----------------------------+--------------------------------------------------------+----------+
| func                       | valid                                                  | count(*) |
+----------------------------+--------------------------------------------------------+----------+
| coding-synon               | unknown                                                |       13 |
| coding-synon               | by-frequency                                           |        1 |
| coding-synon               | by-1000genomes                                         |        3 |
| coding-synon               | by-cluster,by-1000genomes                              |        2 |
| coding-synon               | by-frequency,by-1000genomes                            |        2 |
| intron                     | unknown                                                |       25 |
| intron                     | by-cluster                                             |        4 |
| intron                     | by-1000genomes                                         |       33 |
| intron                     | by-cluster,by-1000genomes                              |        3 |
| intron                     | by-frequency,by-1000genomes                            |       24 |
| intron                     | by-cluster,by-frequency,by-1000genomes                 |       11 |
| near-gene-5                | by-cluster,by-frequency,by-1000genomes                 |        1 |
| nonsense                   | unknown                                                |        2 |
| missense                   | unknown                                                |       28 |
| missense                   | by-cluster                                             |        8 |
| missense                   | by-1000genomes                                         |       11 |
| missense                   | by-frequency,by-1000genomes                            |        7 |
| missense                   | by-cluster,by-frequency,by-1000genomes                 |        1 |
| missense                   | by-cluster,by-frequency,by-2hit-2allele,by-1000genomes |        2 |
| missense                   | by-frequency,by-hapmap,by-1000genomes                  |        1 |
| intron,missense            | by-1000genomes                                         |        1 |
| intron,missense            | by-frequency,by-1000genomes                            |        1 |
| intron,missense            | by-cluster,by-frequency,by-hapmap,by-1000genomes       |        1 |
| frameshift                 | unknown                                                |        1 |
| cds-indel                  | unknown                                                |        2 |
| untranslated-3             | unknown                                                |        2 |
| untranslated-3             | by-1000genomes                                         |        1 |
| untranslated-3             | by-frequency,by-1000genomes                            |        3 |
| untranslated-5             | by-1000genomes                                         |        3 |
| intron,untranslated-5      | by-1000genomes                                         |        1 |
| intron,untranslated-5      | by-frequency,by-1000genomes                            |        1 |
| near-gene-5,untranslated-5 | by-1000genomes                                         |        1 |
| splice-3                   | unknown                                                |        1 |
| splice-3                   | by-cluster                                             |        1 |
| splice-5                   | by-cluster                                             |        1 |
+----------------------------+--------------------------------------------------------+----------+
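To compare this output with the per-category counts quoted in the question, the rows can be summed per functional class, with and without the rows whose validation status is "unknown". The Python sketch below does that on a few rows copied from the table above; paste the full output to get totals for the whole region. The function names are illustrative only, and treating every status other than "unknown" as "validated" is an assumption about how the paper defined the term. Combined classes such as "intron,missense" are kept as their own category rather than being split.

```python
from collections import Counter


def parse_mysql_table(text):
    """Turn mysql's boxed text output into a list of (func, valid, count) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ border lines and blank lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) != 3 or not cells[2].isdigit():
            continue  # skip the header row
        records.append((cells[0], cells[1], int(cells[2])))
    return records


def tally(records):
    """Count SNPs per functional class: all rows, and rows with a validation status."""
    total, validated = Counter(), Counter()
    for func, valid, n in records:
        total[func] += n
        if valid != "unknown":  # assumption: any other status counts as validated
            validated[func] += n
    return total, validated


# A few rows copied from the output above; paste the full table for real use.
excerpt = """
| func           | valid                       | count(*) |
| coding-synon   | unknown                     |       13 |
| coding-synon   | by-frequency                |        1 |
| nonsense       | unknown                     |        2 |
| splice-3       | by-cluster                  |        1 |
"""

total, validated = tally(parse_mysql_table(excerpt))
for func in sorted(total):
    print(func, total[func], validated[func])
```

Adding `and valid != "unknown"` to the WHERE clause of the query would apply the same restriction on the server side instead.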