Question: how to find SNP positions (for non-bioinformaticians)
CrazyB (United States) wrote, 4.7 years ago:

First off, I apologize for reposting an "old" question. I know similar questions have been asked before, but I am hoping to find new solutions.

I am trying to find the genomic positions for a list of SNPs, given their rs IDs, and I need a "new" solution.

What I have tried so far:

(a) Sending a batch query to dbSNP at NCBI. This worked well in the past, but today, ~10 hr after sending the batch query, no result has come back yet (is the server down?).

(b) Downloading all dbSNP positions from BioMart, hoping to do some "intersection" to find the positions for my specific rs IDs. The download terminated prematurely (the first attempt ran for over an hour).

(c) Downloading cruzdb, which was suggested as a solution in one of the earlier posts. I read the documentation and still could not run it, my apologies! (Does running cruzdb require an understanding of Python, which I currently don't have?) I have to say, though, that in contrast to the cruzdb docs, I had better luck with vcftools and plink thanks to their friendlier documentation.

Are there any other solutions that allow non-bioinformaticians to do this task (i.e. find the positions for a list of SNPs)?

I certainly hope to get some useful responses, but it's understandable if the admin chooses to close this thread (possibly as a duplicate question). Thank you.


Tags: snp, dbsnp, biomart

How many rs IDs do you have? You can use the UCSC Table Browser: give it a list of rs IDs (< 1000) and select whatever information you need in the output file.

Reply written 4.7 years ago by Ashutosh Pandey

Thanks. I will try the UCSC Table Browser and see how it goes. I have only ~1000 rs IDs, so I wasn't sure why the dbSNP batch query failed me yesterday.

Reply written 4.7 years ago by CrazyB

Nicola Casiraghi (Germany, Heidelberg, DKFZ EMBL) wrote, 4.7 years ago:

Hi, as suggested by Ashutosh Pandey, you can use the UCSC Table Browser:

- Select the genome and assembly of interest.

- From the "group:" menu select "Variation", and from the "track:" menu select "All SNPs(142)".

- Paste or upload your rs IDs using the buttons at "identifiers (names/accessions):".

- To export your results, select "selected fields from primary and related tables" from "output format:", then click "get output".

- In the next step, select the fields of interest (e.g. input rs ID, chromosome, genomic position) to include in the final output table, then click "get output" again to retrieve it.

Emily_Ensembl wrote, 4.7 years ago:

Don't download all the variants from BioMart whatever you do! There are >114M variants in human and BioMart cannot do that – that's why it's failing. Use the Variation database and filter by Variation name, then input your list of IDs.

Alex Reynolds (Seattle, WA USA) wrote, 3.9 years ago:

You can use the mysql client to download a BED file containing the SNP positions and rs* IDs from UCSC's public MySQL server:

$ mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -N -D hg19 -e 'SELECT chrom, chromStart, chromEnd, name FROM snp144Common' > snp144Common.bed

This download took about 4-5 minutes to complete.

The complete schema for the snp144Common table is available from UCSC. In the example above, we retrieve data for the chrom, chromStart, chromEnd and name fields. You can add other fields to the example command if they are useful to you, such as the observed and func annotations.

Once you have this file of SNPs, you can use awk to find the position of a single SNP, given its ID.

For example:

$ awk -v id='rs10409603' '{ if ($4 == id) { print $0; exit; } }' snp144Common.bed
chr19   8313572 8313573 rs10409603
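
One thing to keep in mind when reading this output: BED files and UCSC tables store chromStart as a 0-based coordinate and chromEnd as an exclusive end, so for a single-base SNP the 1-based position that most other resources report is chromStart + 1 (which here equals chromEnd). A minimal sketch of that conversion, using the example line above as inline demo data; the output column order is only an illustrative choice:

```shell
# Demo input: the BED line from the example above (tab-separated).
bed_line=$(printf 'chr19\t8313572\t8313573\trs10409603')

# Print rsID, chromosome and the 1-based position (chromStart + 1).
one_based=$(printf '%s\n' "$bed_line" | awk 'BEGIN { FS = OFS = "\t" } { print $4, $1, $2 + 1 }')
printf '%s\n' "$one_based"
```

Insertions and multi-base variants are encoded differently, so for those it is safer to look at both chromStart and chromEnd than to rely on this shortcut.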

If you have a list of IDs, you can use grep -F -f <filename> and pass in a file containing a list of IDs to do fixed-string (quick) searches against.

For example:

$ grep -F -f list-of-SNP-IDs.txt snp144Common.bed > answer.bed
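
One caveat with this approach: grep -F matches substrings, so an ID such as rs123 in the list would also pull out rs1234 and rs12345. Adding -w (whole-word matching) to the grep command is one way around that; another is to let awk compare the fourth column exactly. A minimal sketch of the awk variant, using small made-up demo files (demo-ids.txt and demo-snps.bed and most of their contents are hypothetical, standing in for list-of-SNP-IDs.txt and snp144Common.bed):

```shell
# Work in a scratch directory so the demo files do not clutter anything.
workdir=$(mktemp -d)

# Hypothetical ID list, one rsID per line.
printf 'rs123\nrs10409603\n' > "$workdir/demo-ids.txt"

# Hypothetical BED file; only the rs10409603 line is taken from the example above.
printf 'chr1\t100\t101\trs1234\nchr19\t8313572\t8313573\trs10409603\nchr2\t200\t201\trs123\n' > "$workdir/demo-snps.bed"

# First file: remember every ID. Second file: print lines whose 4th column is one of them.
awk 'NR == FNR { ids[$1]; next } $4 in ids' "$workdir/demo-ids.txt" "$workdir/demo-snps.bed" > "$workdir/demo-answer.bed"

cat "$workdir/demo-answer.bed"
```

Here the rs1234 line is left out even though rs123 is in the list, because the comparison is on the whole ID rather than on a substring.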

Learning a few command-line basics will pay massive dividends in the long term.


Could you tell me what the difference is between snp144.txt.gz and snp144Common? The former contains more than 130 million SNPs, while snp144Common contains only 14,760,200 SNPs. Thank you very much!

Reply written 7 months ago by yliueagle

I think this page answers my question: Thank you!

Reply written 7 months ago by yliueagle

Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 4.0 years ago:

For non-bioinformaticians, I would do the following:

- Download and open the KNIME workbench, and create a new workflow.

- Download the SNP table from UCSC and open it in the workflow (using a 'Read File' node).

- Load your list of SNP names using another 'Read File' node.

- Use a 'Join' node to get the intersection of the two previous nodes on the SNP name.
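
For readers who end up on the command line instead, the same join step can be sketched with the standard sort and join utilities. This is only an illustration of what the join does, not part of the KNIME workflow; the file names and demo contents below are hypothetical.

```shell
workdir=$(mktemp -d)
tab=$(printf '\t')

# Hypothetical inputs: a list of SNP names and a small SNP table (chrom, start, end, name).
printf 'rs10409603\nrs999\n' > "$workdir/ids.txt"
printf 'chr19\t8313572\t8313573\trs10409603\nchr1\t100\t101\trs555\n' > "$workdir/snps.bed"

# join needs both inputs sorted on the join field with the same collation (LC_ALL=C).
LC_ALL=C sort "$workdir/ids.txt" > "$workdir/ids.sorted"
LC_ALL=C sort -t "$tab" -k4,4 "$workdir/snps.bed" > "$workdir/snps.sorted"

# Join column 1 of the ID list with column 4 (the SNP name) of the table.
joined=$(LC_ALL=C join -t "$tab" -1 1 -2 4 "$workdir/ids.sorted" "$workdir/snps.sorted")
printf '%s\n' "$joined"
```

Only names present in both inputs survive, and join prints the SNP name first, followed by the remaining table columns.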


Ibrahim Tanyalcin wrote, 3.9 years ago:


I created a tool for myself a year ago for visualizing SNVs for a specific gene name. Whether you use a VCF file or a variant file from BioMart, you can generate these graphs for a given gene. If your SNPs/SNVs are gene based, you can easily generate graphs for them.

Maybe it helps,
