Tutorial:Automated dbSNP lookup by rsID position, plus genome build liftover
0
3
Entering edit mode
7 weeks ago

Hola, just passing by to say 'hi'.

Please post bugs / suggestions as comments to this tutorial.

Automated dbSNP rsID position lookup

rsID to position

GRCh38

cat rsids.list
rs1296488112
rs1226262848
rs1225501837
rs1484860612
rs1235553513
rs1424506967

cat rsids.list | while read rsid ;
do
  pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS\>.{0,15}' | cut -f2 -d">") ;
  echo -e "${pos}""\t""${rsid}" ;
done ;

1:13616 rs1296488112
1:13621 rs1226262848
1:13623 rs1225501837
1:13634 rs1484860612
1:13646 rs1235553513
1:13647 rs1424506967

rsID to position

GRCh37

This is the same as above, except that the tag returned that contains the GRCh37 co-ordinates is CHRPOS_PREV_ASSM, not CHRPOS:

cat rsids.list
rs1296488112
rs1226262848
rs1225501837
rs1484860612
rs1235553513
rs1424506967

cat rsids.list | while read rsid ;
do
  pos=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&id=$rsid&retmode=text&rettype=text" | sed 's/<\//\n/g' | grep -o -P '\<CHRPOS_PREV_ASSM\>.{0,15}' | cut -f2 -d">") ;
  echo -e "${pos}""\t""${rsid}" ;
done ;

1:13616 rs1296488112
1:13621 rs1226262848
1:13623 rs1225501837
1:13634 rs1484860612
1:13646 rs1235553513
1:13648 rs1424506967

Position to rsID

GRCH38

Here, we have to be wary that more than a single rsID can map to a single position.

cat grch38.bed
chr1    13616   13616
chr1    13621   13621
chr1    13623   13623
chr1    13634   13634
chr1    13646   13646
chr1    13647   13647

sed 's/^chr//g' grch38.bed | cut -f1,2 | while read chr pos ;
do
  rsid=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=snp&term=$pos%5BPOSITION%5D+AND+$chr%5BCHR%5D" | sed 's/<\//\n/g' | grep -e "<Id>" | cut -f2 -d">" | awk '{print "rs"$0}' | tr '\n' ',' | sed 's/,$//g') ;
  echo -e "${chr}"":""${pos}""\t""${rsid}" ;
done ;

1:13616 rs1296488112
1:13621 rs1226262848
1:13623 rs1225501837
1:13634 rs1639595957,rs1484860612
1:13646 rs1235553513
1:13647 rs1424506967

GRCH37

cat grch37.bed
chr1    13616   13616
chr1    13621   13621
chr1    13623   13623
chr1    13634   13634
chr1    13646   13646
chr1    13648   13648

sed 's/^chr//g' grch38.bed | cut -f1,2 | while read chr pos ;
do
  rsid=$(curl -sX GET "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=snp&term=$pos%5BPOSITION_GRCH37%5D+AND+$chr%5BCHR%5D" | sed 's/<\//\n/g' | grep -e "<Id>" | cut -f2 -d">" | awk '{print "rs"$0}' | tr '\n' ',' | sed 's/,$//g') ;
  echo -e "${chr}"":""${pos}""\t""${rsid}" ;
done ;

1:13616 rs1296488112
1:13621 rs1226262848
1:13623 rs1225501837
1:13634 rs1639595957,rs1484860612
1:13646 rs1235553513
1:13648 rs1424506967

.



Automated genome build lift-over

For automated lift-over in R via rtracklayer, please try this:

  # chain file for hg19 to hg38
    download.file('http://hgdownload.soe.ucsc.edu/goldenPath/hg19/liftOver/hg19ToHg38.over.chain.gz', 'hg19ToHg38.over.chain.gz')
    R.utils::gunzip('hg19ToHg38.over.chain.gz')

  # chain file for hg38 to hg19
    download.file('http://hgdownload.soe.ucsc.edu/goldenPath/hg38/liftOver/hg38ToHg19.over.chain.gz', 'hg38ToHg19.over.chain.gz')
    R.utils::gunzip('hg38ToHg19.over.chain.gz')

  # import variant positions
    positions <- read.table('grch37.bed', sep = '\t', header = FALSE, stringsAsFactors = FALSE)

  # perform liftover, assuming hg19 to hg38
    library(rtracklayer)
    grObject <- GRanges(
      seqnames = positions[,1],
      ranges = IRanges(start = positions[,2], end = positions[,2]))
    chainObject <- import.chain('hg19ToHg38.over.chain')
    as.data.frame(liftOver(grObject, chainObject))[,c('seqnames','start','end','width')]

  seqnames start   end width
1     chr1 13616 13616     1
2     chr1 13621 13621     1
3     chr1 13623 13623     1
4     chr1 13634 13634     1
5     chr1 13646 13646     1
6     chr1 13647 13647     1

Gracias,

Kevin

rsid hg19 liftover hg38 dbsnp • 201 views
ADD COMMENT

Login before adding your answer.

Traffic: 1843 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6