How to download dbSNP153 VCF files in hg19/GRCh37 version
21 months ago
Shicheng Guo ★ 8.7k

Hi All,

I noticed that dbSNP152 has been updated to dbSNP153 when I searched for rs533316401:

https://www.ncbi.nlm.nih.gov/snp/rs533316401

Released July 9, 2019

Does anyone have the VCF files for dbSNP153 on the hg19 genome assembly?

Thanks.

Okay. With help from xx and xx, the problem is solved:

Here is hg19:

wget https://ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.25.gz
wget https://ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.25.gz.tbi
wget https://raw.githubusercontent.com/Shicheng-Guo/AnnotationDatabase/master/GCF_000001405.25_GRCh37.p13_assembly_report.txt

Here is hg38:

wget https://ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.38.gz
wget https://ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.38.gz.tbi

NB: hg38 and hg19

Thank you Shicheng Guo for this information.

It's also important to highlight that in this VCF version the chromosome names are different: they are in NCBI RefSeq format (something like NC_... or NT_...). Also, I didn't find variants from the MT chromosome in those VCF files.
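
One way to check which contig names a file uses, and whether the mitochondrial accession shows up at all, is to tally the CHROM column. The sketch below runs on a tiny hand-made file (toy.vcf.gz, with made-up records and rs IDs) so it is self-contained; on the real download you would point it at GCF_000001405.25.gz instead. It assumes NC_012920.1 is the RefSeq accession of the mitochondrial genome and relies on bgzip output being readable by plain gzip.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy stand-in for the dbSNP file: made-up records with RefSeq-style contig names.
vcf=toy.vcf.gz
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'NC_000001.10\t10019\trs1\tTA\tT\t.\t.\t.\n'
  printf 'NC_000001.10\t10039\trs2\tA\tC\t.\t.\t.\n'
  printf 'NC_000023.10\t60020\trs3\tT\tC\t.\t.\t.\n'
} | gzip -c > "$vcf"

# Count records per contig (CHROM is column 1; header lines start with '#').
contig_counts=$(gzip -dc "$vcf" \
  | awk -F'\t' '!/^#/ { n[$1]++ } END { for (c in n) print c, n[c] }' \
  | sort)
echo "$contig_counts"

# Report whether any record sits on the assumed mitochondrial accession.
if grep -q '^NC_012920\.1 ' <<<"$contig_counts"; then
  mt_present=yes
else
  mt_present=no
fi
echo "MT records present: $mt_present"
```

On the full file this scan is slow; when the .tbi index is next to the VCF, `tabix -l GCF_000001405.25.gz` lists the indexed contig names much faster.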

If you want these chromosome names in UCSC (hg19) format (or another format), you may need additional steps. You should read this and this

For now, this is the best solution I have found, as I need chromosome names in UCSC hg19 format.
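
As a rough illustration of what such a renaming step does, here is a sketch in plain awk. It is a stand-in for `bcftools annotate --rename-chrs` (used further down in this thread), not a replacement for it: it only rewrites the CHROM column and any `##contig` header IDs. The file names, the two-column map, and the records are all hypothetical toy data.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical two-column map: RefSeq accession, then UCSC-style name.
printf 'NC_000001.10 chr1\nNC_000023.10 chrX\n' > toy.map

# Toy uncompressed VCF with RefSeq contig names (made-up records).
{
  printf '##fileformat=VCFv4.2\n'
  printf '##contig=<ID=NC_000001.10>\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'NC_000001.10\t10019\trs1\tTA\tT\t.\t.\t.\n'
  printf 'NC_000023.10\t60020\trs2\tT\tC\t.\t.\t.\n'
  printf 'NT_113878.1\t500\trs3\tG\tA\t.\t.\t.\n'
} > toy.refseq.vcf

awk '
  NR == FNR { map[$1] = $2; next }                   # first file: load old -> new names
  /^##contig=<ID=/ {
    rest = substr($0, length("##contig=<ID=") + 1)   # text after "ID="
    id = rest; sub(/[,>].*/, "", id)                 # accession up to "," or ">"
    if (id in map) $0 = "##contig=<ID=" map[id] substr(rest, length(id) + 1)
    print; next
  }
  /^#/ { print; next }                               # other header lines pass through
  { if ($1 in map) $1 = map[$1]; print }             # records: rewrite CHROM only
' toy.map FS='\t' OFS='\t' toy.refseq.vcf > toy.ucsc.vcf

cat toy.ucsc.vcf
```

Contigs missing from the map (NT_113878.1 here) are left unchanged, so it is worth checking the output for leftover NC_/NT_/NW_ names before using it downstream.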

Hope this helps!

21 months ago
igor 12k

Both versions are available on the FTP site. GCF_000001405.25 is the RefSeq assembly accession corresponding to GRCh37.p13.

RefSNP VCF files for GRC (Genome Reference Consortium) human assembly 37 (GCF_000001405.25) and 38 (GCF_000001405.38). Files are compressed by bgzip and with the tabix index.

Source: https://ftp.ncbi.nih.gov/snp/archive/b153/00readme.txt

21 months ago

I don't think there is one available at the moment, but you can first get one for hg38:

wget -4 -c https://ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.38.gz

wget -4 -c https://ftp.ncbi.nih.gov/snp/redesign/latest_release/VCF/GCF_000001405.38.gz.tbi

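Then lift it over to GRCh37/hg19 using CrossMap: http://crossmap.sourceforge.net/ (note that the dbSNP VCF uses RefSeq contig names such as NC_..., while the UCSC chain file uses chr names, so the contigs need to be renamed before the liftover):

python CrossMap.py vcf hg38ToHg19.over.chain.gz GCF_000001405.38.gz hg19.fa GCF_000001405.hg19.vcf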
5 weeks ago
Shicheng Guo ★ 8.7k

dbSNP154 is available; here is a script for preprocessing it:

## 05/09/2021: 2020-05-26 13:48 -- dbSNP154
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.25.gz
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.25.gz.md5
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.25.gz.tbi
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.25.gz.tbi.md5
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.38.gz
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.38.gz.md5
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.38.gz.tbi
wget https://ftp.ncbi.nih.gov/snp/archive/b154/VCF/GCF_000001405.38.gz.tbi.md5
wget https://raw.githubusercontent.com/Shicheng-Guo/AnnotationDatabase/master/GCF_000001405.25_GRCh37.p13_assembly_report.txt
wget https://raw.githubusercontent.com/Shicheng-Guo/AnnotationDatabase/master/GCF_000001405.38_GRCh38.p12_assembly_report.txt
awk -v RS="(\r)?\n" 'BEGIN { FS="\t" } !/^#/ { if ($10 != "na") print $7,$10; else print $7,$5 }' GCF_000001405.25_GRCh37.p13_assembly_report.txt > dbSNP-to-UCSC-GRCh37.p13.map
awk -v RS="(\r)?\n" 'BEGIN { FS="\t" } !/^#/ { if ($10 != "na") print $7,$10; else print $7,$5 }' GCF_000001405.38_GRCh38.p12_assembly_report.txt > dbSNP-to-UCSC-GRCh38.p12.map
#sed -i '{s/chrX/23/g}' dbSNP-to-UCSC-GRCh37.p13.map
#sed -i '{s/chrY/24/g}' dbSNP-to-UCSC-GRCh37.p13.map
#sed -i '{s/chrM/25/g}' dbSNP-to-UCSC-GRCh37.p13.map
#sed -i '{s/chr//g}' dbSNP-to-UCSC-GRCh37.p13.map
# -O z writes bgzip-compressed output so the content matches the .vcf.gz file name
sbatch --job-name=dbsnp154 --output=dbsnp154.out ~/bin/sbatch.sh 'bcftools annotate --threads 48 --rename-chrs dbSNP-to-UCSC-GRCh37.p13.map GCF_000001405.25.gz -O z -o dbSNP154.hg19.vcf.gz'
sbatch --job-name=hg38 --mem=24G --output=hg38 ~/bin/sbatch.sh 'bcftools annotate --threads 48 --rename-chrs dbSNP-to-UCSC-GRCh38.p12.map GCF_000001405.38.gz -O z -o dbSNP154.hg38.vcf.gz'
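
To see what the two awk commands above produce, here is a small sketch on a hand-made excerpt of an assembly report. It assumes the layout the script relies on: tab-separated columns with the GenBank accession in column 5, the RefSeq accession in column 7, and the UCSC-style name in column 10, with "na" where no UCSC name exists. The scaffold row is invented for illustration, and the carriage return is stripped with sub() instead of a regex RS, which is a variation on the original command so that it does not depend on a particular awk.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy excerpt of an assembly report with CRLF line endings (last row is invented).
{
  printf '# Sequence-Name\tSequence-Role\tAssigned-Molecule\tAssigned-Molecule-Location/Type\tGenBank-Accn\tRelationship\tRefSeq-Accn\tAssembly-Unit\tSequence-Length\tUCSC-style-name\r\n'
  printf '1\tassembled-molecule\t1\tChromosome\tCM000663.1\t=\tNC_000001.10\tPrimary Assembly\t249250621\tchr1\r\n'
  printf 'X\tassembled-molecule\tX\tChromosome\tCM000685.1\t=\tNC_000023.10\tPrimary Assembly\t155270560\tchrX\r\n'
  printf 'toy_scaffold\tunplaced-scaffold\tna\tna\tXX000000.1\t=\tNT_000000.1\tPrimary Assembly\t1000\tna\r\n'
} > toy_assembly_report.txt

awk 'BEGIN { FS = "\t" }
     { sub(/\r$/, "") }                       # drop a trailing carriage return, if any
     /^#/ { next }                            # skip comment and header lines
     { print $7, ($10 != "na" ? $10 : $5) }   # RefSeq accession, then UCSC name or GenBank accession
' toy_assembly_report.txt > toy.map

cat toy.map
```

Each output line is an "old_name new_name" pair separated by whitespace, which is the format `bcftools annotate --rename-chrs` reads. Rows without a UCSC name fall back to the GenBank accession, so those contigs will not get a chr-style name.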