Question: ClinVar download sources
Vivek (Denmark) wrote, 6.0 years ago:

Just wanted to check whether anyone else is having issues downloading the VCF files from the ClinVar FTP source.

ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/

This is the error I'm running into:

550 /pub/clinvar/vcf_GRCh37: No such file or directory

It looks like the directory has been missing since the last update.

Are there any alternate sources to get this from? I also noticed that the web version lists many more variants per gene than the previous download release did. I hope this will be fixed in the current release.
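For anyone wanting to check which entries are actually present on the server, the directory listing can be compared against the names you expect. This is only a minimal sketch using Python's standard `ftplib`, assuming the server accepts anonymous logins; the helper names `list_dir` and `missing_entries` are made up for illustration and are not part of any NCBI tooling.

```python
from ftplib import FTP


def list_dir(host, path):
    """Return the entry names under `path` on an FTP server.

    Assumes anonymous login is allowed (an assumption, not verified here).
    """
    with FTP(host) as ftp:
        ftp.login()      # anonymous login
        ftp.cwd(path)
        return ftp.nlst()


def missing_entries(listing, expected):
    """Return the names in `expected` that do not appear in `listing`.

    Entries may be bare names or full paths; only the last path
    component is compared.
    """
    present = {name.rstrip("/").split("/")[-1] for name in listing}
    return [name for name in expected if name not in present]
```

Something like `missing_entries(list_dir("ftp.ncbi.nlm.nih.gov", "/pub/clinvar"), ["vcf_GRCh37"])` should then report `vcf_GRCh37` for as long as the directory is absent.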

 


Don't feel too aggrieved; the vcf_GRC38 link is broken too.

written 6.0 years ago by Daniel Swan

Sean Davis (National Institutes of Health, Bethesda, MD) wrote, 6.0 years ago:

The dbSNP FTP site also hosts the ClinVar VCF (and has for a while).

ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/

As of today, the header contains the following metadata:

##fileformat=VCFv4.0
##fileDate=20141009
##source=ClinVar and dbSNP
##dbSNP_BUILD_ID=142
##reference=GRCh38
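
To check the build and file date before using a download, the meta lines can be read into a dictionary. A small sketch, assuming the plain `##key=value` form shown above; `parse_vcf_meta` is a hypothetical helper name, and structured lines such as `##INFO=<...>` are deliberately skipped.

```python
def parse_vcf_meta(lines):
    """Collect simple `##key=value` VCF meta lines into a dict.

    Structured entries (values starting with "<", e.g. ##INFO=<...>)
    are skipped; reading stops at the #CHROM column header.
    """
    meta = {}
    for line in lines:
        line = line.strip()
        if line.startswith("#CHROM"):
            break                      # end of the header block
        if not line.startswith("##"):
            continue                   # ignore anything that is not a meta line
        key, sep, value = line[2:].partition("=")
        if sep and not value.startswith("<"):
            meta[key] = value
    return meta
```

With the header above, `parse_vcf_meta(header_lines)["reference"]` would give `GRCh38`, which is a quick way to confirm that a file matches the genome build you expect.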

 


Thanks! I noticed today that the ClinVar FTP is also up and running again. Hopefully they have reconciled the differences between the web and download versions.

written 6.0 years ago by Vivek
Charles Warden (Duarte, CA) wrote, 6.0 years ago:

Depending upon your application, you may be able to use tools like ANNOVAR (or ANNOVAR's database file) to get the relevant ClinVar stats:

http://www.openbioinformatics.org/annovar/annovar_filter.html#clinvar

I personally haven't tried to define the ClinVar variants from scratch.


Yes, I did see that. However, as I mentioned, the number of variants differs between the web version and the previous download release. For example, the gene CYP27A1 has 67 pathogenic variants in the web query:

http://www.ncbi.nlm.nih.gov/clinvar/?term=CYP27A1

However, only 15 show up in the FTP release. I was hoping the latest release, from earlier this month, would have fixed that.
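
To make the web-versus-FTP comparison reproducible, the per-gene count can be tallied straight from a downloaded VCF. This is a rough sketch only, under two assumptions about the file layout that should be checked against the header of the release you have: that the INFO column carries a `GENEINFO` entry of the form `SYMBOL:ID` (multiple genes separated by `|`) and a `CLNSIG` entry for clinical significance. The value that denotes "pathogenic" differs between releases, so it is passed in rather than hard-coded. The function names are illustrative.

```python
import re


def parse_info(info_field):
    """Split a VCF INFO column into a dict; flag-only keys map to True."""
    info = {}
    for item in info_field.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        info[key] = value if sep else True
    return info


def count_gene_variants(lines, gene, significance_values):
    """Count records annotated with `gene` whose CLNSIG contains any of
    `significance_values` (assumed layout; see the note above)."""
    wanted = set(significance_values)
    count = 0
    for line in lines:
        if line.startswith("#"):
            continue                   # skip meta and column-header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue                   # not a complete VCF record
        info = parse_info(cols[7])
        # GENEINFO assumed to look like "SYMBOL:ID|SYMBOL2:ID2"
        symbols = {g.split(":")[0] for g in str(info.get("GENEINFO", "")).split("|")}
        # CLNSIG may hold several values separated by "|" or ","
        values = set(re.split(r"[|,]", str(info.get("CLNSIG", ""))))
        if gene in symbols and values & wanted:
            count += 1
    return count
```

Running this with `gene="CYP27A1"` on the FTP file, using whichever CLNSIG value the header defines as pathogenic, gives a number to compare directly with the web query.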

written 6.0 years ago by Vivek

Yeah, there might be some sort of delay.  You could try contacting somebody from NCBI to see if they can help (or at least confirm that the newer annotations are not currently available from the FTP):

https://www.ncbi.nlm.nih.gov/About/glance/contact_info.html

written 6.0 years ago by Charles Warden