dbSNP in VCF format compatible with hg38
4
0
5.3 years ago
apocalyps52 ▴ 40

Hello,

I need to annotate the SNPs in a VCF file that I generated against the hg38 reference. I am trying to use SnpSift for the annotation, but I can't find a dbSNP VCF compatible with hg38. All the links I found online point to the NCBI dbSNP FTP server, where the reference files are for the GRCh38 build.

Can anybody help me find the hg38 dbSNP VCF file?

Thanks in advance.

snp database hg38 Assembly • 3.7k views
ADD COMMENT
1

They are the same! Since the last release (GRCh38, Dec. 2013), the hgXX and GRChXX version numbers have been made uniform, so GRCh38 and hg38 are the same genome version.

The previous one was GRCh37, or hg19.

ADD REPLY
3
5.3 years ago
apocalyps52 ▴ 40

Thanks, guys, for your suggestions. I found the following Perl one-liner, which solved my problem by adding "chr" to the start of every non-header line.

perl -pe 's/^([^#])/chr$1/' myfile.vcf > newfile.vcf

ADD COMMENT
1
5.3 years ago
agata88 ▴ 820

As far as I know GRCh38 is hg38 :) And GRCh37 is hg19. Best, Agata

ADD COMMENT
1
5.3 years ago
agata88 ▴ 820

If I understand correctly, your files are incompatible: one uses chr1 and the other uses 1. If that is the problem, the easiest fix is to edit the VCF file and replace 1 with chr1, or the other way around. You can do it by importing the file into Excel and using the replace function ...or write your own script, for example in Python.

Best,

Agata
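
For the Python route, a minimal sketch might look like the following. It is only an illustration, not anything posted in this thread: the function name add_chr_prefix and the example records are made up, and it assumes the mitochondrion is named "MT" in the file without prefixes and "chrM" in the UCSC-style file.

```python
import io


def add_chr_prefix(lines):
    """Yield VCF lines with 'chr' prepended to the CHROM column."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            # Header and meta lines (and blank lines) pass through unchanged.
            yield line
            continue
        chrom, sep, rest = line.partition("\t")
        if not chrom.startswith("chr"):
            # Assumption: "MT" corresponds to UCSC-style "chrM".
            chrom = "chrM" if chrom == "MT" else "chr" + chrom
        yield chrom + sep + rest


# Made-up records, only to show the transformation.
example = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t10177\t.\tA\tAC\t.\t.\t.\n"
    "MT\t73\t.\tA\tG\t.\t.\t.\n"
)
converted = list(add_chr_prefix(example))
print("".join(converted), end="")
```

Note that this sketch leaves ##contig header lines untouched, so tools that validate contig names against the header may still complain. For a real file you would pass an open file handle instead of the in-memory example.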

ADD COMMENT
2

No, no, no, no importing in excel to change that. Just use a unix tool like sed or use the same data for mapping and for annotation to avoid these problems altogether.

ADD REPLY
1

Why are you all so "NO" to Excel? It's a tool like any other :)

ADD REPLY
4

I work in a mainly wet lab group and I've seen terrible things happening in excel which will likely haunt me forever and result in years of therapy.

ADD REPLY
1

Using Excel instead of sed in this situation is like buying a whole shed's worth of tools when all you need is a screwdriver.

ADD REPLY
1

In addition, you take your hammer out of that shed after buying it and knock the screw into the wood with the handle.

ADD REPLY
0

Everybody has Excel (especially students), so why not use it :) I agree that it may not be a perfect solution in this situation ...but it is still a solution.

ADD REPLY
0

I used Excel to make changes to my VCF file. After saving, it no longer works as a VCF file.

ADD REPLY
0

You need to copy the columns yourself into a plain text file (e.g. in Notepad) and save it with the .vcf extension :) Don't give up, it will work :) PS. Don't forget to add the lines starting with # from your original VCF at the beginning.

ADD REPLY
0

I know you mean well, but this is just bad advice; I can't put it any nicer.

ADD REPLY
0
5.3 years ago
apocalyps52 ▴ 40

Hello,

Thanks for your answer.

As I mentioned, my original reference is in UCSC format, so chromosomes are named like "chrX", whereas the GRCh38 files use only the bare number or letter for each chromosome.

How can I overcome this incompatibility?

Thanks
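
Before renaming anything, it can help to confirm that the naming style really is the mismatch. The sketch below is an assumption-laden illustration (the helper contig_names and both tiny in-memory files are invented for the example): it collects the CHROM values from two VCFs and reports which names they share.

```python
import io


def contig_names(lines):
    """Return the set of CHROM values found in the data lines of a VCF."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header, meta and blank lines
        names.add(line.split("\t", 1)[0])
    return names


# Hypothetical stand-ins for the user's VCF and the dbSNP VCF.
my_vcf = io.StringIO("#CHROM\tPOS\nchr1\t100\nchrX\t200\n")
dbsnp_vcf = io.StringIO("#CHROM\tPOS\n1\t100\nX\t200\n")

mine = contig_names(my_vcf)
theirs = contig_names(dbsnp_vcf)
shared = mine & theirs

print("shared contig names:", sorted(shared))
print("only in my VCF:", sorted(mine - theirs))
print("only in dbSNP VCF:", sorted(theirs - mine))
```

An empty "shared" list is the signature of the chr-prefix problem described here. For real, compressed files, gzip.open(path, "rt") from the standard library can supply the lines instead of io.StringIO.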

ADD COMMENT
0

Go back to the original data and do it again with the reference you'd like to use now. Or maybe use the dbSNP file from the GATK bundle: https://software.broadinstitute.org/gatk/download/bundle

ADD REPLY
0

@Devon Ryan has done the mapping for you, as explained in this post. Just go to his repository to get the mapping from the GRCh38 nomenclature to the hg38 one.
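
Applying such a mapping could be sketched as below. This assumes the mapping is a two-column, tab-separated table (source name, target name) in which a row may have an empty second column. The function names, the inline table and the contig called unplaced_example are all hypothetical and not taken from the repository.

```python
import io


def load_mapping(lines):
    """Parse a two-column, tab-separated table: source name, target name."""
    mapping = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip rows without a target name (contig with no counterpart).
        if len(fields) >= 2 and fields[0] and fields[1]:
            mapping[fields[0]] = fields[1]
    return mapping


def rename_contigs(lines, mapping):
    """Yield VCF lines with CHROM renamed; drop records on unmapped contigs."""
    for line in lines:
        if line.startswith("#"):
            yield line  # header and meta lines are kept as they are
            continue
        chrom, sep, rest = line.partition("\t")
        if chrom in mapping:
            yield mapping[chrom] + sep + rest


# Hypothetical mapping table and VCF, only for illustration.
table = io.StringIO("1\tchr1\nX\tchrX\nMT\tchrM\nunplaced_example\t\n")
vcf = io.StringIO("#CHROM\tPOS\n1\t100\nMT\t73\nunplaced_example\t5\n")

mapping = load_mapping(table)
renamed = list(rename_contigs(vcf, mapping))
print("".join(renamed), end="")
```

Silently dropping records on unmapped contigs is a design choice made for this sketch; depending on the analysis, failing loudly or keeping the original name may be safer. The ##contig header lines are not rewritten here either.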

ADD REPLY
