Question: dbSNP in VCF format compatible with hg38
0
2.8 years ago by
apocalyps5240
apocalyps5240 wrote:

Hello,

I need to annotate the SNPs in a VCF file that I generated from the hg38 reference. I am trying to use SnpSift for the annotation, but I can't find a dbSNP VCF compatible with hg38. All the links I found online point to the NCBI dbSNP FTP server, where the reference files are for the GRCh38 build.

Can anybody help me find the hg38 dbSNP VCF file?

Thanks in advance.

snp hg38 database assembly • 2.0k views
ADD COMMENTlink modified 2.8 years ago • written 2.8 years ago by apocalyps5240
1

They are the same! Since the last release (GRCh38, Dec. 2013), the hgXX and GRChXX naming styles have been made uniform, so GRCh38 and hg38 are the same genome version.

The previous one was GRCh37, or hg19.

ADD REPLYlink written 2.8 years ago by Amitm1.6k
3
2.8 years ago by
apocalyps5240
apocalyps5240 wrote:

Thanks guys for your suggestions. I found the following perl script that solved my problem.

perl -pe 's/^([^#])/chr$1/' myfile.vcf > newfile.vcf

ADD COMMENTlink written 2.8 years ago by apocalyps5240
1
2.8 years ago by
agata88770
Poland
agata88770 wrote:

As far as I know GRCh38 is hg38 :) And GRCh37 is hg19. Best, Agata

ADD COMMENTlink modified 2.8 years ago • written 2.8 years ago by agata88770
1
2.8 years ago by
agata88770
Poland
agata88770 wrote:

If I understand correctly, the files are incompatible: one uses chr1 and the other uses 1. If that is the problem, the easiest fix is to change the VCF file, replacing 1 with chr1 or the reverse. You can do that by importing the file into Excel and using the replace function... or by writing your own script, for example in Python.

Best,

Agata
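
For the Python route, a minimal sketch of such a script could look like the one below. The function name toggle_chr_prefix and the file names in the usage comment are made up for illustration. Unlike a plain text replace, it also rewrites the ##contig header lines. It only adds or strips the "chr" prefix, so it does not handle cases such as MT versus chrM or unplaced scaffolds.

```python
import re


def toggle_chr_prefix(lines, add=True):
    """Yield VCF lines with 'chr' added to (or stripped from) contig names.

    Hypothetical helper: 'lines' is any iterable of VCF text lines.
    """
    for line in lines:
        if line.startswith("##contig="):
            # Header declaration, e.g. ##contig=<ID=1,length=248956422>
            if add:
                line = re.sub(r"ID=(?!chr)", "ID=chr", line, count=1)
            else:
                line = re.sub(r"ID=chr", "ID=", line, count=1)
        elif line.strip() and not line.startswith("#"):
            # Data line: CHROM is the first tab-separated field
            if add and not line.startswith("chr"):
                line = "chr" + line
            elif not add and line.startswith("chr"):
                line = line[3:]
        yield line


# Usage sketch (file names are placeholders):
# with open("myfile.vcf") as src, open("newfile.vcf", "w") as dst:
#     dst.writelines(toggle_chr_prefix(src, add=True))
```

Other header lines (##fileformat, ##INFO, #CHROM and so on) pass through unchanged.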

ADD COMMENTlink written 2.8 years ago by agata88770
2

No, no, no: no importing into Excel to change that. Just use a Unix tool like sed, or use the same reference data for mapping and for annotation to avoid these problems altogether.
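
Before renaming anything, it can help to confirm that the two files really disagree on contig names. The following is only a rough sketch with hypothetical function names (contig_names, naming_mismatch); it reads the CHROM column of the data lines and reports names that appear in the sample VCF but not in the annotation VCF.

```python
def contig_names(lines):
    """Return the set of CHROM values found on the data lines of a VCF."""
    return {
        line.split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("#")
    }


def naming_mismatch(sample_lines, annotation_lines):
    """Return contigs used by the sample VCF that the annotation VCF lacks."""
    return contig_names(sample_lines) - contig_names(annotation_lines)


# Usage sketch (file names are placeholders):
# with open("sample.vcf") as s, open("dbsnp.vcf") as d:
#     print(sorted(naming_mismatch(s, d)))
```

If the result lists names like chr1 and chr2, the files use different naming styles and the annotation will find no matches until one of them is renamed.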

ADD REPLYlink written 2.8 years ago by WouterDeCoster38k
1

Why are you all so "NO" to Excel? It's a tool like any other :)

ADD REPLYlink written 2.8 years ago by agata88770
3

I work in a mainly wet lab group and I've seen terrible things happening in excel which will likely haunt me forever and result in years of therapy.

ADD REPLYlink written 2.8 years ago by WouterDeCoster38k
1

Using Excel instead of sed in this situation is like buying a whole shed's worth of tools when all you need is a screwdriver.

ADD REPLYlink written 2.8 years ago by spvensko180
1

In addition, you take your hammer out of that shed after buying it and knock the screw into the wood with the handle.

ADD REPLYlink written 2.8 years ago by WouterDeCoster38k

Everybody has Excel (especially students), so why not use it :) I agree that it may not be a perfect solution in this situation... but it is still a solution.

ADD REPLYlink written 2.8 years ago by agata88770

I used Excel to make the changes in my VCF file. After saving, it no longer works as a VCF file.

ADD REPLYlink written 2.8 years ago by apocalyps5240

You need to copy the columns yourself into a plain text file (e.g. in Notepad) and save it with a .vcf extension :) Don't give up, it will work :) PS. Don't forget to add the header lines starting with # from your raw VCF at the beginning.

ADD REPLYlink modified 2.8 years ago • written 2.8 years ago by agata88770

I know you mean well but this is just bad advice, can't put it any nicer.

ADD REPLYlink written 2.8 years ago by WouterDeCoster38k
0
2.8 years ago by
apocalyps5240
apocalyps5240 wrote:

Hello,

Thanks for your answer.

As I mentioned, my original reference is in UCSC format, so the chromosome names look like "chrX", whereas GRCh38 uses only the number for each chromosome.

How can I overcome this incompatibility?

Thanks

ADD COMMENTlink written 2.8 years ago by apocalyps5240

Go back to the original data and do it again with the reference you'd like to use now. Or maybe use the dbSNP file from the GATK bundle: https://software.broadinstitute.org/gatk/download/bundle

ADD REPLYlink modified 2.8 years ago • written 2.8 years ago by Zaag720

@Devon Ryan has done the mapping for you, as explained in this post. Just go to his repository to get the mapping from the GRCh38 nomenclature to the hg38 one.
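
A mapping table like that could be applied roughly as follows. This is a sketch under assumptions: the table is taken to be a two-column, tab-separated file (source name, then target name), rows with an empty target are skipped, and the function and file names are hypothetical. Records on contigs without a mapping are left unchanged.

```python
def load_mapping(rows):
    """Parse rows of 'source<TAB>target' into a dict, skipping empty targets."""
    mapping = {}
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) == 2 and fields[1]:
            mapping[fields[0]] = fields[1]
    return mapping


def rename_contigs(lines, mapping):
    """Yield VCF lines with the CHROM field renamed through 'mapping'."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            # Header and blank lines pass through untouched
            yield line
            continue
        chrom, sep, rest = line.partition("\t")
        # Contigs missing from the mapping keep their original name
        yield mapping.get(chrom, chrom) + sep + rest


# Usage sketch (file names are placeholders):
# with open("mapping.txt") as m:
#     table = load_mapping(m)
# with open("dbsnp.vcf") as src, open("dbsnp.hg38names.vcf", "w") as dst:
#     dst.writelines(rename_contigs(src, table))
```

Compared with blindly prepending "chr", a lookup table can also cover names that differ in more than the prefix, such as MT and chrM. The ##contig header lines are not rewritten here and would need the same treatment if downstream tools validate them.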

ADD REPLYlink written 2.8 years ago by Denise - Open Targets4.9k