Question: GATK dbSNP for Ensembl
bharata1803 (Japan) wrote, 4.0 years ago:

Hello,

I want to call SNPs from exome sequencing data, and I found really good documentation on SEQanswers. I have a question about the quality score recalibration step in GATK (http://seqanswers.com/wiki/How-to/exome_analysis#Quality_score_recalibration), which uses dbSNP. The tutorial uses dbSNP data from UCSC, but I currently use Ensembl GRCh38 as my reference genome. Can I use the UCSC dbSNP file with reads aligned to the Ensembl reference? I also checked the Ensembl FTP site and found ftp://ftp.ensembl.org/pub/release-79/variation/vcf/homo_sapiens/ and ftp://ftp.ensembl.org/pub/release-79/variation/gvf/homo_sapiens/. Which one should I use, given that the tutorial uses a txt file from UCSC (I checked, and the UCSC file is still in txt format)? Thank you for your answer.

ensembl snp gatk • 1.9k views
Max Ivon (Russian Federation) wrote, 4.0 years ago:

You should use this file, ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b142_GRCh38/VCF/All.vcf.gz, if you have aligned your reads to the GRCh38 version of the genome. But I'm not sure you are following the right guide for the score recalibration. According to this post, http://gatkforums.broadinstitute.org/discussion/1248/countcovariates, the CountCovariates tool is no longer supported by GATK. For base quality recalibration it is recommended to use BaseRecalibrator, and after SNP calling on exome (or WGS) data it is recommended to perform variant quality score recalibration with VariantRecalibrator (not VariantFiltration, as SEQanswers says). You can find the documentation directly on the GATK site, and it is really good.
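Whichever known-sites file you pick, its contig names have to match the reference you aligned to: UCSC files use names like "chr1", while Ensembl references use "1". As a rough sanity check, here is a minimal Python sketch that compares the first column of a VCF with the first column of a FASTA index (.fai). The function names are made up for illustration, and it works on lines you have already read, so decompressing the .vcf.gz is left to you.

```python
def fasta_index_contigs(fai_lines):
    """Contig names from a FASTA index (.fai): first tab-separated column."""
    return {
        line.rstrip("\n").split("\t", 1)[0]
        for line in fai_lines
        if line.strip()
    }


def vcf_contigs(vcf_lines):
    """Contig names used by VCF records (header lines starting with '#' are skipped)."""
    names = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def unmatched_contigs(fai_lines, vcf_lines):
    """Sorted contig names that appear in the VCF but not in the reference index."""
    return sorted(vcf_contigs(vcf_lines) - fasta_index_contigs(fai_lines))
```

An empty result means every contig named in the VCF also exists in the reference. Names such as "chr1" in the result point to a UCSC-style file being paired with an Ensembl-style reference.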


Thank you. I'm using it right now, but I have a question. The description of BaseRecalibrator says: "This tool is designed to work as the first pass in a two-pass processing step." So what is the second pass? I cannot find the second step, and I checked that CountCovariates and TableRecalibration from the SEQanswers tutorial no longer exist.

Reply written 4.0 years ago by bharata1803

These may be useful: https://www.broadinstitute.org/gatk/guide/article?id=44 and https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_readutils_PrintReads.php. First you run BaseRecalibrator, which generates a .grp table, and then you run PrintReads with the -BQSR argument.
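To make the two passes concrete, here is a minimal Python sketch that only builds the two GATK 3.x command lines and prints them; it does not run anything. The jar path and all file names are placeholders I made up, so substitute your own.

```python
import shlex

# Assumed location of a GATK 3.x jar; adjust to your installation.
GATK_JAR = "GenomeAnalysisTK.jar"


def base_recalibrator_cmd(reference, bam, known_sites, table):
    """First pass: model the base quality errors and write the recalibration table."""
    return [
        "java", "-jar", GATK_JAR,
        "-T", "BaseRecalibrator",
        "-R", reference,
        "-I", bam,
        "-knownSites", known_sites,
        "-o", table,
    ]


def print_reads_cmd(reference, bam, table, out_bam):
    """Second pass: write a new BAM with the recalibration applied via -BQSR."""
    return [
        "java", "-jar", GATK_JAR,
        "-T", "PrintReads",
        "-R", reference,
        "-I", bam,
        "-BQSR", table,
        "-o", out_bam,
    ]


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    steps = (
        base_recalibrator_cmd("reference.fa", "sample.bam", "All.vcf.gz", "recal_data.grp"),
        print_reads_cmd("reference.fa", "sample.bam", "recal_data.grp", "sample.recal.bam"),
    )
    for cmd in steps:
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The table written by the first command is the only link between the two passes: the second command reads the original BAM again and applies that table. In GATK 4 the second pass is done by ApplyBQSR rather than PrintReads, so this sketch applies only to the 3.x releases discussed in this thread.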

Reply written 4.0 years ago by Max Ivon