GetPileupSummaries input reference - exomeseq
3.7 years ago
wiki97 ▴ 10

Hi, I want to run GetPileupSummaries (GATK 4.1.8.1) on a BAM file aligned to hg38. Could you recommend which VCF reference file I should use?

For the co-cleaning step with GATK RealignerTargetCreator, I used a reference from the NCBI repository as the "known_indels.vcf" file: https://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/

I wanted to use it for GetPileupSummaries as well, but I got the error: "A USER ERROR has occurred: Bad input: Population vcf does not have an allele frequency (AF) info field in its header."

I checked, and this dbSNP common VCF file indeed has no AF field. I found that allele frequencies are now reported in the CAF tag (see: https://www.ncbi.nlm.nih.gov/variation/docs/oldglossary_dbSNP1_vcf/).

Should I modify the reference to derive AF from the CAF field?
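
To make the question concrete, this is roughly the conversion I have in mind. It is only a sketch, assuming CAF is a comma-separated list that starts with the reference allele frequency and is followed by one value per ALT allele, with "." for missing values. The function name `caf_to_af` and the header description text are my own, not part of GATK or dbSNP, and I have not checked whether the result is a sensible input for GetPileupSummaries.

```python
# Header line that would have to be added to the VCF header, since the
# error message is about the AF field missing from the header.
# The Description text is my own wording.
AF_HEADER = (
    '##INFO=<ID=AF,Number=A,Type=Float,'
    'Description="Alt allele frequency derived from CAF">'
)


def caf_to_af(info: str) -> str:
    """Return the INFO string with an AF entry derived from CAF.

    Assumes CAF lists the reference allele frequency first, then one
    value per ALT allele. Missing values (".") are passed through.
    INFO strings without CAF, or that already have AF, are unchanged.
    """
    fields = info.split(";")
    if any(f.startswith("AF=") for f in fields):
        return info
    for field in fields:
        if field.startswith("CAF="):
            # Some older files wrap the list in square brackets.
            values = field[len("CAF="):].strip("[]").split(",")
            alt_values = values[1:]  # drop the reference allele entry
            if alt_values:
                return info + ";AF=" + ",".join(alt_values)
    return info


if __name__ == "__main__":
    print(AF_HEADER)
    print(caf_to_af("RS=123;CAF=0.9,0.1;COMMON=1"))
```

Each data line's INFO column (the eighth tab-separated column) would be passed through `caf_to_af`, and `AF_HEADER` would be written before the `#CHROM` line.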

Generally, gnomAD is the suggested reference for GetPileupSummaries, so I downloaded a reference file from here: https://gnomad.broadinstitute.org/downloads/

But I got the error: "A USER ERROR has occurred: Input files reference and features have incompatible contigs: Found contigs with the same name but different lengths."
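
To see which contigs the error refers to, I could compare the lengths declared in the VCF header with those in the reference index. The sketch below assumes the VCF header carries `##contig=<ID=...,length=...>` lines and that the reference has a samtools-style `.fai` index (contig name in column 1, length in column 2). The function names and the example lengths are made up for illustration.

```python
import re


def vcf_contig_lengths(header_lines):
    """Collect {contig name: length} from ##contig header lines."""
    lengths = {}
    for line in header_lines:
        if line.startswith("##contig="):
            name = re.search(r"ID=([^,>]+)", line)
            length = re.search(r"length=(\d+)", line)
            if name and length:
                lengths[name.group(1)] = int(length.group(1))
    return lengths


def fai_contig_lengths(fai_lines):
    """Collect {contig name: length} from the lines of a .fai index."""
    lengths = {}
    for line in fai_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) >= 2:
            lengths[parts[0]] = int(parts[1])
    return lengths


def mismatched_contigs(ref_lengths, vcf_lengths):
    """Return contigs present in both inputs whose lengths differ."""
    shared = ref_lengths.keys() & vcf_lengths.keys()
    return {
        name: (ref_lengths[name], vcf_lengths[name])
        for name in sorted(shared)
        if ref_lengths[name] != vcf_lengths[name]
    }


if __name__ == "__main__":
    # Toy values, not real chromosome lengths.
    fai = ["chrA\t1000\t6\t60\t61\n", "chrB\t500\t1030\t60\t61\n"]
    header = ["##contig=<ID=chrA,length=1000>", "##contig=<ID=chrB,length=480>"]
    print(mismatched_contigs(fai_contig_lengths(fai), vcf_contig_lengths(header)))
```

If every shared contig shows a different length, that would suggest the VCF was built against a different genome assembly than the one my BAM was aligned to.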

What would you suggest I do? I would be grateful for any help, thank you!

GATK hg38 GetPileupSummaries • 1.5k views