The allele with index 2 is not defined in the REF/ALT columns" error after AF filtering with gnomAD Exome
0
0
Entering edit mode
4 months ago

I'm running a germline variant calling pipeline with bcftools, and I annotate/filter variants based on allele frequency from gnomAD exomes (gnomad.exomes.r2.1.1.sites.vcf.bgz). After filtering, I get an error in IGV when loading the VCF:

Error loading features for interval: chr1:114940611-114940651 htsjdk.tribble.TribbleException$InternalCodecException: The allele with index 2 is not defined in the REF/ALT columns in the record

chr1 114940632 . C CA 115.328 PASS INDEL;IDV=32;...;AC=1,1;AN=2;... GT:PL 1/2:176,86,79,72,0,93 This genotype 1/2 suggests two alternate alleles, but only one ALT allele CA is listed.

Here are the key steps I use:

    # 1. Variant calling
bcftools mpileup -Ou -f genome.fa sample.bam |
bcftools call -mv -Ou |
bcftools filter -s LowQual -e 'QUAL<30 || DP<10' -Ov -o sample_raw.vcf

# 2. Compress + index
bgzip sample_raw.vcf && tabix -p vcf sample_raw.vcf.gz

# 3. Annotate with gnomAD (AF)
bcftools annotate \
  -a gnomad.exomes.r2.1.1.sites.vcf.bgz \
  -c CHROM,POS,REF,ALT,INFO/AF \
  sample_raw.vcf.gz -Oz -o sample_annotated.vcf.gz
tabix -p vcf sample_annotated.vcf.gz

# 4. Normalize (split multiallelics)
bcftools norm -m -any sample_annotated.vcf.gz -Oz -o sample_norm.vcf.gz
tabix -p vcf sample_norm.vcf.gz

# 5. Filter by AF and quality
bcftools view -f PASS -i 'INFO/AF<0.05' sample_norm.vcf.gz -Oz -o sample_filtered.vcf.gz
tabix -p vcf sample_filtered.vcf.gz

# 6. (Optionally) Normalize again?
bcftools norm -m -any sample_filtered.vcf.gz -Oz -o sample_final.vcf.gz
tabix -p vcf sample_final.vcf.gz

I ALREADY CHECK THIS: VCF headers are intact (#CHROM, INFO, FORMAT, etc.)

bcftools view doesn't complain about the file.

IGV loads unfiltered/raw VCFs without issues.

Manual checks show many sites with 1/2 genotypes but only one ALT allele.

Normalization (bcftools norm -m -any) doesn't seem to resolve the mismatch in some cases.

I dont know where is my mistake. Any advice is appreciated, especially from anyone who's successfully integrated gnomAD Exome annotations into a clinical/exome pipeline without IGV errors!

gnomAD bcftools vcf • 893 views
ADD COMMENT
0
Entering edit mode

that would be a problem with bcftools norm. What is your version of bcftools ? Furthermore, show us the output of

bcftools view -G --no-header sample_norm.vcf.gz "chr1:114940632-114940632"
ADD REPLY
0
Entering edit mode

I am using bcftools 1.19 Using htslib 1.19

the ouput

chr1 114940632 . C CA 115.328 PASS INDEL;IDV=32;IMF=0.820513;DP=39;VDB=0.108342;SGB=-0.69312;RPBZ=-2.30802;MQBZ=0;MQSBZ=0;BQBZ=-1.75759;SCBZ=0;MQ0F=0;AC=1,1;AN=2;DP4=0,6,0,32;MQ=60;AF=0.000890251

ADD REPLY
0
Entering edit mode

there was already a problem before bcftools nom . Do you still have two alleles in sample_raw.vcf.gz at the same position ? if true you should upgrade bcftools and if the problem persists you should submit a bug report.

ADD REPLY
0
Entering edit mode

by the way, when using bcftools annotate this way, you would overwrite the original AF field with the value of gnomad.

https://samtools.github.io/bcftools/bcftools.html#annotate

The imported VCF annotations can be renamed as "DST_TAG:=SRC_TAG" or "FMT/DST_TAG:=FMT/SRC_TAG".

ADD REPLY
0
Entering edit mode

yes, but I think that I need it in this way because the main idea if filter by the population AF, so I dont need the AF of the sample

ADD REPLY

Login before adding your answer.

Traffic: 4666 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6