You can extract allele depth (AD) and read depth (DP) using the VariantAnnotation package that you mentioned. The readVcf function imports the VCF as a VCF object, and you can access these fields directly via the geno slot if they are present in the FORMAT section. For standard formats like GT:AD:AF:DP:F1R2:F2R1:FAD:SB, AD provides comma-separated depths for the reference and alternate alleles, while DP gives the total read depth.
Here is example code to extract them:
library(VariantAnnotation)
# Read the VCF file
vcf <- readVcf("your_file.vcf", "hg38")
# Extract allele depths (AD) as a matrix or array
ad <- geno(vcf)$AD
# Extract read depths (DP) as a matrix
dp <- geno(vcf)$DP
# View dimensions and first few entries
dim(ad)
head(ad[, 1:2]) # First variants, first two samples
dim(dp)
head(dp[, 1:2])
For non-standard formats like DP:FDP:SDP:SUBDP:AU:CU:GU:TU, these represent base-specific unfiltered and filtered counts (e.g., AU for A, with comma-separated values possibly indicating strands or qualities). The VariantAnnotation package still imports them via geno, but you must map them to AD based on the reference (REF) and alternate (ALT) alleles for each variant. Extract the relevant fields and compute allele depths manually.
Here is code to handle this:
# Extract base-specific fields
au <- geno(vcf)$AU
cu <- geno(vcf)$CU
gu <- geno(vcf)$GU
tu <- geno(vcf)$TU
# For each variant, map REF/ALT to depths (example for first sample)
ref_alt_depths <- sapply(seq_len(nrow(vcf)), function(i) {
ref_base <- as.character(rowRanges(vcf)$REF[i])
alt_bases <- as.character(rowRanges(vcf)$ALT[i])
ref_depth <- switch(ref_base, A=au[i,1], C=cu[i,1], G=gu[i,1], T=tu[i,1])
alt_depths <- sapply(alt_bases, function(alt) switch(alt, A=au[i,1], C=cu[i,1], G=gu[i,1], T=tu[i,1]))
c(ref_depth, alt_depths)
})
This approach accommodates multiple alleles. If your VCFs vary widely, consider the vcfR package for additional flexibility in parsing FORMAT fields via extract.gt(vcf, element="AD").
Kevin
bcftools query -f '[%CHROM %POS %REF %ALT %SAMPLE %GT %AD\n]'