Compare allele frequency of SNPs in my data to their allele frequency in different human populations - ExAC vs 1000 Genomes vs Other?
3.4 years ago
gaelgarcia05 ▴ 260

Hello,

I have a list of SNPs (in the form of a VCF) found in our [very] targeted sequencing dataset of ~15,000 individuals.

I am looking to compare the MAFs of these SNPs within this 'population' (our dataset) to their MAFs across different populations, such as the populations defined in ExAC or the 1000 Genomes Project.

Is there an effective way that you would recommend to do this?

These samples were processed using GRCh38. I believe the ExAC variants have coordinates based on the previous build, GRCh37 (please correct me if I'm wrong), so I'm unsure about using the MAFs from the ExAC data.

The output table I have in mind would look something like this:

snpID  MAF_mysamples  MAF_european  MAF_finnish  MAF_african  MAF_se_asian  MAF_asian


As always, your input is greatly appreciated.

snp ExAC 1000Genomes MAF sequencing • 1.6k views
2

Hello,

Ensembl provide lifted-over versions of the 1000 Genomes, ExAC and gnomAD exome data for hg38. Have a look at this FTP directory.

You could use these to annotate your VCF and then extract all the information in whatever form you like. What should your final output look like?
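As a rough illustration of the annotation step, here is a minimal Python sketch that matches variants on (chrom, pos, ref, alt) and collects per-population frequencies into one row per SNP. The population labels, the example records and the `build_table` helper are all made up for illustration; in practice the population frequencies would be read from the INFO fields of the lifted-over VCFs, and both sides must be on the same build.

```python
# Hypothetical sketch: join cohort MAFs with population allele frequencies.
# Keys are (chrom, pos, ref, alt), all on the same genome build (GRCh38 here).

def build_table(cohort, populations, pop_names):
    """Return rows of [snp_id, cohort_maf, pop1_af, pop2_af, ...].

    cohort:      {key: (snp_id, maf)}
    populations: {key: {pop_name: af}}
    Missing population values are reported as "NA".
    """
    rows = []
    for key in sorted(cohort):
        snp_id, maf = cohort[key]
        pop_afs = populations.get(key, {})
        rows.append([snp_id, maf] + [pop_afs.get(p, "NA") for p in pop_names])
    return rows

# Made-up example data, not real frequencies.
cohort = {
    ("1", 1000, "A", "G"): ("snp_a", 0.12),
    ("2", 2000, "C", "T"): ("snp_b", 0.03),
}
populations = {
    ("1", 1000, "A", "G"): {"european": 0.10, "finnish": 0.15, "african": 0.30},
}
pop_names = ["european", "finnish", "african"]

table = build_table(cohort, populations, pop_names)
print("\t".join(["snpID", "MAF_mysamples"] + ["MAF_" + p for p in pop_names]))
for row in table:
    print("\t".join(str(v) for v in row))
```

Matching on REF and ALT as well as position matters here: a population frequency is reported for a specific ALT allele, which is not necessarily the minor allele in your own cohort.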

fin swimmer

0

Thanks @finswimmer - just updated my post to clarify what I'm looking for as output.

The output table I have in mind would look something like this:

snpID  MAF_mysamples  MAF_european  MAF_finnish  MAF_african  MAF_se_asian  MAF_asian

2

Just use ANNOVAR, as it outputs allele frequencies for all of these populations, and it supports hg38. It even includes a script (convert2annovar.pl) that converts a VCF to the input format required by ANNOVAR.

Regarding allele frequencies in your own sample cohort, you can calculate the AF (allele frequency) INFO tag from AC and AN and write it directly into your VCF using BCFtools; see: How to use bcftools to calculate AF INFO field from AC and AN in VCF?
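The arithmetic behind that tag is simply AF = AC / AN, computed once per ALT allele. A minimal Python sketch of it, assuming the INFO string already carries AC and AN (the `af_from_info` helper and the example INFO string are made up for illustration, not part of BCFtools):

```python
def af_from_info(info):
    """Compute per-ALT allele frequencies (AC / AN) from a VCF INFO string."""
    # Keep only key=value entries; flag-style tags without "=" are ignored.
    fields = dict(
        item.split("=", 1) for item in info.split(";") if "=" in item
    )
    an = int(fields["AN"])
    if an == 0:
        return []  # no called alleles at this site
    # AC holds one count per ALT allele, comma-separated.
    return [int(ac) / an for ac in fields["AC"].split(",")]

# Made-up record with two ALT alleles.
afs = af_from_info("AC=30,6;AN=300;DP=1200")
print(afs)
```

Note that AN counts called alleles only, so sites with many missing genotypes get a frequency based on fewer chromosomes than the full cohort.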

To then extract the AF in an 'easy' format, use BCFtools query, something like: A: Extracting certain columns from VCF file
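If you would rather do the extraction in a script, the same idea can be sketched in plain Python: read the ID and INFO columns, pull out AF, and fold it to a minor allele frequency with min(AF, 1 - AF). This is only an illustration under simple assumptions (uncompressed, tab-separated VCF text, an AF tag already present, biallelic sites only); the `extract_maf` helper and the example records are made up.

```python
def extract_maf(vcf_lines):
    """Yield (snp_id, maf) for each biallelic record that has an AF tag."""
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        cols = line.rstrip("\n").split("\t")
        snp_id, alt, info = cols[2], cols[4], cols[7]
        if "," in alt:
            continue  # multiallelic sites need per-allele handling
        tags = dict(t.split("=", 1) for t in info.split(";") if "=" in t)
        if "AF" not in tags:
            continue
        af = float(tags["AF"])
        yield snp_id, min(af, 1.0 - af)

# Made-up records, not real data.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\tsnp_a\tA\tG\t.\tPASS\tAC=30;AN=300;AF=0.1\n",
    "2\t2000\tsnp_b\tC\tT\t.\tPASS\tAC=270;AN=300;AF=0.9\n",
]
result = list(extract_maf(example))
for snp_id, maf in result:
    print(f"{snp_id}\t{maf:.4f}")
```

Folding to the minor allele is fine for a within-cohort summary, but when comparing against population data it is safer to compare frequencies of the same ALT allele rather than two independently folded MAFs.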