Question: How to get the minor allele frequency of a list of variants, for example NM_004401 (c.T122>G), from the 1000 Genomes Project
sudhir07jnu (United States) wrote, 4.4 years ago:

Hi friends!

Can somebody help me get the MAF (minor allele frequency) of variants in different genes from the 1000 Genomes Project data? The list of variants is in the following format:

Refseq_mRNA_Id    cDNA_Change
NM_004401    c.967C>T
NM_005401    c.919C>T
NM_004401    c.736A>G
NM_004401    c.560_563delAGTC
NM_002201    c.557G>A
NM_004401    c.385A>G
NM_004401    c.199G>T
NM_004401    c.148C>G
NM_022768    c.65T>C

I need to get the MAF for all these variants.

Thanks in advance !!

Cyriac Kandoth (Memorial Sloan Kettering, New York, USA) wrote, 4.4 years ago:

If you replace those RefSeq transcript IDs with HUGO gene names or Ensembl transcript IDs, you can create an HGVS-formatted variant list for Ensembl's Variant Effect Predictor (VEP), which reports minor allele frequencies from the 1000 Genomes Project and NHLBI ESP. Here are 7 of your 9 variants in HGVS format with HUGO symbols:

DFFA:c.967C>T
DFFA:c.736A>G
DFFA:c.560_563delAGTC
DFFA:c.385A>G
DFFA:c.199G>T
DFFA:c.148C>G
RBM15:c.65T>C

Visit the VEP web interface, paste the data above into the big empty box, and hit Run. MAFs will be reported where available. The following 2 of your variants will fail in VEP because their reference alleles don't match what's in the actual transcript, so there's something wrong there:

PTPN14:c.919C>T
ISG20:c.557G>A
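
For a longer list, the same lookup could be scripted instead of pasted into the web form. The following is only a minimal sketch, assuming Ensembl's public REST server and its `vep/:species/hgvs/:hgvs_notation` endpoint; the field names `colocated_variants`, `minor_allele` and `minor_allele_freq` are assumptions about the JSON that VEP returns and may differ between releases, so check them against a real response. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server (current human assembly).
SERVER = "https://rest.ensembl.org"


def vep_hgvs_url(hgvs, species="human"):
    """Build the VEP request URL for one HGVS string (percent-encodes '>')."""
    encoded = urllib.parse.quote(hgvs, safe=":")
    return f"{SERVER}/vep/{species}/hgvs/{encoded}?content-type=application/json"


def extract_mafs(vep_records):
    """Collect (known variant ID, minor allele, MAF) from parsed VEP JSON.

    Assumption: co-located known variants carry 'minor_allele' and
    'minor_allele_freq' keys; entries without them are skipped.
    """
    rows = []
    for record in vep_records:
        for known in record.get("colocated_variants", []):
            if "minor_allele_freq" in known:
                rows.append((known.get("id"), known.get("minor_allele"),
                             known["minor_allele_freq"]))
    return rows


def fetch_mafs(hgvs):
    """Query VEP for one variant over the network and return its MAF rows."""
    with urllib.request.urlopen(vep_hgvs_url(hgvs), timeout=30) as response:
        return extract_mafs(json.load(response))
```

Calling `fetch_mafs("DFFA:c.967C>T")` for each entry in the list would return the frequencies VEP knows about, or an empty list when none are reported. A variant that VEP rejects, such as the two with mismatching reference alleles, would presumably surface as a `urllib.error.HTTPError`.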

In general, it's always best to use genomic coordinates for this kind of annotation, but they might not always be provided in publications. The bare minimum you need is chromosome, position, reference_allele, and variant_allele. If you have a VCF (Variant Call Format) file, then try vcf2maf, which runs VEP on all variants to generate a Mutation Annotation Format (MAF) file. This is a tab-delimited format that will include the minor allele frequency data and lots of additional useful annotations. Note that the "maf" in vcf2maf refers to Mutation Annotation Format, not minor allele frequency.
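
To illustrate how little is needed, here is a rough sketch that turns those four fields into a minimal VCF that could then be handed to an annotation tool. The function name `to_vcf` is hypothetical, and the coordinates in the usage note are placeholders, not the real positions of any variant in this thread.

```python
VCF_HEADER = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]


def to_vcf(variants):
    """Render (chromosome, position, ref_allele, alt_allele) tuples as VCF text.

    Only the four required fields are filled in; ID, QUAL, FILTER and INFO
    are written as '.', the VCF convention for a missing value.
    """
    lines = list(VCF_HEADER)
    for chrom, pos, ref, alt in variants:
        if int(pos) < 1:
            raise ValueError(f"VCF positions are 1-based, got {pos}")
        lines.append(f"{chrom}\t{int(pos)}\t.\t{ref}\t{alt}\t.\t.\t.")
    return "\n".join(lines) + "\n"
```

For example, `to_vcf([("1", 1000, "C", "T")])` (a made-up variant) yields the two header lines followed by one tab-delimited record. Note that VCF expects indels to include the preceding reference base, so a deletion such as c.560_563delAGTC cannot be written from the deleted bases alone.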

Emily_Ensembl wrote, 4.4 years ago:

You can use RefSeq IDs for HGVS input into the VEP.
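
Building on that comment, the two-column list from the question could be converted to HGVS strings without looking up gene symbols at all. This is a small sketch with a hypothetical helper name, `to_hgvs`; whether VEP accepts an unversioned accession such as NM_004401, rather than something like NM_004401.2, is an assumption to verify.

```python
def to_hgvs(table_text):
    """Join 'RefSeq_ID  cDNA_change' rows into 'RefSeq_ID:cDNA_change' strings.

    Rows that do not have exactly two whitespace-separated fields, or whose
    second field is not a coding-DNA change ('c.' prefix), are skipped.
    That also drops the header row.
    """
    hgvs = []
    for line in table_text.strip().splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[1].startswith("c."):
            continue
        hgvs.append(f"{fields[0]}:{fields[1]}")
    return hgvs


# Two rows copied from the list in the question.
example = """Refseq_mRNA_Id    cDNA_Change
NM_004401    c.967C>T
NM_022768    c.65T>C"""
print("\n".join(to_hgvs(example)))
```

The printed lines can be pasted into the VEP input box in place of the gene-symbol versions.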

sudhir07jnu wrote, 4.4 years ago:

Thank you so much, Cyriac Kandoth. This is of great help to me.