Aggregate multiple VCFs into a single MAF or generate a multi-sample MAF from a multi-sample VCF
4.5 years ago
Luca Beltrame ▴ 230

Note: This is different from the many vcf2maf questions posted around Biostars.

Like many others, I need to generate a MAF file for use in downstream applications like maftools or TRONCO. My final goal is to obtain a multi-sample MAF file like the ones shipped by TCGA.

However, I can't use vcf2maf directly, because it requires each VCF to contain at most 2 samples. I have either:

  • Several individual VCFs (1 per sample)
  • One single VCF with information from all samples (with the multiallelic loci decomposed)

The first case would fit vcf2maf, but there is no indication of how to merge the resulting MAF files into a single one. The second case is incompatible with vcf2maf because the VCF contains multiple samples.

The GDC Data Portal docs mention an aggregation workflow that starts from VCFs and ends in a MAF, but I've poked around in the GDC sources and couldn't find anything.

I know this is possible because TCGA does it, but are there any tools to perform this task?

sequencing file-formats maf conversion • 2.9k views

Have you figured it out?

12 months ago

See the thread "problems with MAF for MutSigCV (vcf2maf)" for code showing how to:

  • split a multi-sample VCF into per-sample VCFs using vcftools
  • combine single-sample MAFs into one MAF using unix cat/grep
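
For anyone who prefers to stay in one script, here is a minimal Python sketch of the same two steps. It is not the code from the linked thread (which uses vcftools and cat/grep), and the function names `split_vcf_by_sample` and `merge_mafs` are made up for illustration. It assumes uncompressed, tab-delimited files, sample names that are safe to use as file names, and MAFs whose column header line starts with `Hugo_Symbol`. The split keeps every record for every sample (including sites where that sample is homozygous reference) and does not recompute INFO fields.

```python
import os
from typing import Dict, Iterable, List


def split_vcf_by_sample(vcf_path: str, out_dir: str) -> Dict[str, str]:
    """Write one single-sample VCF per sample column; return {sample: path}."""
    os.makedirs(out_dir, exist_ok=True)
    meta: List[str] = []          # '##' meta lines, copied into every output
    samples: List[str] = []       # sample names from the #CHROM line
    paths: Dict[str, str] = {}
    handles = {}                  # sample name -> open output file
    try:
        with open(vcf_path) as vcf:
            for line in vcf:
                if not line.strip():
                    continue
                if line.startswith("##"):
                    meta.append(line)
                    continue
                fields = line.rstrip("\n").split("\t")
                if line.startswith("#CHROM"):
                    samples = fields[9:]  # columns 1-9 are fixed, the rest are samples
                    for name in samples:
                        paths[name] = os.path.join(out_dir, name + ".vcf")
                        handles[name] = open(paths[name], "w")
                        handles[name].writelines(meta)
                        handles[name].write("\t".join(fields[:9] + [name]) + "\n")
                    continue
                # Data line: keep the 9 fixed columns plus this sample's genotype column.
                for i, name in enumerate(samples):
                    handles[name].write("\t".join(fields[:9] + [fields[9 + i]]) + "\n")
    finally:
        for handle in handles.values():
            handle.close()
    return paths


def merge_mafs(maf_paths: Iterable[str], out_path: str) -> int:
    """Concatenate MAFs under a single column header; return the number of variant rows."""
    header = None
    rows = 0
    with open(out_path, "w") as out:
        for path in maf_paths:
            with open(path) as maf:
                for line in maf:
                    # Drop '#version ...' style comment lines and blank lines.
                    if line.startswith("#") or not line.strip():
                        continue
                    text = line.rstrip("\n")
                    if text.startswith("Hugo_Symbol\t"):
                        if header is None:
                            header = text
                            out.write(text + "\n")
                        elif text != header:
                            raise ValueError("column mismatch in " + path)
                        continue
                    out.write(text + "\n")
                    rows += 1
    return rows
```

The intended flow would be: split the multi-sample VCF, run vcf2maf on each per-sample VCF (that step is outside this sketch), then pass the resulting MAF paths to `merge_mafs`. Note that the merged file keeps only the column header, so any leading `#version` comment line is dropped; add it back if a downstream tool expects it.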

These features were not added to vcf2maf to avoid scope creep. vcf2maf was always intended to be used in a workflow where other popular VCF or MAF manipulating tools are available.


Thanks. Eventually I ended up reading vcf2maf's source code and reimplementing what I needed (not from a VCF, but from a database I store the variant data in) using the hints from there.

