Does Anyone Have A Working Example Of How To Run The Broad's ABSOLUTE On Exome Sequencing Data?
8.0 years ago
Christian ★ 3.0k

It appears to me that ABSOLUTE from the Broad can be run using coverage and allelic-frequency data obtained from whole-exome sequencing.

However, from the documentation provided by the Broad (http://www.broadinstitute.org/cancer/cga/absolute_run), it is not clear to me how this can be accomplished.

Can anyone provide a working code example, ideally with example input data?

cancer exome-sequencing copynumber • 12k views
ADD COMMENT

I just downloaded the package and could not find documentation on this either. It may be worth contacting the package maintainer, Jeff Gentry (jgentry@broadinstitute.org), as listed in the description file.

ADD REPLY

Presumably this is too late to be helpful, but in case anyone else comes across this: I haven't had anything to do with ABSOLUTE for a very long time. You'll probably want to contact Scott Carter (scarter@broadinstitute.org) with any issues.

ADD REPLY

Hi Christian, were you eventually successful in running ABSOLUTE?

ADD REPLY
Haven't tried further after the mixed answers here, but still interested in a solution.
ADD REPLY

Has anyone tried this version of CapSeg?

https://github.com/aaronmck/CapSeg

ADD REPLY

It seems there are no instructions for running CapSeg.

ADD REPLY

There are no decent instructions for running most software developed by the Broad Institute. They should really read the edgeR vignette sometime, to see how it's done.

ADD REPLY

You mean the software which is incomplete and hasn't been maintained in the past 3 years? No.

ADD REPLY
7.9 years ago

See "How To Use Maf Files In The Absolute Software" for how to create and format your MAF file (add two columns, t_ref_count and t_alt_count, listing the read counts for the REF and ALT alleles). Here is a sample MAF.
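As a rough illustration of adding those two columns, here is a minimal Python sketch. It assumes the read counts come from the AD field of the tumor genotype in a VCF (REF depth first, then ALT depth). The helper names `allele_counts_from_ad` and `add_count_columns` are hypothetical, and the gene, position, and depths are made-up values, not real data.

```python
def allele_counts_from_ad(format_field, sample_field):
    """Return (ref_count, alt_count) from a VCF FORMAT string and one sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    ad = values[keys.index("AD")].split(",")
    # AD lists the REF depth first, then one depth per ALT allele.
    # This sketch only uses the first ALT allele.
    return int(ad[0]), int(ad[1])


def add_count_columns(maf_header, maf_row, format_field, tumor_sample_field):
    """Append t_ref_count and t_alt_count to one MAF header/row pair."""
    ref_count, alt_count = allele_counts_from_ad(format_field, tumor_sample_field)
    new_header = maf_header + ["t_ref_count", "t_alt_count"]
    new_row = maf_row + [str(ref_count), str(alt_count)]
    return new_header, new_row


# Made-up example record, for illustration only.
header, row = add_count_columns(
    ["Hugo_Symbol", "Chromosome", "Start_position"],
    ["GENE_A", "1", "12345"],
    "GT:AD:DP",
    "0/1:48,32:80",
)
print("\t".join(header))
print("\t".join(row))
```

Multi-allelic sites would need one MAF row per ALT allele, which this sketch does not handle.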

To generate the segmentation file with copy-number variants, you can use a CNV caller that compares tumor coverage to normal coverage to infer relative somatic copy-number events, for example VarScan's copynumber command.

ADD COMMENT

@Cyriac I modified your vcf2maf script to add these and some more columns. The VCF file needs to be annotated with snpEff, and further has to have the AD tag set for both tumor and normal genotypes. https://github.com/dakl/vcf2maf

ADD REPLY

Thanks. I've now added tons of features to vcf2maf, including new biotype/effect priorities, support for multi-allelic sites, genotypes, read-depths, etc. Detailed changelogs are in versioned releases down here: https://github.com/ckandoth/vcf2maf/releases

ADD REPLY

Hi Cyriac,

Just a quick question: after segmenting the VarScan2 copynumber output with DNAcopy, do we need to convert the log2 segment means to linear values before passing them to RunAbsolute? Thanks much.

ADD REPLY

Though it may seem like it, I've never actually used Absolute. :) You could skim through the code of RunAbsolute to find out whether it needs log-ratio or CN values.

ADD REPLY

Just checked it. Turns out we don't need to convert; RunAbsolute converts on the fly:

## Convert from base 2 log
copy_num = 2^(segs_tab[, "Segment_Mean"] )
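For anyone checking their numbers by hand, the same transform can be sketched in Python. This only mirrors the arithmetic of the R line quoted above (2 raised to the log2 segment mean). The function name `log2_to_copy_ratio` and the input values are illustrative, not part of ABSOLUTE.

```python
def log2_to_copy_ratio(segment_means):
    """Convert log2 ratios to linear copy ratios, i.e. 2 ** log2_ratio."""
    return [2.0 ** value for value in segment_means]


# Illustrative log2 segment means: half, equal, and double the reference coverage.
log2_ratios = [-1.0, 0.0, 1.0]
print(log2_to_copy_ratio(log2_ratios))  # [0.5, 1.0, 2.0]
```

So a segment file with log2 ratios, as produced by DNAcopy on VarScan2 output, can be passed in unchanged.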

ADD REPLY
7.8 years ago

You don't need the MAF file to run ABSOLUTE. If you do have a MAF, it will be used together with the detected copy number and cellularity to compute the cancer cell fraction of the mutations, which is useful for inferring sample heterogeneity.
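To make the relationship between allele counts, copy number, and cellularity concrete, here is a minimal Python sketch of a commonly used point estimate of the cancer cell fraction (CCF). It is not ABSOLUTE's own probabilistic model. It assumes the expected ALT allele fraction is purity × multiplicity × CCF / (purity × tumor copy number + 2 × (1 − purity)) and solves that for CCF. The function name and the example numbers are hypothetical.

```python
def cancer_cell_fraction(alt_count, ref_count, purity, total_cn,
                         multiplicity=1, normal_cn=2):
    """Point estimate of the fraction of cancer cells carrying a mutation.

    Simplified model (an assumption, not ABSOLUTE's implementation):
    VAF = purity * multiplicity * CCF / (purity * total_cn + normal_cn * (1 - purity))
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    depth = alt_count + ref_count
    if depth == 0:
        raise ValueError("no reads cover this site")
    vaf = alt_count / depth
    # Average number of allele copies per cell in the tumor/normal mixture.
    mean_copies = purity * total_cn + normal_cn * (1.0 - purity)
    ccf = vaf * mean_copies / (purity * multiplicity)
    # Sampling noise can push the estimate above 1, so cap it.
    return min(ccf, 1.0)


# Made-up example: 15 ALT reads out of 100, 60% purity, tumor copy number 2.
print(round(cancer_cell_fraction(15, 85, purity=0.6, total_cn=2), 3))
```

A CCF near 1 suggests a clonal mutation, while clearly lower values point to a subclone, under the assumptions above.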

I found the current state of ABSOLUTE a bit lacking for exome data: it does not provide allele-specific copy number, and its results do not correlate well with SNP array results for the same samples.

I know that a newer "component" of ABSOLUTE, called CapSeg, is supposed to become available shortly. It will take care of the segmentation in an allele-specific way, similar to how ABSOLUTE handles SNP array data. However, I've only seen it cited in papers, with no real sign of a release.

So far the best approach is to use VarScan2, segment the resulting binned depth ratio with DNAcopy or copynumber from Bioconductor, and then feed the segments to ABSOLUTE.
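The last step of that workflow, turning DNAcopy segments into a segment file for ABSOLUTE, could look roughly like the Python sketch below. The input columns (ID, chrom, loc.start, loc.end, num.mark, seg.mean) are the standard DNAcopy segment output. The output header other than Segment_Mean (Chromosome, Start, End, Num_Probes) is an assumption about what ABSOLUTE expects and should be checked against its documentation. The example rows are made up.

```python
import csv
import io

# DNAcopy column -> column name assumed for the ABSOLUTE segment file.
COLUMN_MAP = {
    "chrom": "Chromosome",
    "loc.start": "Start",
    "loc.end": "End",
    "num.mark": "Num_Probes",
    "seg.mean": "Segment_Mean",
}


def dnacopy_to_seg(dnacopy_text):
    """Rename and reorder DNAcopy segment columns into a tab-separated seg table."""
    reader = csv.DictReader(io.StringIO(dnacopy_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMN_MAP.values())
    for row in reader:
        # Segment means are left as log2 ratios.
        writer.writerow([row[src] for src in COLUMN_MAP])
    return out.getvalue()


# Made-up DNAcopy output, for illustration only.
example = (
    "ID\tchrom\tloc.start\tloc.end\tnum.mark\tseg.mean\n"
    "tumor1\t1\t10000\t5000000\t120\t-0.45\n"
    "tumor1\t1\t5000001\t9000000\t80\t0.30\n"
)
seg_text = dnacopy_to_seg(example)
print(seg_text)
```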

You could also try a different software package: Sequenza. It returns allele-specific copy number and estimates cellularity and ploidy.

ADD COMMENT

I ran VarScan2 copyCaller and segmented the results with DNAcopy. Sometimes this produces over 3000 segments, whereas the default limit in ABSOLUTE is 1500. Do you filter out segments with a low number of markers, merge segments with the VarScan helper script mergeSegments.pl, or simply increase the threshold?
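For reference, the first of those options (dropping segments supported by few markers) can be sketched in Python as follows. This is only an illustration of the filtering idea, not a recommendation. The cutoff `min_markers=10` is a hypothetical value, the 1500 limit is the default mentioned above, and the column names and segments are made up.

```python
def filter_segments(segments, min_markers=10, max_segments=1500):
    """Drop segments supported by few markers and check the segment-count limit.

    Returns the kept segments and whether their count is within max_segments.
    min_markers is a hypothetical cutoff chosen for illustration.
    """
    kept = [seg for seg in segments if seg["Num_Probes"] >= min_markers]
    return kept, len(kept) <= max_segments


# Made-up segments, for illustration only.
segments = [
    {"Chromosome": "1", "Start": 10000, "End": 250000, "Num_Probes": 3, "Segment_Mean": 0.8},
    {"Chromosome": "1", "Start": 250001, "End": 5000000, "Num_Probes": 45, "Segment_Mean": -0.4},
    {"Chromosome": "2", "Start": 10000, "End": 9000000, "Num_Probes": 120, "Segment_Mean": 0.1},
]
kept, within_limit = filter_segments(segments)
print(len(kept), within_limit)
```

Filtering leaves gaps in genome coverage, so merging adjacent segments may be preferable when the short segments are mostly noise.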

ADD REPLY
8.0 years ago
gsaksena ▴ 10

The version of ABSOLUTE that runs on whole exome data has not yet been published or released.

ADD COMMENT
7.9 years ago

If you are trying to call copy number variants in exome or targeted-gene sequencing, I have found that CoNIFER (or CoNIFER + DNAcopy) works well, and it has pretty clear documentation:

http://conifer.sourceforge.net/

Based upon the response from gsaksena, I am assuming this may be a more practical option.

ADD COMMENT
7.2 years ago
Eric T. ★ 2.7k

For future reference, the program THetA is designed to infer tumor purity, subclone cellularity, and the absolute copy number of segments in the tumor cell clonal/subclonal populations. The program itself is simple to use: The inputs are the tumor BAM, normal BAM, and copy number segments inferred by a third-party program of your choice.

ADD COMMENT
Does it work with exome-seq data?
ADD REPLY

Yes. You can also use CNVkit as a shortcut through the process to call CNV segments, generate the THetA2 input file, and pull the subclone-specific CNV calls back into a reusable format for further analysis.

But note that there are more recent programs such as PyClone and BubbleTree that appear to perform better on benchmarks now. Still, THetA2 can get you a usable result fairly quickly.

ADD REPLY
7.7 years ago
ivivek_ngs ★ 5.1k

Dear All,

I have exome data for my tumor and matched normal samples. I also have a list of variants with the ALT allele frequency for both the tumor and normal samples. I want to use ABSOLUTE just to estimate the tumor purity of my samples, and then use that information for somatic variant calling with VarScan. Can anyone suggest a working example? It is not clear from the Broad webpage how to use it, and I can't find an example anywhere. I would be glad if someone could help.

ADD COMMENT
12 months ago
dv.sobral • 0

I think the GenePattern help page for ABSOLUTE can help: https://www.genepattern.org/modules/docs/ABSOLUTE

ADD COMMENT
