Question: MuSiC Analysis Workflow
4.8 years ago by
Pascal250
European Union
Pascal250 wrote:

Hello there,

I am pretty new to the cancer field and am trying to run MuSiC from WashU to detect significantly mutated genes in our exome-seq samples. Since I am not sure I am doing the right things, I would appreciate it if someone could take a look at the tools I am using. There is probably a lot of room for improvement. Here are the steps I am running to get significantly mutated genes:

  1. FastQ files
  2. BWA map to reference genome
  3. Mark duplicates with Picard tools
  4. Use GATK for indel realignment and base quality recalibration
  5. samtools mpileup for each normal and tumor sample
  6. Somatic variation calling. I use VarScan somatic for this.
  7. snpEff to annotate the mutations and their consequences
  8. Convert VCF file from previous step to MAF file and filter for consequences that very likely change the gene function (e.g. missense)
  9. Combine filtered MAF files from all samples
  10. Run MuSiC bmr calc-covg
  11. Run MuSiC bmr calc-bmr
  12. Run MuSiC smg
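For readers following along, steps 10 to 12 could be chained roughly as in the sketch below. It only builds the three command lines and does not run them. The flag names (`--roi-file`, `--reference-sequence`, `--bam-list`, `--maf-file`, `--output-dir`, `--gene-mr-file`, `--output-file`) and the `gene_mrs` file name are written as recalled from the MuSiC documentation and should be checked against `genome music bmr calc-covg --help` and friends for the installed version. The helper name `music_commands`, its parameters, and the `smgs` output name are hypothetical.

```python
from pathlib import Path


def music_commands(roi_file, reference, bam_list, maf_file, output_dir,
                   separate_truncations=False):
    """Build (but do not run) the command lines for MuSiC steps 10 to 12.

    Flag names are assumptions taken from memory of the MuSiC docs;
    verify them with `--help` before use.
    """
    out = Path(output_dir)
    shared = [
        "--roi-file", str(roi_file),
        "--reference-sequence", str(reference),
        "--bam-list", str(bam_list),
        "--output-dir", str(out),
    ]
    # Step 10: per-gene coverage across all tumor-normal pairs.
    calc_covg = ["genome", "music", "bmr", "calc-covg"] + shared
    # Step 11: background mutation rate, using the combined MAF.
    calc_bmr = (["genome", "music", "bmr", "calc-bmr"] + shared
                + ["--maf-file", str(maf_file)])
    if separate_truncations:
        calc_bmr.append("--separate-truncations")
    # Step 12: SMG test on the per-gene mutation rates written by calc-bmr
    # (assumed here to be a file named "gene_mrs" in the output directory).
    smg = [
        "genome", "music", "smg",
        "--gene-mr-file", str(out / "gene_mrs"),
        "--output-file", str(out / "smgs"),
    ]
    return [calc_covg, calc_bmr, smg]


if __name__ == "__main__":
    for cmd in music_commands("rois.tsv", "ref.fa", "bam_list.tsv",
                              "combined.maf", "music_out"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the paths and flags have been confirmed.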

Steps 8 and 9 in particular seem tricky. I am not sure whether I am missing some straightforward way to get from called mutations to MAF files.

Thanks

exome-sequencing music snp cancer • 2.7k views
ADD COMMENTlink modified 4.8 years ago by Cyriac Kandoth5.2k • written 4.8 years ago by Pascal250
4.8 years ago by
Cyriac Kandoth5.2k
Memorial Sloan Kettering, New York, USA
Cyriac Kandoth5.2k wrote:

Looks good. In step 8, do not "filter for consequences that very likely change the gene function". That's too stringent, and you might miss something novel or non-coding. Rather, reduce the noise from false-positive variants using tools like this. MuSiC's calc-bmr step will exclude Silent (synonymous) SNVs by default. Steps 7 and 8 can be solved using this script. And step 9 shouldn't be tricky... simply concatenate the MAFs.
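As a rough illustration of "simply concatenate the MAFs", the sketch below merges per-sample MAF files while writing the column header only once. It assumes each file is tab-delimited, that `#` lines are comments, and that the header row starts with `Hugo_Symbol` (the first column in the MAF specification). The function name `concat_mafs` is hypothetical and is not part of MuSiC.

```python
def concat_mafs(maf_paths, out_path):
    """Concatenate MAF files into out_path, keeping a single header row.

    Returns the number of variant rows written.
    """
    header = None
    n_rows = 0
    with open(out_path, "w") as out:
        for path in maf_paths:
            with open(path) as maf:
                for line in maf:
                    line = line.rstrip("\r\n")
                    # Skip blank lines and '#' comment lines such as '#version 2.4'.
                    if not line or line.startswith("#"):
                        continue
                    if line.startswith("Hugo_Symbol"):
                        if header is None:
                            # First header seen: write it once.
                            header = line
                            out.write(line + "\n")
                        elif line != header:
                            # Refuse to merge files whose columns differ.
                            raise ValueError(f"{path}: header differs from the first MAF")
                        continue
                    if header is None:
                        raise ValueError(f"{path}: data row found before a header")
                    out.write(line + "\n")
                    n_rows += 1
    return n_rows
```

Failing on mismatched headers is a deliberate choice here: silently stacking files with different column orders would give MuSiC a corrupt MAF.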

If any of the resulting SMGs (significantly mutated genes) don't make sense, then take a closer look at their variants. This is a good way to weed out recurrent false-positives - usually germline calls that are incorrectly called somatic for reasons like amplification bias, or artifacts in the reference sequence like misplaced paralogs. You can also try calc-bmr with an option called --separate-truncations, which prioritizes truncating variants in the math. "Truncations" include frame-shift, nonsense, and splice-site mutations.
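One way to "take a closer look" at a suspicious SMG is to list the exact variants in that gene that recur across samples, since an identical change seen in many tumors is a candidate recurrent artifact (or a genuine hotspot). The sketch below is a minimal version of that check. It assumes a tab-delimited MAF with the standard columns `Hugo_Symbol`, `Chromosome`, `Start_Position`, `Reference_Allele`, `Tumor_Seq_Allele2`, `Variant_Classification`, and `Tumor_Sample_Barcode`. The function name `recurrent_variants` and the `min_samples` threshold are hypothetical.

```python
import csv
from collections import defaultdict


def recurrent_variants(maf_path, gene, min_samples=2):
    """Return variants in `gene` seen in at least `min_samples` samples.

    Each result is (variant_key, sorted_sample_list), most recurrent first.
    """
    samples_by_variant = defaultdict(set)
    with open(maf_path, newline="") as handle:
        # Drop '#' comment lines so DictReader sees the header row first.
        data_lines = (line for line in handle if not line.startswith("#"))
        for row in csv.DictReader(data_lines, delimiter="\t"):
            if row["Hugo_Symbol"] != gene:
                continue
            key = (
                row["Chromosome"],
                row["Start_Position"],
                row["Reference_Allele"],
                row["Tumor_Seq_Allele2"],
                row["Variant_Classification"],
            )
            samples_by_variant[key].add(row["Tumor_Sample_Barcode"])
    hits = [
        (key, sorted(samples))
        for key, samples in samples_by_variant.items()
        if len(samples) >= min_samples
    ]
    hits.sort(key=lambda item: len(item[1]), reverse=True)
    return hits
```

The output is only a shortlist for manual review, for example in a genome browser alongside the matched normal; recurrence alone does not say whether a call is an artifact.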

ADD COMMENTlink written 4.8 years ago by Cyriac Kandoth5.2k

Hi Cyriac,

Sorry for asking this as a comment. I have data from 50 patients, all from targeted capture of around 120 genes. My question is: is it OK to use MuSiC for targeted capture data? I guess it would be biased to study SMGs from targeted capture; I just wanted an opinion on this.

Thank you.

ADD REPLYlink written 4.4 years ago by poisonAlien2.6k

Yea, that's totally fine. MuSiC's SMG test was meant to shortlist genes significantly altered in exome-seq, so that you could then target them for capture on larger cohorts. But when your ROI file (regions of interest) lists only about 120 genes, then the SMG test will at least help you rank them in order of significance.

ADD REPLYlink written 4.4 years ago by Cyriac Kandoth5.2k

Hi Cyriac,

I have mutation calls on both a set of matched tumor-normal pairs and a set of tumor-only samples (called using a panel-of-normals approach). I doubt that music2 can be used to assess mutational significance for tumor-only samples, but if you have any suggestions for doing so, or for comparable tools such as oncodriveFM, that would be a great help.

Thanks, Samir

ADD REPLYlink modified 5 months ago • written 5 months ago by Samir130