Question: Error when running Genome MuSiC: no smg genes
7 months ago, C Wang wrote:

Hi,

I am trying to use Genome MuSiC to find significantly mutated genes (SMGs) in tumor-normal paired WES data. The workflow is:

1 Use VarScan to call variants in VCF format

2 Use vcf2maf to convert the VCF output to MAF

3 bmr calc-covg

4 bmr calc-bmr

5 smg

Steps 1-4 seem to run properly, but the last step fails with an error:


*Result/Cat_Group/Music/APPF008/Result/6/roi_covgs/APPF008.covg generated and stored.

APPF008.covg generated and stored to*Result/Cat_Group/Music/APPF008/Result/6/roi_covgs.

STATUS: Running VEP and writing to: /home/suozhen/data/RNA-*Result/Cat_Group/Music/APPF008/Input/VCF/6.APPF008.trans.vep.vcf

2017-02-24 15:15:52 - Read existing cache info

2017-02-24 15:15:52 - Starting...

2017-02-24 15:15:54 - Read 5000 variants into buffer

2017-02-24 15:15:54 - Calculating consequences

2017-02-24 15:16:00 - Writing output

2017-02-24 15:16:00 - Processed 5000 total variants (625 vars/sec, 625 vars/sec total)

2017-02-24 15:16:03 - Read 5000 variants into buffer

2017-02-24 15:16:03 - Calculating consequences

2017-02-24 15:16:09 - Writing output

2017-02-24 15:16:09 - Processed 10000 total variants (556 vars/sec, 588 vars/sec total)

2017-02-24 15:16:11 - Read 5000 variants into buffer

2017-02-24 15:16:11 - Calculating consequences

2017-02-24 15:16:17 - Writing output

2017-02-24 15:16:17 - Processed 15000 total variants (625 vars/sec, 600 vars/sec total)

2017-02-24 15:16:19 - Read 5000 variants into buffer

2017-02-24 15:16:19 - Calculating consequences

2017-02-24 15:16:26 - Writing output

2017-02-24 15:16:26 - Processed 20000 total variants (556 vars/sec, 588 vars/sec total)

2017-02-24 15:16:27 - Read 2156 variants into buffer

2017-02-24 15:16:27 - Calculating consequences

2017-02-24 15:16:32 - Writing output

2017-02-24 15:16:32 - Processed 22156 total variants (359 vars/sec, 554 vars/sec total)

2017-02-24 15:16:32 - Finished!

Loading per-sample coverages stored in *Result/Cat_Group/Music/APPF008/Result/6/total_covgs

Loading per-gene coverage files stored under *Result/Cat_Group/Music/APPF008/Result/6/gene_covgs/

Running 'joinx ref-stats' to read reference FASTA and identify SNVs at AT, CG, CpG sites

Parsing MAF file to classify mutations

Finished Parsing the MAF file to classify mutations

Skipped 1054 mutation(s) that belong to unrecognized samples

Error in xy.coords(x, y, xlabel, ylabel, log) :

'x' and 'y' lengths differ

Calls: plot -> plot.default -> xy.coords

stop running

Error in xy.coords(x, y, xlabel, ylabel, log) :

'x' and 'y' lengths differ

Calls: plot -> plot.default -> xy.coords

stop running


Nothing is written to the output file smgs_varscan_tumor, and the other output file, smgs_varscan_tumor_detailed, looks like this:


Gene      Indels  SNVs  Tot Muts  Covd Bps  Muts pMbp  P-value FCPT  P-value LRT  P-value CT  FDR FCPT  FDR LRT  FDR CT  Expression
CYP21A1P  0       0     0         1264      0.00       1             1            1           1         1        1       expressed
MAPK14    0       0     0         2632      0.00       1             1            1           1         1        1       expressed
TAF8      0       0     0         2421      0.00       1             1            1           1         1        1       expressed
TEAD3     0       0     0         3096      0.00       1             1            1           1         1        1       expressed
TRIM31    0       0     0         1725      0.00       1             1            1           1         1        1       expressed


It looks like the mutation information in the MAF file is not being read by genome music smg. Can someone please help? Thank you so much!!

----------------the maf header looks like:

version 2.4

Hugo_Symbol Entrez_Gene_Id Center NCBI_Build Chromosome Start_Position End_Position Strand Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 dbSNP_RS dbSNP_Val_Status Tumor_Sample_Barcode Matched_Norm_Sample_Barcode Match_Norm_Seq_Allele1 Match_Norm_Seq_Allele2 Tumor_Validation_Allele1 Tumor_Validation_Allele2 Match_Norm_Validation_Allele1 Match_Norm_Validation_Allele2 Verification_Status Validation_Status Mutation_Status Sequencing_Phase Sequence_Source Validation_Method Score BAM_File Sequencer Tumor_Sample_UUID Matched_Norm_Sample_UUID HGVSc HGVSp HGVSp_Short Transcript_ID Exon_Number t_depth t_ref_count t_alt_count n_depth n_ref_count n_alt_count all_effects Allele Gene Feature Feature_type Consequence cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation ALLELE_NUM DISTANCE STRAND_VEP SYMBOL SYMBOL_SOURCE HGNC_ID BIOTYPE CANONICAL CCDS ENSP SWISSPROT TREMBL UNIPARC RefSeq SIFT PolyPhen EXON INTRON DOMAINS GMAF AFR_MAF AMR_MAF ASN_MAF EAS_MAF EUR_MAF SAS_MAF AA_MAF EA_MAF CLIN_SIG SOMATIC PUBMED MOTIF_NAME MOTIF_POS HIGH_INF_POS MOTIF_SCORE_CHANGE IMPACT PICK VARIANT_CLASS TSL HGVS_OFFSET PHENO MINIMISED ExAC_AF ExAC_AF_AFR ExAC_AF_AMR ExAC_AF_EAS ExAC_AF_FIN ExAC_AF_NFE ExAC_AF_OTH ExAC_AF_SAS GENE_PHENO FILTER flanking_bps variant_id variant_qual ExAC_AF_Adj ExAC_AC_AN_Adj ExAC_AC_AN ExAC_AC_AN_AFR ExAC_AC_AN_AMR ExAC_AC_AN_EAS ExAC_AC_AN_FIN ExAC_AC_AN_NFE ExAC_AC_AN_OTH ExAC_AC_AN_SAS ExAC_FILTER

-----------ROI------------------------------------------

6 105929 106835 OR4F1P

6 292465 292642 DUSP22

------------music command-----------------------------

1

genome music bmr calc-covg --bam-list */6.APPF008.bam.list --output-dir */Result/6 --reference-sequence */6.fasta --roi-file */6.bed --gene-covg-dir */6

2

*/vcf2maf-master/vcf2maf.pl --input-vcf */6.APPF008.trans --output-maf */6.APPF008.trans.maf --tumor-id 'TUMOR' --normal-id 'NORMAL' --vcf-tumor-id 'TUMOR' --vcf-normal-id 'NORMAL' --vep-path */Music/data_vep --filter-vcf */Music/data_vep/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz --ref-fasta */Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz

3

*/Music/MuSiC2-master/bin/music2 bmr calc-bmr --bam-list */6.APPF008.bam.list -maf-file */6.APPF008.trans.maf --reference-sequence */6.fasta --roi-file */6.bed --output-dir */6

4

*/Music/MuSiC2-master/bin/music2 smg --gene-mr-file */Result/6/gene_mrs --output-file */Result/6/smgs_varscan_tumor

3 months ago, Chris Miller (Washington University in St. Louis, MO) wrote:

I don't know your exact issue, but this seems to be the key line in the output:

Skipped 1054 mutation(s) that belong to unrecognized samples

Check that your sample names all match up as expected.
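
One way to run that check is to compare the Tumor_Sample_Barcode values in the MAF against the sample names in the BAM list. The vcf2maf command in the question passes --tumor-id 'TUMOR', so every MAF row would carry the barcode TUMOR. If the BAM list names the sample differently, that could explain the skipped mutations. Below is a minimal Python sketch of the comparison. It assumes the BAM list is tab-delimited with the sample name in the first column, and the function names, inline demo data, and the APPF008 sample name are hypothetical, not taken from the real files.

```python
import csv


def maf_tumor_samples(maf_lines):
    """Return the distinct Tumor_Sample_Barcode values found in MAF lines."""
    # Drop blank lines and '#' comment lines (e.g. a version line) before the header.
    rows = (ln for ln in maf_lines if ln.strip() and not ln.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row["Tumor_Sample_Barcode"] for row in reader}


def bam_list_samples(bam_list_lines):
    """Return sample names from the first column of a BAM list.

    Assumption: tab-delimited lines of sample_name, normal_bam, tumor_bam.
    """
    return {ln.split("\t")[0].strip() for ln in bam_list_lines if ln.strip()}


def compare_samples(maf_lines, bam_list_lines):
    """Return (names only in the MAF, names only in the BAM list), both sorted."""
    maf = maf_tumor_samples(maf_lines)
    bam = bam_list_samples(bam_list_lines)
    return sorted(maf - bam), sorted(bam - maf)


# Hypothetical demo data, not copied from the poster's files.
demo_maf = (
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\n"
    "MAPK14\tTUMOR\n"
    "TAF8\tTUMOR\n"
)
demo_bam_list = "APPF008\t/path/to/normal.bam\t/path/to/tumor.bam\n"

only_in_maf, only_in_bam = compare_samples(
    demo_maf.splitlines(), demo_bam_list.splitlines()
)
print("Only in MAF:", only_in_maf)
print("Only in BAM list:", only_in_bam)
```

If either list is non-empty for the real files, making the names agree would be the first thing to try, for example by rerunning vcf2maf with a --tumor-id that matches the BAM list.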
