Hi,
I'm working with exome sequencing data from very early tumours and would like to identify driver genes. I've come across the SomInaClust package, which seems well suited to my needs, but I'm having problems getting it to recognise my MAF files. So far, I've used maftools to read MAF files into R, and that has always worked without any issues. However, when I pass these MAF files to SomInaClust, it throws an error saying that it doesn't recognise the columns. I've included an example below, created by reading in the breast cancer MAF that comes with the SomInaClust package. This MAF works well with maftools functions but fails with SomInaClust.
I'm very new to R and I'm convinced that a tiny tweak will make this run, but I just cannot seem to make it work. How do I make SomInaClust recognise the columns of the input MAF? Any help is highly appreciated. Thanks a lot!
library(maftools)
library(SomInaClust)
brca <- read.maf(maf_brca) #example MAF from the SomInaClust package
#run an example command using maftools
brca_oncodriveClust <- oncodrive(brca, AACol = "amino_acid_change_WU", minMut = 5, pvalMethod = "zscore")
Estimating background scores from synonymous variants..
Not enough genes to build background. Using predefined values. (Mean = 0.279; SD = 0.13)
Estimating cluster scores from non-syn variants..
|==================================================| 100%
Comparing with background model and estimating p-values..
Done !
getFields(brca)
[1] "Hugo_Symbol" "Entrez_Gene_Id" "Center" "NCBI_Build"
[5] "Chromosome" "Start_Position" "End_Position" "Strand"
[9] "Variant_Classification" "Variant_Type" "Reference_Allele" "Tumor_Seq_Allele1"
[13] "Tumor_Seq_Allele2" "dbSNP_RS" "dbSNP_Val_Status" "Tumor_Sample_Barcode"
[17] "Matched_Norm_Sample_Barcode" "Match_Norm_Seq_Allele1" "Match_Norm_Seq_Allele2" "Tumor_Validation_Allele1"
[21] "Tumor_Validation_Allele2" "Match_Norm_Validation_Allele1" "Match_Norm_Validation_Allele2" "Verification_Status"
[25] "Validation_Status" "Mutation_Status" "Sequencing_Phase" "Sequence_Source"
[29] "Validation_Method" "Score" "BAM_File" "Sequencer"
[33] "Tumor_Sample_UUID" "Matched_Norm_Sample_UUID" "Chromosome" "start_WU"
[37] "stop_WU" "reference_WU" "variant_WU" "type_WU"
[41] "gene_name_WU" "transcript_name_WU" "transcript_species_WU" "transcript_source_WU"
[45] "transcript_version_WU" "strand_WU" "transcript_status_WU" "trv_type_WU"
[49] "c_position_WU" "amino_acid_change_WU" "ucsc_cons_WU" "domain_WU"
[53] "all_domains_WU" "deletion_substructures_WU" "transcript_error"
brca_Sominaclust <- SomInaClust_det(brca, calculate_CDS = TRUE, convert_genenames_to_HGNC=FALSE) #trying the file as input for SomInaClust
Error in SomInaClust(maf = maf, database = database, define_clustersize = FALSE, : Make sure the following columns are present in the maf file (column names need to be exact): Hugo_Symbol, Tumor_Sample_Barcode, Variant_Classification, Start_Position
sessionInfo()
R version 3.4.0 (2017-04-21)
attached base packages: [1] parallel stats graphics grDevices utils datasets methods base
other attached packages: [1] SomInaClust_1.0.0 maftools_1.2.30 Biobase_2.36.2 BiocGenerics_0.22.1
loaded via a namespace (and not attached):
[1] nlme_3.1-131 bitops_1.0-6 matrixStats_0.52.2 bit64_0.9-7 doParallel_1.0.11
[6] RColorBrewer_1.1-2 prabclus_2.2-6 GenomeInfoDb_1.12.3 tools_3.4.0 R6_2.2.2
[11] DBI_0.7 lazyeval_0.2.1 colorspace_1.3-2 trimcluster_0.1-2 nnet_7.3-12
[16] GetoptLong_0.1.6 gridExtra_2.3 bit_1.1-12 compiler_3.4.0 DelayedArray_0.2.7
[21] pkgmaker_0.22 labeling_0.3 slam_0.1-40 rtracklayer_1.36.6 diptest_0.75-7
[26] scales_0.5.0 DEoptimR_1.0-8 mvtnorm_1.0-6 robustbase_0.92-8 NMF_0.20.6
[31] stringr_1.2.0 digest_0.6.12 Rsamtools_1.28.0 cometExactTest_0.1.3 XVector_0.16.0
[36] pkgconfig_2.0.1 changepoint_2.2.2 BSgenome_1.44.2 rlang_0.1.4 GlobalOptions_0.0.12
[41] RSQLite_2.0 shape_1.4.3 bindr_0.1 zoo_1.8-0 mclust_5.4
[46] BiocParallel_1.10.1 DPpackage_1.1-7.1 dendextend_1.6.0 dplyr_0.7.4 VariantAnnotation_1.22.3
[51] RCurl_1.95-4.8 magrittr_1.5 modeltools_0.2-21 GenomeInfoDbData_0.99.0 wordcloud_2.5
[56] Matrix_1.2-11 Rcpp_0.12.14 munsell_0.4.3 S4Vectors_0.14.7 viridis_0.4.0
[61] stringi_1.1.6 whisker_0.3-2 MASS_7.3-47 SummarizedExperiment_1.6.5 zlibbioc_1.22.0
[66] flexmix_2.3-14 plyr_1.8.4 grid_3.4.0 blob_1.1.0 ggrepel_0.7.0
[71] lattice_0.20-35 cowplot_0.9.1 Biostrings_2.44.2 splines_3.4.0 GenomicFeatures_1.28.5
[76] circlize_0.4.2 ComplexHeatmap_1.14.0 GenomicRanges_1.28.6 rjson_0.2.15 fpc_2.1-10
[81] rngtools_1.2.4 biomaRt_2.32.1 reshape2_1.4.2 codetools_0.2-15 stats4_3.4.0
[86] XML_3.98-1.9 glue_1.2.0 data.table_1.10.4-3 foreach_1.4.3 gtable_0.2.0
[91] kernlab_0.9-25 assertthat_0.2.0 ggplot2_2.2.1 gridBase_0.4-7 xtable_1.8-2
[96] class_7.3-14 survival_2.41-3 viridisLite_0.2.0 tibble_1.3.4 iterators_1.0.8
[101] GenomicAlignments_1.12.2 AnnotationDbi_1.38.2 registry_0.5 memoise_1.1.0 IRanges_2.10.5
[106] bindrcpp_0.2 cluster_2.0.6
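For anyone debugging the same column error, here is a minimal base-R sketch of a sanity check on the input. It only tests whether an object is a plain table that carries the four column names listed in the error message. The helper `missing_maf_cols` and the toy data are hypothetical and made up for illustration; they are not part of SomInaClust or maftools, and how SomInaClust validates its input internally is an assumption here.

```r
# Columns named in the SomInaClust error message in the question
required_cols <- c("Hugo_Symbol", "Tumor_Sample_Barcode",
                   "Variant_Classification", "Start_Position")

# Hypothetical helper (not from SomInaClust or maftools):
# returns the required columns that are absent from a table,
# and stops early if the input is not a data.frame at all.
missing_maf_cols <- function(x, required = required_cols) {
  if (!is.data.frame(x)) {
    stop("Input is of class '", class(x)[1], "', not a data.frame")
  }
  setdiff(required, colnames(x))
}

# Toy table with made-up values, for illustration only
toy_maf <- data.frame(
  Hugo_Symbol            = c("GENE_A", "GENE_B"),
  Tumor_Sample_Barcode   = c("S1", "S2"),
  Variant_Classification = c("Missense_Mutation", "Silent"),
  Start_Position         = c(100L, 200L),
  stringsAsFactors       = FALSE
)

missing_maf_cols(toy_maf)         # all four columns present: empty result
missing_maf_cols(toy_maf[, -4])   # fourth column dropped: reports Start_Position
```

If the object being passed is not a data.frame (for example, an S4 object wrapping the table), a check like this fails before any column comparison, which would be consistent with the error appearing even though `getFields()` lists the expected names.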
Hi, you're passing a MAF object as input to the SomInaClust_det function. My guess is that it requires a MAF file as input. Maybe try this and see...
This gives the following error message.
Hi, were you able to solve your problems with the SomInaClust package? I'm trying to use SomInaClust but I get the same errors. I could not solve the "argument "parameter_default" is missing, with no default" error. Could you please share your fixed commands?