NMF not working for identifying mutational signature
3.0 years ago
dodausp ▴ 150

Hi, everyone. Could anybody help me with this issue? I am trying to find the number of mutational signatures that best describes my samples using NMF. This is what the matrix looks like:

mut_con
        patient 1 patient 2 patient 3 patient 4 patient 5 patient 6 patient 7
A[C>A]A   534.001   176.001   493.001  1392.001  1083.001   263.001  1174.001
A[C>A]C   725.001   196.001   417.001  1372.001   936.001   340.001  1068.001
A[C>A]G   223.001    62.001   316.001  1013.001   773.001    98.001   524.001
A[C>A]T   452.001   278.001   255.001   931.001   645.001   198.001   862.001
C[C>A]A   650.001   406.001   719.001  2690.001  2346.001   272.001  2129.001
C[C>A]C   641.001   533.001   555.001  2527.001  1652.001   269.001  2229.001


And here is the call I use to try to estimate the best NMF rank:

nmf_patient <- nmf(mut_con, rank=1:3, method="brunet", nrun=10, seed=123456)


And this is what I get:

Timing stopped at: 2.303 0.274 2.51
Timing stopped at: 2.263 0.238 2.414
Timing stopped at: 2.244 0.233 2.397
Error in (function (...)  : All the runs produced an error:
-#1 [r=1] -> NMF::nmf - 10/10 fit(s) threw an error.
# Error(s) thrown:
- run #1: unused arguments (model = list("NMFstd", 1, 0), method = "random")
-#2 [r=2] -> NMF::nmf - 10/10 fit(s) threw an error.
# Error(s) thrown:
- run #1: unused arguments (model = list("NMFstd", 2, 0), method = "random")
-#3 [r=3] -> NMF::nmf - 10/10 fit(s) threw an error.
# Error(s) thrown:
- run #1: unused arguments (model = list("NMFstd", 3, 0), method = "random")


Does anyone have a guess or suggestion as to what the issue is here? Could there be a conflict between packages, or am I doing something wrong?

Any help will be greatly appreciated!

Here is the sessionInfo():

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:     /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] C

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] SomaticSignatures_2.16.0   VariantAnnotation_1.26.1   SummarizedExperiment_1.10.1
[4] doParallel_1.0.14          iterators_1.0.10              foreach_1.4.4
[7] ggplot2_3.0.0              BSgenome.Hsapiens.UCSC.hg19_1.4.0 BSgenome_1.48.0
[10] rtracklayer_1.40.6         SomaticCancerAlterations_1.16.0 Rsamtools_1.32.3
[13] Biostrings_2.48.0          XVector_0.20.0                  DelayedArray_0.6.6
[16] BiocParallel_1.14.2        matrixStats_0.54.0              MutationalPatterns_1.6.1
[19] NMF_0.21.0                 cluster_2.0.7-1                 rngtools_1.3.1
[22] pkgmaker_0.27              registry_0.5                    GenomicRanges_1.32.7
[25] GenomeInfoDb_1.16.0        IRanges_2.14.12                 S4Vectors_0.18.3
[28] Biobase_2.40.0             BiocGenerics_0.26.0

3.0 years ago
venu 6.9k

Sorry to disappoint you but it's not that simple. NMF is a good approach for mutational signatures but where did you see that approach/tutorial?

You need a reference set of signatures (COSMIC 30 or PCAWG 60, google it) and a mutational catalogue (the matrix you have now). There are several established tools (e.g. YAPSA, signeR, SomaticSignatures, etc.) for the COSMIC 30 signatures, but these will soon be outdated, as studies have already reported that there are more mutational processes in cancers. Try to use the tools that are already available.


Hi, venu

No disappointment at all. I was afraid the workaround wouldn't be so simple.

With regard to your comments, I usually use SomaticSignatures. In that case, I normally use the function assessNumberSignatures (with either NMF or PCA decomposition) to find the optimal number of signatures, or identifySignatures if I already know how many signatures I want to break my samples into. In both cases, it works just fine. However, I am afraid that with SomaticSignatures I am not able to compare my mutational catalogue with COSMIC 30, and that is exactly what I'd like to do. So I found the package MutationalPatterns, which provides this option in a very tidy way, and I would like to try it on my samples. The thing is that I am getting stuck on the nmf function.

I guess one thing I could try is to do the NMF deconvolution in SomaticSignatures or signeR and then move on to MutationalPatterns. I just don't know whether the output objects are compatible. I shall try, though. Other than that, would you have any suggestions?
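
For readers wondering what "comparing a catalogue with reference signatures" amounts to, here is a minimal base R sketch using cosine similarity. The counts are the first six rows of the matrix from the question for patients 1 and 2 (without the 0.001 offset). The two reference profiles, RefA and RefB, are made-up placeholders and not COSMIC signatures, and cosine_similarity is a helper defined here, not a function from any of the packages mentioned in this thread.

```r
# Cosine similarity between two non-negative profiles (1 = identical shape).
cosine_similarity <- function(a, b) {
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

contexts <- c("A[C>A]A", "A[C>A]C", "A[C>A]G", "A[C>A]T", "C[C>A]A", "C[C>A]C")

# Counts taken from the question (patients 1 and 2, first six contexts).
catalogue <- cbind(
  "patient 1" = c(534, 725, 223, 452, 650, 641),
  "patient 2" = c(176, 196, 62, 278, 406, 533)
)
rownames(catalogue) <- contexts

# Hypothetical reference profiles for illustration only, NOT COSMIC signatures.
reference <- cbind(
  RefA = c(0.30, 0.30, 0.10, 0.10, 0.10, 0.10),
  RefB = c(0.05, 0.05, 0.05, 0.15, 0.30, 0.40)
)
rownames(reference) <- contexts

# One row per patient, one column per reference profile.
similarity <- sapply(colnames(reference), function(sig) {
  apply(catalogue, 2, cosine_similarity, b = reference[, sig])
})
print(round(similarity, 3))
```

With a real reference set, the placeholder matrix would be replaced by the published signature matrix, with rows in the same trinucleotide order as the catalogue.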


"I am not able to compare my mutational catalogue with COSMIC 30. And that is exactly what I'd like to do."

If that is all you want to do, check the YAPSA package. It has functions to calculate exposures for known COSMIC signatures. You can also specify cut-offs, e.g. a signature is reported only if it accounts for at least 2% of the given SNVs. This way you can filter out some of the less important signatures.
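
To make the idea of exposures plus a cut-off concrete, here is a rough base R sketch. It is not YAPSA's algorithm or API: it holds a set of known signatures fixed and estimates non-negative exposures with plain multiplicative updates, then zeroes out signatures below a 2% relative contribution. The signature values, the sample names and the fit_exposures helper are all hypothetical and only there for illustration.

```r
# Toy reference signatures (columns sum to 1). Hypothetical values, NOT COSMIC.
signatures <- cbind(
  SigA = c(0.50, 0.30, 0.10, 0.05, 0.03, 0.02),
  SigB = c(0.02, 0.03, 0.05, 0.10, 0.30, 0.50),
  SigC = c(0.10, 0.10, 0.30, 0.30, 0.10, 0.10)
)

# Toy catalogue built from known exposures, so the fit can be checked by eye.
true_exposures <- cbind(sample1 = c(700, 290, 10), sample2 = c(100, 400, 500))
catalogue <- signatures %*% true_exposures

# Multiplicative updates with the signatures held fixed; exposures stay >= 0.
fit_exposures <- function(catalogue, signatures, n_iter = 2000, eps = 1e-12) {
  exposures <- matrix(1, nrow = ncol(signatures), ncol = ncol(catalogue),
                      dimnames = list(colnames(signatures), colnames(catalogue)))
  wt_v <- t(signatures) %*% catalogue
  wt_w <- t(signatures) %*% signatures
  for (i in seq_len(n_iter)) {
    exposures <- exposures * wt_v / (wt_w %*% exposures + eps)
  }
  exposures
}

exposures <- fit_exposures(catalogue, signatures)

# Relative contribution of each signature per sample.
relative <- sweep(exposures, 2, colSums(exposures), "/")

# Apply the cut-off: contributions below 2% are set to zero.
cutoff <- 0.02
filtered <- relative
filtered[relative < cutoff] <- 0

print(round(relative, 3))
print(round(filtered, 3))
```

A dedicated package is still the better choice for real data; the sketch only shows why a cut-off removes signatures that explain very few mutations.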

15 months ago
Sarah • 0

I got the same error. Updating the NMF package and changing the function call from nmf() to NMF::nmf() solved it.
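
A minimal sketch of that fix, assuming an up-to-date NMF package is installed. The toy_counts matrix is a made-up stand-in for mut_con, and the find() calls are just one way to see whether another attached package defines a function with the same name; whether masking was the actual cause of the original error is an assumption, not something confirmed in this thread.

```r
# Assumes the NMF package is installed and current, e.g. install.packages("NMF").
library(NMF)

# Made-up stand-in for mut_con: 6 contexts x 7 patients, strictly positive.
set.seed(1)
toy_counts <- matrix(
  rpois(6 * 7, lambda = 500) + 0.001,
  nrow = 6,
  dimnames = list(paste0("ctx", 1:6), paste("patient", 1:7))
)

# List every attached package that defines these names. More than one hit
# would point to masking (an assumption about the cause, not a diagnosis).
print(find("nmf"))
print(find("seed"))

# Namespace-qualified call, so nmf() is taken from the NMF package.
# Ranks 2:3 are used here only because the toy matrix is tiny.
estimate <- NMF::nmf(toy_counts, rank = 2:3, method = "brunet",
                     nrun = 10, seed = 123456)

# Quality measures per rank, used to choose the number of signatures.
print(estimate$measures)
```

On the real data, toy_counts would be replaced by mut_con and the rank range by the one from the question.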