Warning Using DWD MATLAB Package and Results Matrix Not Fully Non-Negative
9.8 years ago
fbrundu ▴ 330

Hi all,

I am getting a strange warning using the DWD method (Distance Weighted Discrimination) to reduce batch effects on microarray data.

I am using the MATLAB package provided here.

When I call the BatchAdjustCC function on the expression matrix I want to adjust, it prints the following warning twice:

!!!   Warning: rootfSM hit left end   !!!


In addition, the adjusted data matrix is not fully non-negative, even though the original matrix is.

I don't know what I am doing wrong or if I have to set additional constraints.

Thanks

microarray matlab • 3.1k views
9.8 years ago

Neither warning affects your output matrix of DWD-adjusted values, so you are probably running DWD correctly.

In the output PCA plots, a smoothed histogram (kernel density estimate, KDE) is overlaid to show the distribution of points from each class. The rootfSM warning can be ignored: it relates to choosing the optimal smoothing parameter for that estimate. Even with this warning, the smoothed histograms almost always look OK.

The second warning is telling you that the output matrix of DWD-adjusted values contains some negative values (even if the input only contained positive values). This will only be an issue if you further analyze the data using a tool that requires the data to be non-negative.
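To see how much of the adjusted matrix is actually affected, you can count the negative entries before deciding what to do about them. This is only a minimal sketch: `adjusted` is a hypothetical stand-in for the matrix returned by DWD, filled here with made-up values, and nothing below calls the DWD package itself.

```matlab
% Hypothetical stand-in for the DWD-adjusted matrix (genes x samples).
% Replace this with your own adjusted matrix.
adjusted = [5.2  0.3 -0.1;
            1.7 -0.4  2.9;
            0.0  3.3  4.1];

negMask = adjusted < 0;            % logical mask of negative entries
numNeg  = nnz(negMask);            % how many entries are negative
fracNeg = numNeg / numel(adjusted);
minVal  = min(adjusted(:));        % most negative value

fprintf('%d of %d entries are negative (%.1f%%), minimum = %g\n', ...
        numNeg, numel(adjusted), 100 * fracNeg, minVal);
```

If only a handful of entries are slightly below zero, that is useful context when choosing between the workarounds discussed further down the thread.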


Yes, it is an issue, because I have to run non-negative matrix factorization (NMF) on the adjusted data. Is there a way to obtain a fully non-negative matrix as output?


There is no way to "force" DWD to provide a non-negative matrix. I'm not very familiar with NMF, but is there a reason why you can't shift the matrix so that all elements are positive?
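A shift of that kind could look like the sketch below. It assumes a single global offset applied to every element; the matrix `a` and its values are made up for illustration. Note that a constant shift keeps differences between entries unchanged but alters their ratios, so whether it suits a given NMF analysis is a separate question.

```matlab
% Hypothetical DWD-adjusted matrix containing a few negative values.
a = [5.2  0.3 -0.1;
     1.7 -0.4  2.9;
     0.0  3.3  4.1];

% Offset is the magnitude of the most negative entry,
% or 0 if the matrix is already non-negative.
offset  = max(0, -min(a(:)));
shifted = a + offset;              % every element moves by the same amount

fprintf('offset = %g, new minimum = %g\n', offset, min(shifted(:)));
```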


OK, I am doing some research to understand whether shifting the matrix compromises the meaning of its values. Thanks


I found a way to work around the problem of a matrix that is not fully non-negative here: http://genepattern.broadinstitute.org/gp/pages/protocols/ClassDiscovery_nmf.html

Non-negative matrix factorization (NMF) requires positive gene expression values. To run NMF on data that contains negative values (Kim & Tidor, 2003):

1. Create one dataset with all negative numbers zeroed.
2. Create another dataset with all positive numbers zeroed and the signs of all negative numbers removed.
3. Merge the two (e.g. by concatenation), resulting in a dataset twice as large as the original, but with positive values only and zeros, hence appropriate for NMF.

To do this in MATLAB, execute the following statement, where a is the original data:

anew=[max(a,0);-min(a,0)];
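As a quick check of the statement quoted above, here is a small worked example. The matrix `a` is made up for illustration, and `recovered` is a hypothetical name used only to show that the split loses no information: subtracting the bottom half of `anew` from the top half gives back the original data.

```matlab
% Hypothetical adjusted matrix (genes x samples) with negative entries.
a = [5.2  0.3 -0.1;
     1.7 -0.4  2.9;
     0.0  3.3  4.1];

% Top half: negatives zeroed. Bottom half: positives zeroed, signs removed.
anew = [max(a, 0); -min(a, 0)];

n = size(a, 1);
assert(all(anew(:) >= 0));            % suitable as NMF input
assert(size(anew, 1) == 2 * n);       % twice as many rows as the original

% The original matrix is the top half minus the bottom half.
recovered = anew(1:n, :) - anew(n+1:end, :);
assert(isequal(recovered, a));
```

Because the concatenation is vertical, each gene appears as two rows in `anew` (row i and row n+i), so factors returned by NMF have to be read with that doubling in mind.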