23 hours ago
Saber_J
Hi everyone,
I have single-cell proteomics data from different developmental stages and am exploring analysis options. I'm wondering if it would be appropriate to use scRNA-seq analysis packages, such as Seurat v5, to integrate this data and identify cell cluster markers.
The data was provided by our mass spectrometry platform as a normalized TSV file. The rows represent protein names, and the columns correspond to individual cell names, similar to the format of a typical gene expression matrix.
Here's a small example of what the table looks like:
That's probably fine, but be sure to skip the normalization steps. Also avoid count-based steps such as sctransform for feature selection. Everything should be done directly on this matrix, which I assume is log2-transformed. The matrix (you will probably want to impute missing values first) needs to go into the slot that normally holds the log-transformed counts. From there on, the workflow should be more or less the same.
Thanks for the reply and the helpful reminder. The table also includes count-like data without normalization, where the values are integers. If I want to use the sctransform method for dimensionality reduction, can I use this integer count data? Our technician, however, suggested that I use the normalized data directly, since it is the output of the commercial software we purchased.
I was assuming your table contains log2 intensities, as is often the case for MS-based proteomics. I am not familiar with the kind of data you have, so I cannot comment.
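Since it matters whether the matrix is log2-scale or count-like, here is a minimal Python sketch of a quick sanity check you could run on the TSV before choosing a workflow. It assumes proteins in rows, cells in columns, and missing values written as `NA` or left empty. The function names (`read_matrix`, `describe_scale`, `guess_scale`), the cutoff of 40 for "log-like" values, and the two tiny example tables are all made up for illustration; this is a rough heuristic, not a definitive test.

```python
import csv
import io
import math


def read_matrix(handle):
    """Read a TSV with protein names in column 1 and one column per cell.

    Returns (proteins, cells, values), where values is a flat list of all
    finite numeric entries; NA, empty, and non-numeric fields are skipped.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    cells = header[1:]
    proteins, values = [], []
    for row in reader:
        if not row:
            continue
        proteins.append(row[0])
        for field in row[1:]:
            try:
                value = float(field)
            except ValueError:
                continue  # treat "NA", "" and similar as missing
            if math.isfinite(value):
                values.append(value)
    return proteins, cells, values


def describe_scale(values):
    """Summarise the value range and whether every entry is a whole number."""
    if not values:
        raise ValueError("no numeric values found")
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "all_integer": all(v == int(v) for v in values),
        "has_negative": min(values) < 0,
    }


def guess_scale(values, log_max=40.0):
    """Heuristic guess; log_max is an assumed upper bound for log2 intensities."""
    summary = describe_scale(values)
    if summary["all_integer"] and not summary["has_negative"]:
        return "count-like"
    if summary["max"] <= log_max:
        return "log-like"
    return "linear intensity-like"


# Hypothetical example tables, not real data.
log_tsv = "protein\tcell1\tcell2\nP1\t12.5\t13.1\nP2\tNA\t10.2\n"
count_tsv = "protein\tcell1\tcell2\nP1\t150\t0\nP2\t32\t7\n"

for name, text in [("normalized", log_tsv), ("raw", count_tsv)]:
    _, _, vals = read_matrix(io.StringIO(text))
    print(name, guess_scale(vals), describe_scale(vals))
```

If the normalized table comes back as "log-like" and the integer table as "count-like", that at least tells you which of the two could plausibly go into a count-based method and which should be treated as already transformed. The documentation of the commercial software remains the authoritative source on what normalization was applied.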
While we wait for an answer this should get you started: https://uclouvain-cbio.github.io/scp/articles/scp.html
Yes, I've already looked into this workflow. Since my data is already normalized, I was wondering if I could directly use the Seurat package. However, I've noticed that the scp package also provides functions for batch correction, dimension reduction, and useful visualization tools.