Question: co-expression analysis from scRNA-seq data
26 days ago, hubenxia123 wrote:

I have downloaded a public expression matrix from a scRNA-seq experiment. Does anyone know how to perform gene-gene co-expression analysis, as in the paper "Molecular Diversity and Specializations among the Cells of the Adult Mouse Brain"? Best,

26 days ago, Kevin Blighe wrote:

Hey,

You can read the methods of the work that you cite and, in that way, follow what the authors did. Go to the paper and then to its STAR Methods section.

The two sections within those methods that you will want to review are:

  • ICA based analysis and clustering
  • Correlation analysis across cell populations

Kevin


If I knew how to do that, I wouldn't have asked this question.

written 26 days ago by hubenxia123

Hello, which part, specifically, are you finding difficult to follow? I took a closer look myself and can deduce the following rough steps to help you get started:

Step 1 - filtering

  • Filter out cells with fewer than 400 expressed genes
  • Retain only the genes that are highly variable across all tissues (you can use your own metrics, if you wish)
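
A rough sketch of Step 1 in base R, on simulated data. The genes-by-cells layout, treating any count above zero as "expressed", and ranking genes by the variance of log-normalized counts are all assumptions made for illustration; the paper's own variable-gene criterion may differ, and `n_hvg` is an arbitrary placeholder.

```r
# Toy counts matrix (genes in rows, cells in columns): 40 well-covered cells
# and 10 low-coverage cells that should be removed by the filter.
set.seed(1)
lambda <- rep(c(0.5, 0.05), times = c(40, 10))
counts <- sapply(lambda, function(l) rpois(2000, l))
dimnames(counts) <- list(paste0("gene", 1:2000), paste0("cell", 1:50))

# Step 1a: drop cells with fewer than 400 expressed genes
# (assumption: "expressed" means a count greater than zero).
min_genes <- 400
genes_per_cell <- colSums(counts > 0)
counts <- counts[, genes_per_cell >= min_genes, drop = FALSE]

# Step 1b: keep highly variable genes.
# Assumption: library-size normalization, log1p transform, then rank by variance.
lognorm <- log1p(t(t(counts) / colSums(counts)) * 1e4)
gene_var <- apply(lognorm, 1, var)
n_hvg <- 500  # hypothetical cut-off, not taken from the paper
hvg <- names(sort(gene_var, decreasing = TRUE))[seq_len(min(n_hvg, nrow(lognorm)))]
hvg_mat <- lognorm[hvg, , drop = FALSE]

dim(hvg_mat)
```

Any other variable-gene metric can be swapped in at Step 1b; only the resulting gene list matters for the next step.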

Step 2 - ICA (independent component analysis)

  • Convert highly variable gene matrices to Z-scores ("[The] selected genes were then centered and scaled across all cells")
  • Perform ICA using the fastICA package in R, configured to output the first 60 components, separately for each tissue.
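
Step 2 could look roughly like the following, shown on random toy data. Treating cells as observations (rows) and genes as variables (columns) is an assumption on my part, as is the toy `n_comp`; the thread's value would be 60, run once per tissue.

```r
library(fastICA)

set.seed(1)
# Toy stand-in for the highly variable gene matrix (genes x cells).
hvg_mat <- matrix(rnorm(200 * 300), nrow = 200,
                  dimnames = list(paste0("gene", 1:200), paste0("cell", 1:300)))

# Center and scale each gene across all cells (Z-scores).
# scale() works column-wise, so transpose to cells x genes first.
# Genes with zero variance would give NaN here and should be removed beforehand.
z <- scale(t(hvg_mat), center = TRUE, scale = TRUE)

# ICA on the Z-scored matrix. n_comp is 10 only to keep the toy example small.
n_comp <- 10
ica <- fastICA(z, n.comp = n_comp)

cell_scores <- ica$S    # cells x components (estimated sources)
gene_loadings <- ica$A  # components x genes (estimated mixing matrix)
rownames(cell_scores) <- rownames(z)
colnames(cell_scores) <- paste0("IC_", seq_len(n_comp))

dim(cell_scores)
```

The `cell_scores` matrix is what would be passed on to the clustering step.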

Step 3 - KNN clustering

Perform clustering on the 60 ICA components using the clustering implementation in Seurat. Basically, re-use Seurat's functions FindNeighbors() and FindClusters(). To give you an idea, I use these in a function in a package that I'm currently developing: https://github.com/kevinblighe/scToolkit/blob/master/R/clusKNN.R
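
A minimal sketch of how precomputed ICA scores might be handed to Seurat for Step 3. This is not the code from the linked package; the toy `counts` and `cell_scores` objects, the reduction name "ica", the 10 components, and the resolution value are all placeholders chosen for illustration.

```r
library(Seurat)

set.seed(1)
# Toy inputs (hypothetical): a counts matrix (genes x cells) and
# ICA cell scores (cells x components), e.g. from fastICA.
counts <- matrix(rpois(100 * 200, 1), nrow = 100,
                 dimnames = list(paste0("gene", 1:100), paste0("cell", 1:200)))
cell_scores <- matrix(rnorm(200 * 10), nrow = 200,
                      dimnames = list(colnames(counts), paste0("IC_", 1:10)))

obj <- CreateSeuratObject(counts = Matrix::Matrix(counts, sparse = TRUE))

# Attach the ICA scores as a custom dimensional reduction named "ica".
obj[["ica"]] <- CreateDimReducObject(embeddings = cell_scores,
                                     key = "IC_", assay = "RNA")

# Build the nearest-neighbor graph on the ICA components, then cluster it.
obj <- FindNeighbors(obj, reduction = "ica", dims = 1:10)
obj <- FindClusters(obj, resolution = 0.8)  # resolution is an arbitrary choice

table(Idents(obj))
```

With real data, `dims` would cover the 60 components from Step 2, and the resolution would need tuning per tissue.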

--------------------------

That should bring you up to the line "To identify finer substructure among these classes, classes with more than 200 cells were selected for subclustering", whereby they then commence a second round of ICA on a finer subset of genes, it seems.

Unfortunately, following bioinformatics methods can be a nightmare, because it is nearly impossible to capture in plain English every minute detail that a comprehensive methodology requires.

modified 26 days ago • written 26 days ago by Kevin Blighe

You might also want to take a look at this article to get ideas for alternatives to Pearson correlation.

written 25 days ago by kristoffer.vittingseerup