Question: soft threshold for co-expression analysis using WGCNA based on scale independence
21 months ago, newbie wrote:

Hi,

I have raw counts for around 100 tumor samples. I'm interested in a co-expression analysis of a single lncRNA against all protein-coding genes, to find which protein-coding genes are strongly correlated with it.

I have a data frame df containing the counts for all protein-coding genes and the one lncRNA.

dim(df)
## [1] 19803   100

head(df)[1:5,1:4]

                    sample1        sample2          sample3          sample4
A1BG                  14               59               11               31
A1CF                   0                4                1                0
A2M                 6509             7708             7306            16869
A2ML1                 64               71             1317             3406
A3GALT2                7               28                8                0


library(DESeq2)
library(WGCNA)
options(stringsAsFactors = FALSE)
allowWGCNAThreads()

# Variance-stabilising transformation of the raw count matrix
U3 <- as.matrix(df)
vsd <- vst(U3, blind = FALSE)

# WGCNA expects samples in rows and genes in columns
datExpr <- t(vsd)
dim(datExpr)
gene.names <- colnames(datExpr)

# Candidate soft-threshold powers: 1 to 10, then 12 to 20 in steps of 2
powers <- c(1:10, seq(from = 12, to = 20, by = 2))
sft <- pickSoftThreshold(datExpr, dataIsExpr = TRUE,
                         powerVector = powers, corFnc = cor,
                         corOptions = list(use = 'p'), networkType = "unsigned")

And the plot looks like this:

[Image: scale independence plot from the code above; not available]

Why does the soft-threshold power behave as shown in the plot, and which power should I select?

Tags: wgcna, rna-seq, co-expression, R

Please try a web search first. There are many existing questions on the soft-thresholding power, how to choose the best value, and what this threshold means.
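For readers landing here: a rule of thumb often quoted from the WGCNA tutorials is to take the lowest power at which the signed scale-free topology fit (R^2) reaches a chosen cutoff, commonly somewhere around 0.8 to 0.9. Below is a minimal sketch of that rule, not a definitive recipe. The helper name `pick_power`, the 0.8 cutoff, and the toy table are assumptions for illustration; the toy table only mimics the `Power`, `SFT.R.sq`, and `slope` columns of `sft$fitIndices`.

```r
# Hypothetical helper: lowest power whose signed R^2 meets the cutoff.
# fit is a data frame shaped like sft$fitIndices (Power, SFT.R.sq, slope).
pick_power <- function(fit, cutoff = 0.8) {
  # A negative slope is expected for scale-free topology, so flip the sign
  signed_r2 <- -sign(fit$slope) * fit$SFT.R.sq
  ok <- which(signed_r2 >= cutoff)
  if (length(ok) == 0) return(NA_real_)  # no power reaches the cutoff
  min(fit$Power[ok])
}

# Toy values for illustration only (not the poster's data)
toy_fit <- data.frame(
  Power    = c(1, 2, 3, 4, 5, 6),
  SFT.R.sq = c(0.10, 0.35, 0.62, 0.81, 0.88, 0.90),
  slope    = c(0.5, -0.8, -1.1, -1.3, -1.4, -1.5)
)
chosen_power <- pick_power(toy_fit)
```

With real results this would be called as `pick_power(sft$fitIndices)`. `pickSoftThreshold()` also returns its own suggestion in `sft$powerEstimate`, based on its `RsquaredCut` argument. If no power reaches the cutoff, the helper returns `NA`, which is usually a sign to look at the data (for example strong batch or subtype structure) rather than to lower the cutoff blindly.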

written 21 months ago by Kevin Blighe

Hi,

I have checked some posts and tutorials and found the answer I needed. I have one more small question.

I'm trying to build a co-expression network between some lncRNAs of interest and the protein-coding genes. After obtaining the modules, I see that all of my lncRNAs of interest are in the grey module, which is the module of unassigned genes.

I'm still very interested in finding the protein-coding genes that are co-expressed with these lncRNAs. What should I do now?

written 21 months ago by newbie

Maybe your lncRNAs have low expression, which is why they are in that module. What was your input data to WGCNA? - normalised counts or normalised + transformed (e.g. logged, Z-transformed) counts?
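One rough way to check the low-expression hypothesis is to see where the lncRNAs fall in the distribution of mean expression across all genes. The sketch below is only an illustration: the helper name `expression_percentile`, the toy matrix, and the gene name `LNC1` are made up, and it assumes a matrix with genes in rows and samples in columns (for example the VST output before transposing).

```r
# Hypothetical helper: percentile rank of chosen genes by mean expression.
# expr: genes in rows, samples in columns.
expression_percentile <- function(expr, genes) {
  gene_means <- rowMeans(expr)
  pct <- ecdf(gene_means)  # empirical CDF over all genes
  setNames(100 * pct(gene_means[genes]), genes)
}

# Toy matrix for illustration only; "LNC1" is a made-up gene name
set.seed(1)
toy_expr <- matrix(rnorm(50 * 10, mean = 8), nrow = 50,
                   dimnames = list(paste0("G", 1:50), paste0("S", 1:10)))
toy_expr <- rbind(toy_expr, LNC1 = rep(2, 10))  # deliberately low expression
pct <- expression_percentile(toy_expr, "LNC1")
```

A lncRNA sitting in the bottom few percent of mean expression would be consistent with it ending up unassigned.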

written 21 months ago by Kevin Blighe

I have raw counts for around 100 tumor samples. The input was a matrix of 100 samples with 14k protein-coding genes plus my lncRNAs of interest. I applied the variance-stabilising transformation from DESeq2.

vsd <- vst(data, blind=FALSE)

Along with 100 tumor samples, I also have some 50 normal samples. For WGCNA I used only 100 tumor samples, because I wanted to know co-expressed genes specific to tumor condition.

modified 21 months ago • written 21 months ago by newbie

Could I get an answer to this, please?

written 21 months ago by newbie

There is no further answer to give, really. Variance-stabilised counts should be okay for WGCNA. You could try rlog counts, instead, if you wished.

Going back a few steps, you should remove genes of low counts prior to normalisation in DESeq2. It seems strange that most of your lncRNAs are in the same module - the conclusion that I have is that most of them are originally of low expression, and perhaps should have been filtered out.
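As a rough illustration of pre-filtering, the sketch below keeps genes that reach a minimum count in a minimum number of samples before anything is passed to DESeq2. The helper name `filter_low_counts` and both thresholds are assumptions chosen for the example, not values recommended in this thread; the toy counts reuse a few rows shown in the question.

```r
# Hypothetical helper: keep genes with at least min_count reads
# in at least min_samples samples. Thresholds are illustrative.
filter_low_counts <- function(counts, min_count = 10, min_samples = 10) {
  keep <- rowSums(counts >= min_count) >= min_samples
  counts[keep, , drop = FALSE]
}

# Toy counts taken from the first rows shown in the question
toy_counts <- rbind(
  A1BG = c(14, 59, 11, 31),
  A1CF = c(0, 4, 1, 0),
  A2M  = c(6509, 7708, 7306, 16869)
)
colnames(toy_counts) <- paste0("sample", 1:4)
filtered <- filter_low_counts(toy_counts, min_count = 10, min_samples = 3)
# The filtered matrix would then be transformed, e.g. vst(filtered, blind = FALSE)
```

Note that `vst()` by default fits its trend on a subset of 1000 genes, so a toy matrix this small would not run through it; the filtering step itself is what is being shown here.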

modified 18 months ago • written 21 months ago by Kevin Blighe

Hey Kevin,

As I didn't find any genes co-expressed with my lncRNA of interest using WGCNA, I tried a correlation analysis with the Pearson method and filtered on p-value < 0.05.

Should the co-expressed genes be only the positively correlated ones, i.e. r > +0.5,

or should I also include genes with negative correlation in the pathway analysis?

written 20 months ago by newbie

The negative genes are equally as informative as the positive genes, no? - you can analyse them together in pathway analysis, or do 2 separate analyses for:

  1. positive genes
  2. negative genes

You should only include correlations with p-value < 0.05.
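The split into positive and negative genes can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the helper name `correlate_with_target`, the `r_cut` argument, and the toy gene names are hypothetical, and the expression matrix is assumed to have samples in rows and genes in columns, like `datExpr` in the question.

```r
# Hypothetical helper: Pearson correlation of one target gene against all
# other genes, split into positive and negative hits.
# expr: samples in rows, genes in columns.
correlate_with_target <- function(expr, target, p_cut = 0.05, r_cut = 0) {
  others <- setdiff(colnames(expr), target)
  res <- t(sapply(others, function(g) {
    ct <- cor.test(expr[, g], expr[, target], method = "pearson")
    c(r = unname(ct$estimate), p = ct$p.value)
  }))
  res <- as.data.frame(res)
  sig <- res[res$p < p_cut, , drop = FALSE]  # keep significant correlations
  list(positive = rownames(sig)[sig$r >  r_cut],
       negative = rownames(sig)[sig$r < -r_cut])
}

# Toy data for illustration only; all gene names are made up
set.seed(42)
lnc <- rnorm(100)
toy <- cbind(LNC1 = lnc,
             POS  = lnc + rnorm(100, sd = 0.3),   # positively correlated
             NEG  = -lnc + rnorm(100, sd = 0.3),  # negatively correlated
             NONE = rnorm(100))                   # unrelated
hits <- correlate_with_target(toy, "LNC1", r_cut = 0.5)
```

`hits$positive` and `hits$negative` can then be passed to pathway analysis either together or as two separate gene lists. Setting `r_cut = 0.5` mirrors the r > +0.5 threshold mentioned above, applied symmetrically to negative correlations.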

written 20 months ago by Kevin Blighe

Sure, thank you for the reply.

written 20 months ago by newbie