Question: How to enter gene list in R for WGCNA analysis
5 weeks ago, ghataksoumyakanti wrote:

Hi guys, I have a gene list that I got after DEG analysis of 3 different GSE datasets. Now I want to perform a weighted gene co-expression analysis on this DEG list. In which format should I prepare my file so that I can use the list in R? Should I just put the gene list in a column, or should I annotate each gene?

wgcna module R gene • 232 views
modified 5 weeks ago by genomax • written 5 weeks ago by ghataksoumyakanti

As Leite mentioned, please follow the WGCNA tutorial, so that you can understand what should be the input format of your data.

written 5 weeks ago by Kevin Blighe

5 weeks ago, Leite (São Paulo - Brazil - Unifesp) wrote:

Dear @ghataksoumyakanti,

The initial input is a normalized gene expression matrix, with the rows representing the genes and the columns representing the samples.

I think you should spend some time reading the WGCNA tutorial, if you haven't done so yet.

If you are having difficulties with WGCNA, CEMiTool can be an easier way to do the same analysis.

modified 5 weeks ago by Kevin Blighe • written 5 weeks ago by Leite

How do I make the normalised matrix? Is it the same format that we obtained as the final result of the DEG analysis in R, or do we have to build the matrix manually?

written 5 weeks ago by ghataksoumyakanti

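I don't know your data, and I don't know which normalization you used.

But usually, after:

 Background correction

 Normalization

 log2 transformation

you have a normalized expression matrix (called data.norm here). For the differential expression analysis, you build the design matrix for the linear modelling function

 library(limma)
 # targets is assumed to be the sample annotation table with a Target column
 f <- factor(targets$Target, levels = unique(targets$Target))
 design <- model.matrix(~0 + f)
 colnames(design) <- levels(f)

and pass the intensity values to lmFit

 fit <- lmFit(data.norm, design)

For WGCNA, write the normalized matrix itself to disk, not the fit object, which holds model coefficients rather than per-sample expression values:

 # data.norm is assumed to be a numeric matrix; for an ExpressionSet, use exprs(data.norm)
 write.table(data.norm, file="data_norm.txt", sep="\t", quote=FALSE, col.names=NA)

The file data_norm.txt is what you will use in WGCNA or webCEMiTool.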
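
As a minimal sketch of the round trip, the following writes a normalized expression matrix to disk and reads it back as a numeric matrix. The toy values, the gene and sample names, and the temporary file path are all made up for illustration; only base R is used.

```r
# Toy normalized expression matrix (hypothetical values):
# genes in rows, samples in columns
set.seed(1)
data.norm <- matrix(rnorm(12, mean = 8), nrow = 4,
                    dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# col.names = NA writes an empty header cell above the row names,
# so the header lines up with the data columns
out_file <- file.path(tempdir(), "data_norm.txt")
write.table(data.norm, file = out_file, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back, using the first column as row names
expr <- as.matrix(read.delim(out_file, row.names = 1, check.names = FALSE))

dim(expr)          # genes x samples
is.numeric(expr)   # should be TRUE before going any further
```

If the row names are not declared when reading, they come back as an ordinary first column (often named "X"), and the table is no longer purely numeric.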

This can help you with the normalization question:

Data Analysis in Genome Biology

Microarray-analysis

modified 5 weeks ago • written 5 weeks ago by Leite

Yes, I used the write.table command to save my DEG analysis result in R. So I can use the same format in which I got my DEG result as input for the WGCNA analysis, right? And do I need any further conversion from gene ID to probe ID, or vice versa?

written 5 weeks ago by ghataksoumyakanti

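The expression data for WGCNA should be a matrix of numerical values, typically stored with samples as columns and variables (usually genes) as rows. Note that the WGCNA functions themselves expect the transpose (samples as rows, genes as columns), which is why the tutorial calls t() on the data.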

WGCNA does not care about your variable names. They can be probe IDs, HGNC symbols, or, simply, numbers going from 1:n. However, you should obviously have the variable names in a format that is understood by you.
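
To make the layout concrete, here is a small base R sketch with made-up data. Expression data is usually stored as genes by samples; the WGCNA tutorial transposes it so that samples are rows and genes are columns before calling functions such as goodSamplesGenes(). The probe and sample names below are hypothetical.

```r
# Toy expression matrix (hypothetical): 5 probes (rows) x 4 samples (columns)
set.seed(42)
expr <- matrix(rnorm(20, mean = 8), nrow = 5,
               dimnames = list(paste0("probe_", 1:5), paste0("GSM", 1:4)))

# Transpose so that samples are rows and genes are columns,
# which is the orientation the WGCNA tutorial uses for datExpr0
datExpr0 <- as.data.frame(t(expr))

dim(datExpr0)                        # samples x genes
all(sapply(datExpr0, is.numeric))    # every column must be numeric
```

The row names of the original matrix become the column names of datExpr0, so whatever identifiers you use are carried along unchanged.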

Please go through the WGCNA tutorial, first, so that you can understand how to use it.

modified 5 weeks ago • written 5 weeks ago by Kevin Blighe

Sometimes people just don't want to spend their time reading the tutorials.

written 5 weeks ago by Leite

Sorry if I bothered you, but I am a novice in the field and thus had some doubts.

written 5 weeks ago by ghataksoumyakanti

No problem. Please do take a look at the tutorial, and then report back.

modified 5 weeks ago • written 5 weeks ago by Kevin Blighe

OK, I will try that and get back in case there is a problem. I actually went through the tutorial but could not make much sense of the input part.

written 5 weeks ago by ghataksoumyakanti

The WGCNA tutorial is not great, and the authors should improve it - I admit that. At a certain point in it, you have to download sample files that you then input to R. Did you get to that stage?

written 5 weeks ago by Kevin Blighe

By sample files, do you mean the female mouse liver data that the original paper was based upon? No, sadly I did not get to that stage. I am still struggling with how to enter my DEG analysis data.

written 4 weeks ago by ghataksoumyakanti

This is what R says when I use the goodSamplesGenes function:

 gsg = goodSamplesGenes(datExpr0, verbose = 3);
 Flagging genes and samples with too many missing values...
 ..step 1
 Error in goodGenes(datExpr, weights, goodSamples, goodGenes, minFraction = minFraction, :
 datExpr must contain numeric data.

My first column contains logFC values for all the DEGs.
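
An error of this kind generally means that at least one column of the input is not numeric, for example an identifier column that was read in as text. The sketch below, on a made-up table with a hypothetical ID column, shows one way to find such columns and move the identifiers into the row names. It only addresses the numeric check; it does not turn a table of differential expression statistics into a valid expression matrix.

```r
# Hypothetical table as it might look right after reading a file:
# one text column with identifiers plus numeric expression columns
tab <- data.frame(ID = c("g1", "g2", "g3"),
                  s1 = c(5.1, 6.2, 7.3),
                  s2 = c(5.0, 6.5, 7.1),
                  stringsAsFactors = FALSE)

# Columns that are not numeric are the ones that trigger the error
non_numeric <- names(tab)[!sapply(tab, is.numeric)]
non_numeric

# Keep only the numeric columns and store the identifiers as row names
expr <- as.matrix(tab[, sapply(tab, is.numeric), drop = FALSE])
rownames(expr) <- tab$ID

str(expr)
```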

written 4 weeks ago by ghataksoumyakanti

@ghataksoumyakanti As far as I know, you shouldn't use the list of DEGs. As Kevin and I already said, you should use the properly normalized gene expression matrix.

The WGCNA method receives an input “m x n” gene expression matrix, containing n samples under specific conditions and m genes, where each element in the matrix gives the expression of one gene in a particular sample. The correlation between each pair of genes is then transformed into an m x m adjacency matrix through an adjacency function (reference).
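
The step from correlations to an adjacency matrix can be sketched in base R as follows. This assumes the unsigned power adjacency, where each entry is the absolute correlation raised to a soft-thresholding power; the data are random, and the power of 6 is only a placeholder, since in practice it is chosen from the data.

```r
set.seed(7)
# Hypothetical data in the orientation used for the calculation:
# 10 samples (rows) x 4 genes (columns)
datExpr <- matrix(rnorm(40), nrow = 10,
                  dimnames = list(NULL, paste0("gene", 1:4)))

# Soft-thresholding power; an assumed value for illustration only
beta <- 6

# Pairwise gene-gene correlations, then the unsigned power adjacency
adjacency <- abs(cor(datExpr))^beta

dim(adjacency)     # genes x genes
round(adjacency, 3)
```

Raising the absolute correlation to a power keeps strong correlations close to 1 and pushes weak ones toward 0, without applying a hard cutoff.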

written 4 weeks ago by Leite

Yes, the liver data. Okay, let us go back to the beginning: can you list the key objects in your workspace, and show a sample of these (e.g., using the head() function)?
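
For reference, a minimal sketch of that kind of inspection in base R; the object name expr and its values are made up.

```r
# Hypothetical object standing in for whatever is in your workspace
expr <- matrix(c(7.1, 8.2, 6.9, 7.4, 8.0, 7.2), nrow = 3,
               dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

ls()                        # names of the objects in the workspace
class(expr)                 # what kind of object it is
dim(expr)                   # rows x columns
first_rows <- head(expr, 2) # first two rows
first_rows
str(expr)                   # compact summary of type and contents
```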

written 4 weeks ago by Kevin Blighe

The column names are:

 [1] "logFC" "AveExpr" "t"
 [4] "P.Value" "adj.P.Val" "B"
 [7] "gene.symbols" "X"

These are the column names of the input file I was trying to use.

modified 4 weeks ago • written 4 weeks ago by ghataksoumyakanti

For WGCNA, your input data should be your unfiltered expression matrix. You do not require the differential expression results for WGCNA.

written 4 weeks ago by Kevin Blighe

If he hadn't ignored what I answered, he would have understood it yesterday. Good luck.

written 4 weeks ago by Leite

Dear @ghataksoumyakanti,

I think you did not quite understand what I said:

The initial input is a normalized gene expression matrix, with the rows representing the genes and the columns representing the samples.

You don't use your list of DEGs but the normalized gene expression matrix.

For converting between gene IDs and probe IDs, see these answers:

Question: Annotate Affymetrix probesets to Gene symbols

Question: Affymetrix Human Genome U133 Plus 2.0 Array - probe annotation with biomaRt

Question: Where To Find Annotation File For Agilent Microarray?
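
Once an annotation table is available, the mapping itself can be done in base R. The sketch below uses a made-up annotation table with hypothetical probe and symbol columns, and keeps, for each gene, the probe with the highest mean expression. That rule is an assumption for illustration and only one of several common choices; the WGCNA package also provides a collapseRows() function for this kind of task.

```r
# Hypothetical expression matrix with probe IDs as row names
expr <- matrix(c(5, 6, 7,
                 8, 9, 10,
                 1, 2, 3),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("p1", "p2", "p3"), c("s1", "s2", "s3")))

# Hypothetical annotation table: two probes map to the same gene
annot <- data.frame(probe  = c("p1", "p2", "p3"),
                    symbol = c("GENEA", "GENEA", "GENEB"),
                    stringsAsFactors = FALSE)

# Look up the gene symbol for each row of the expression matrix
symbols <- annot$symbol[match(rownames(expr), annot$probe)]

# For each symbol, keep the row index of the probe with the highest mean
keep <- tapply(seq_len(nrow(expr)), symbols, function(i) {
  i[which.max(rowMeans(expr[i, , drop = FALSE]))]
})

expr_gene <- expr[as.vector(keep), , drop = FALSE]
rownames(expr_gene) <- names(keep)
expr_gene
```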

written 5 weeks ago by Leite

OK, so I have figured out that I have to merge the data and then remove the batch effect using the inSilicoMerging method. Do you suggest any other function for merging, or any other way of analysing, or is this sufficient?
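
Independently of the package used, the merging step can be sketched in base R as restricting each dataset to the genes they share and binding the samples together. The two small matrices below are made up. The sketch does not remove any batch effect; it only builds the merged matrix and a batch label that a correction method (for example removeBatchEffect in limma or ComBat in sva) could take as input.

```r
# Two hypothetical datasets, genes in rows and samples in columns
gse_a <- matrix(1:6, nrow = 3,
                dimnames = list(c("g1", "g2", "g3"), c("a1", "a2")))
gse_b <- matrix(7:12, nrow = 3,
                dimnames = list(c("g2", "g3", "g4"), c("b1", "b2")))

# Keep only the genes measured in both datasets
common <- intersect(rownames(gse_a), rownames(gse_b))

# Bind the samples side by side, with rows in the same gene order
merged <- cbind(gse_a[common, , drop = FALSE],
                gse_b[common, , drop = FALSE])

# One batch label per sample, recording the dataset of origin
batch <- factor(rep(c("A", "B"), times = c(ncol(gse_a), ncol(gse_b))))

merged
table(batch)
```

This assumes the datasets already use the same identifier type and a comparable scale (for example log2 intensities); if not, those differences need to be resolved before merging.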

modified 4 weeks ago • written 4 weeks ago by ghataksoumyakanti