Hi everybody
I am new to tools like WGCNA. My question is about the structure of the input file (the expression matrix) needed to run an analysis with WGCNA. I have already run the test examples that the author provides in his tutorial, which use microarray data from mouse samples, and that went well. Now I have my own expression data ready, which are RNA-Seq data, and I am trying to figure out how to arrange them properly into the input matrix. The quantification output for each sample looks like this:
Name Length EffectiveLength TPM NumReads
TRINITY_DN717_c0_g1_i2 352 103.002 222.078617 80
TRINITY_DN28_c0_g1_i2 580 331 32.696539 37.85
The Name column is the transcript name assigned automatically by the quantification tool, the TPM column holds the normalized expression values, and the other columns are auxiliary. After dropping the columns I do not need, and according to the WGCNA manual, I only need the expression values for each sample, in the following structure:
Transcript     Sample 1     Sample 2    Sample 3     Sample 15
transcript x1  32.696539    222.078617  9197.892703  1573.957379
transcript x2  37.863293    32.696539   951.276886   59.732617
transcript x3  1036.145083  37.863293   2210.677996  87.906859
Now I want to understand the logic of this matrix. First, I don't know whether I should include column and row headers in the matrix. Second, when I read the expression values of transcript x1 (row 1) across all the samples, they don't necessarily refer to the same transcript in every sample, so I would like to start by clarifying this point.
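One way to guarantee that each row refers to the same transcript in every sample is to join the per-sample tables on the Name column instead of relying on row position. Below is a minimal sketch in plain Python, under stated assumptions: each sample has its own whitespace-separated table with the header shown above, the sample names `sample_1` and `sample_2` and the helper names `read_tpm` and `build_matrix` are hypothetical, and the second sample's values are invented purely for illustration. Keeping the transcript and sample names as headers is what makes the alignment checkable.

```python
import io

# Toy stand-ins for two per-sample quantification files. The first reuses the
# two rows shown in the post; the second is invented for illustration and
# lists the transcripts in a different order on purpose.
SAMPLE_FILES = {
    "sample_1": io.StringIO(
        "Name Length EffectiveLength TPM NumReads\n"
        "TRINITY_DN717_c0_g1_i2 352 103.002 222.078617 80\n"
        "TRINITY_DN28_c0_g1_i2 580 331 32.696539 37.85\n"
    ),
    "sample_2": io.StringIO(
        "Name Length EffectiveLength TPM NumReads\n"
        "TRINITY_DN28_c0_g1_i2 580 331 37.863293 41\n"
        "TRINITY_DN717_c0_g1_i2 352 103.002 951.276886 300\n"
    ),
}


def read_tpm(handle):
    """Return {transcript name: TPM} for one whitespace-separated table."""
    header = handle.readline().split()
    name_col, tpm_col = header.index("Name"), header.index("TPM")
    tpm = {}
    for line in handle:
        fields = line.split()
        if fields:  # skip blank lines
            tpm[fields[name_col]] = float(fields[tpm_col])
    return tpm


def build_matrix(sample_files):
    """Build a transcripts x samples table keyed by transcript name."""
    per_sample = {s: read_tpm(h) for s, h in sample_files.items()}
    samples = list(per_sample)
    # Keep only transcripts present in every sample, in sorted name order.
    shared = sorted(set.intersection(*(set(t) for t in per_sample.values())))
    matrix = {name: [per_sample[s][name] for s in samples] for name in shared}
    return samples, matrix


samples, matrix = build_matrix(SAMPLE_FILES)

# Transcripts in rows, samples in columns, with both headers kept.
print("\t".join(["Transcript"] + samples))
for name, values in matrix.items():
    print("\t".join([name] + [str(v) for v in values]))

# Samples in rows, transcripts in columns (the transposed orientation).
transposed = {s: [matrix[name][i] for name in matrix]
              for i, s in enumerate(samples)}
```

Because the values are looked up by transcript name, the different row order in the second table does not matter. The transposed form is included because, as far as the tutorials show, the WGCNA functions expect samples in rows and genes or transcripts in columns; whether to filter or log-transform the TPM values first is a separate decision that this sketch does not address.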
Appreciate any help.
Thanks and regards,
Cynthia SC
P.S. The input data in the Horvath tutorial are microarray data, and I am unclear about how to compose the input for RNA-Seq data.