Question: Metabolomics log-transform or standardize for WGCNA
jane.toka wrote, 3.4 years ago:

Hi all,

I have targeted metabolomics data that have been normalized using the loess method. Is further pre-processing needed, namely standardization (dividing each metabolite by its standard deviation) or log transformation of each metabolite? Some of the values are very large, and the distributions look slightly skewed for some metabolites.

I want to run a network analysis using WGCNA (weighted gene co-expression network analysis), which is based on computing pairwise correlations. I'm therefore wondering whether it is important to standardize or log-transform the data, or to apply another pre-processing approach, before starting the analysis of the metabolites.

Thanks in advance for your help.

Best, Jane

Kevin Blighe wrote, 22 months ago:

Hi Jane,

Apologies that no-one had answered. I have just been working on metabolomics and network analysis during my postdoc in Boston.

You may want to take a look at my recent answer here: "The RNA-Seq data input for WGCNA in terms of gene co-expression network construction"

Your post popped up on the right as a 'similar post'.

From my experience with metabolomics, specifically, note the following processing steps (from the raw metabolite levels):

1) Remove metabolites if:

  • Level in QC samples has coefficient of variation (CoV) > 25%
  • Missingness > 10% across test samples
  • No variability across test samples based on interquartile range (IQR)

Also remove samples with metabolite missingness > 10%.
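
The filters in step 1 could be sketched roughly as below. This is only a minimal illustration in plain Python, not the code I actually ran: the toy metabolite names and values are made up, missing values are represented as `None`, QC samples are assumed to be complete with positive levels, and "no variability" is taken to mean an IQR of exactly 0.

```python
import statistics

def cov_percent(values):
    """Coefficient of variation in percent (sample SD / mean * 100)."""
    return statistics.stdev(values) / statistics.fmean(values) * 100.0

def missing_fraction(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def iqr(values):
    """Interquartile range of a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def keep_metabolite(qc_levels, test_levels, max_cov=25.0, max_missing=0.10):
    """Apply the three metabolite-level filters from step 1."""
    observed = [v for v in test_levels if v is not None]
    if cov_percent(qc_levels) > max_cov:             # unstable in QC samples
        return False
    if missing_fraction(test_levels) > max_missing:  # too many NAs
        return False
    if len(observed) < 2 or iqr(observed) == 0:      # no variability
        return False
    return True

def samples_to_keep(test, max_missing=0.10):
    """Indices of samples whose metabolite missingness is within the limit."""
    names = list(test)
    n_samples = len(test[names[0]])
    return [
        i for i in range(n_samples)
        if sum(test[m][i] is None for m in names) / len(names) <= max_missing
    ]

# Toy data (hypothetical): metabolite -> levels in QC and in test samples.
qc = {
    "alanine": [100, 102, 98, 101],
    "noisy": [50, 120, 80, 10],
    "flat": [5, 5, 5, 5],
    "sparse": [10, 11, 10, 9],
}
test = {
    "alanine": [90, 110, 130, 95, 150, 105, 99, 120, 101, 140],
    "noisy": [60, 70, 65, 80, 75, 90, 85, 72, 68, 77],
    "flat": [5] * 10,
    "sparse": [10, None, None, 12, 9, 11, 13, 10, 12, 9],
}

kept = [m for m in test if keep_metabolite(qc[m], test[m])]
kept_samples = samples_to_keep(test)
print(kept, kept_samples)
```

Whether you drop samples before or after dropping metabolites changes the missingness fractions, so it is worth deciding on an order and sticking to it; here the sample check is simply run on the unfiltered table.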

2) Filter out unidentified/unknown metabolites and those classified as xenobiotic chemicals

3) Convert NA values to 0

[You can also impute NAs with half the lowest value in the dataset]
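
For illustration, the two NA options in step 3 might look like this in plain Python. It is a sketch under assumptions: the table layout (metabolite name mapped to a list of levels, with `None` for NA) and the function names are hypothetical, and "lowest value in the dataset" is read as the minimum over the whole table, not per metabolite.

```python
def impute_zero(table):
    """Replace every missing value (None) with 0."""
    return {name: [0.0 if v is None else v for v in row]
            for name, row in table.items()}

def impute_half_min(table):
    """Replace every missing value with half of the smallest observed value
    anywhere in the table."""
    observed = [v for row in table.values() for v in row if v is not None]
    fill = min(observed) / 2.0
    return {name: [fill if v is None else v for v in row]
            for name, row in table.items()}

# Toy table (hypothetical values).
table = {"a": [4.0, None, 10.0], "b": [None, 6.0, 8.0]}

zero_filled = impute_zero(table)
half_min_filled = impute_half_min(table)
print(zero_filled)
print(half_min_filled)
```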

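After that, you could log-transform the data or convert it to the Z scale. If you log-transform after setting NAs to 0, add a small offset first (e.g. log(x + 1)), because log(0) is undefined. WGCNA will accept unlogged data, too. At the end of the day, WGCNA is based on correlation.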
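
To make the "based on correlation" point concrete, here is a small plain-Python sketch with made-up values for two metabolites. It assumes Pearson correlation, and the `log2(x + 1)` offset is just one common choice. Z-scoring each metabolite leaves the Pearson correlation unchanged, while a log transform does change it, because it shrinks the influence of very large values.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def zscore(x):
    """Centre to mean 0 and scale to sample SD 1."""
    m, s = statistics.fmean(x), statistics.stdev(x)
    return [(v - m) / s for v in x]

def log2p1(x):
    """log2(x + 1); the +1 offset keeps zeros finite."""
    return [math.log2(v + 1.0) for v in x]

# Hypothetical levels for two metabolites across five samples,
# with one sample carrying very large values.
a = [120.0, 340.0, 560.0, 980.0, 15000.0]
b = [30.0, 45.0, 80.0, 70.0, 900.0]

r_raw = pearson(a, b)
r_z = pearson(zscore(a), zscore(b))
r_log = pearson(log2p1(a), log2p1(b))
print(r_raw, r_z, r_log)
```

So for skewed data with huge values, the log transform is the step that actually alters what the network sees; Z-scaling alone will not change a Pearson-based network.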

Kevin

theobroma22 wrote, 22 months ago:

If you can access the mzData files, the XCMS package on Bioconductor is very handy and thorough! It will annotate your fragments, or you could plug them into WGCNA, but I've never done it this way. In XCMS, pick out the highest peak in each peak group and use it as the representative peak for that metabolite.
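
The "highest peak per peak group" idea could be sketched like this. To be clear, this is not XCMS code and does not use its API: it assumes you have already exported a peak table, and the field names (`group`, `mz`, `intensity`) and values are hypothetical.

```python
def representative_peaks(peaks):
    """Return, for each peak group, the peak with the highest intensity."""
    best = {}
    for peak in peaks:
        group = peak["group"]
        if group not in best or peak["intensity"] > best[group]["intensity"]:
            best[group] = peak
    return best

# Hypothetical exported peak table: one dict per detected peak.
peaks = [
    {"group": "G1", "mz": 90.055, "intensity": 1200.0},
    {"group": "G1", "mz": 91.058, "intensity": 150.0},
    {"group": "G2", "mz": 147.076, "intensity": 800.0},
    {"group": "G2", "mz": 130.050, "intensity": 2300.0},
]

reps = representative_peaks(peaks)
print({group: peak["mz"] for group, peak in reps.items()})
```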
