Hello all,
To normalize my read-count data I used two different approaches:
1) Normalized the counts with DESeq2 and then log2-transformed them.
2) Only log2-transformed the raw counts (without DESeq2 normalization).
I realized that the outputs are pretty much the same! Can anyone tell me why that is? If the output is the same, why do we use DESeq2 to normalize at all? Why not just do the log2 transformation?
This is part of my read count data:
                    WT1  WT2  WT3 ACA1 ACA2 ACA3
ENSMUSG00000022857   61   27   54  733  238  332
ENSMUSG00000094478    1  321    0    0    2    0
ENSMUSG00000096410 1225 1319  648  126   32  119
1) I normalized them using DESeq2 and then transformed to log2:
my script:
library(DESeq2)

# Build the dataset; the first column of read.count holds the gene IDs.
cds <- DESeqDataSetFromMatrix(read.count[,-1], colData, design = ~group)
dds <- estimateSizeFactors(cds)                 # median-of-ratios size factors
normalized_df <- counts(dds, normalized=TRUE)   # raw counts / size factor
normalized_df.log <- log2(normalized_df + 1)    # log2 with a pseudocount of 1
This is part of the output after normalizing with DESeq2 and transforming to log2:
                          WT1       WT2      WT3     ACA1     ACA2     ACA3
ENSMUSG00000022857  5.9533944  4.821842 5.792608 9.524640 7.902013 8.380811
ENSMUSG00000094478  0.9995925  8.345891 0.000000 0.000000 1.585730 0.000000
ENSMUSG00000096410 10.2589289 10.381332 9.353513 6.993656 5.045510 6.908315
2) This is the result after only the log2 transformation (without DESeq2 normalization):
                         WT1       WT2      WT3     ACA1     ACA2     ACA3
ENSMUSG00000022857  5.954196  4.807355 5.781360 9.519636 7.900867 8.379378
ENSMUSG00000094478  1.000000  8.330917 0.000000 0.000000 1.584963 0.000000
ENSMUSG00000096410 10.259743 10.366322 9.342075 6.988685 5.044394 6.906891
Many thanks!

DESeq2 normalizes for library depth. If your samples were all sequenced to a similar depth, normalization may not have a pronounced effect. Could that be the case here?
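One quick check is to compare total counts per sample; if the totals are similar, the size factors will be near one. A minimal base-R sketch, using only the three genes shown in the question as stand-in data (run it on the full matrix in practice):

```r
# Stand-in matrix built from the three genes shown in the question.
counts <- matrix(c(  61,   27,  54, 733, 238, 332,
                      1,  321,   0,   0,   2,   0,
                   1225, 1319, 648, 126,  32, 119),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("ENSMUSG00000022857",
                                   "ENSMUSG00000094478",
                                   "ENSMUSG00000096410"),
                                 c("WT1", "WT2", "WT3",
                                   "ACA1", "ACA2", "ACA3")))

lib.sizes <- colSums(counts)           # total reads per sample
round(lib.sizes / mean(lib.sizes), 2)  # relative depth; ~1 means similar depth
```

With only three genes these ratios are not meaningful; the point is to apply `colSums()` to the complete count matrix.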
No, they are from different libraries: the total read count differs between samples.
It's not CPM normalization, where each count is divided by the sample's reads per million; DESeq2 normalizes with size factors estimated by the median-of-ratios method. If the libraries are similar in depth and composition, the size factors will all be close to one, and the normalized counts will barely differ from the raw counts. You can obtain each sample's size factor by dividing the raw counts by the normalized counts (or by calling `sizeFactors(dds)`).
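The size-factor estimate can be sketched in base R: DESeq2's default estimator takes, per sample, the median ratio of that sample's counts to a per-gene geometric-mean reference, skipping genes that contain any zero. Using the three genes from the question as stand-in data:

```r
counts <- matrix(c(  61,   27,  54, 733, 238, 332,
                      1,  321,   0,   0,   2,   0,
                   1225, 1319, 648, 126,  32, 119),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("ENSMUSG00000022857",
                                   "ENSMUSG00000094478",
                                   "ENSMUSG00000096410"),
                                 c("WT1", "WT2", "WT3",
                                   "ACA1", "ACA2", "ACA3")))

# Log geometric mean per gene; -Inf flags genes with a zero count,
# which are excluded from the reference.
log.geo.means <- rowMeans(log(counts))
keep <- is.finite(log.geo.means)

# Size factor per sample: median ratio to the reference, on the log scale.
size.factors <- apply(counts, 2, function(s)
  exp(median(log(s[keep]) - log.geo.means[keep])))
round(size.factors, 3)
```

Size factors near 1 mean `counts(dds, normalized=TRUE)` will barely differ from the raw counts, which would explain the near-identical log2 outputs in the question.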
Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes formatted text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.
Why exactly do you want to normalize counts by log2?
Can you also print the counts after DESeq2 normalization?
This is my data after normalization using DESeq2:
As stated below, the counts after normalization look similar to the original counts because the estimated size factors are close to one.
Also, I normally prefer VST-transformed data (DESeq2's `vst()`) for PCA or any other downstream processing (like WGCNA).
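A sketch of that workflow (requires the DESeq2 Bioconductor package; simulated negative-binomial counts stand in for a real matrix, since `vst()` needs a reasonably large number of genes for its dispersion fit):

```r
library(DESeq2)

# Simulated counts standing in for a real matrix.
set.seed(1)
n.genes <- 2000
counts <- matrix(rnbinom(n.genes * 6, mu = 100, size = 2),
                 nrow = n.genes,
                 dimnames = list(paste0("gene", seq_len(n.genes)),
                                 c("WT1", "WT2", "WT3",
                                   "ACA1", "ACA2", "ACA3")))
coldata <- data.frame(group = factor(rep(c("WT", "ACA"), each = 3)),
                      row.names = colnames(counts))

dds <- DESeqDataSetFromMatrix(counts, coldata, design = ~ group)
vsd <- vst(dds, blind = TRUE)   # variance-stabilizing transformation
mat <- assay(vsd)               # transformed matrix: input for prcomp()/WGCNA
plotPCA(vsd, intgroup = "group")
```

`blind = TRUE` ignores the design when estimating the transformation, which is the usual choice for unsupervised steps like PCA or clustering.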