I want to understand whether gene length normalization affects causal discovery methods. Suppose I have raw gene counts and normalize them in two ways: (1) by TPM, which also normalizes by gene length, and (2) with DESeq2 size factors, which do not account for gene length. This gives me two differently normalized count matrices. I then log2-transform both and apply a causal discovery method, the CIT (Millstein, Joshua, et al. "Disentangling molecular relationships with a causal inference test." BMC genetics 10.1 (2009): 1-15.).
- Why do I see differences in the causal relationships identified? Does scaling have an effect on causal discovery methods?
- How do I interpret the differences theoretically?
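
To make the comparison concrete, here is a minimal sketch of the two normalizations described above, run on synthetic counts. Everything in it is an assumption for illustration: the data are simulated, the function names `tpm` and `deseq2_style_size_factors` are hypothetical, the size factors are a plain NumPy re-implementation of the median-of-ratios idea rather than a call into DESeq2 itself, and the counts are kept strictly positive so that log2 can be taken without a pseudocount.

```python
import numpy as np


def tpm(counts, lengths_kb):
    """TPM: divide by gene length, then rescale each sample to sum to 1e6."""
    rate = counts / lengths_kb[:, None]
    return rate / rate.sum(axis=0, keepdims=True) * 1e6


def deseq2_style_size_factors(counts):
    """Median-of-ratios size factors (sketch; assumes all counts > 0)."""
    log_counts = np.log(counts)
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)  # per-gene reference
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))


# Synthetic data (assumption): genes x samples, longer genes get more reads.
rng = np.random.default_rng(0)
n_genes, n_samples = 200, 12
lengths_kb = rng.uniform(0.5, 10.0, size=n_genes)
counts = rng.poisson(lam=50 * lengths_kb[:, None], size=(n_genes, n_samples)) + 1

log_tpm = np.log2(tpm(counts, lengths_kb))
log_deseq = np.log2(counts / deseq2_style_size_factors(counts))

# Difference between the two log-scale matrices.
diff = log_tpm - log_deseq

# Double-centering removes any additive per-gene and per-sample terms.
resid = (
    diff
    - diff.mean(axis=1, keepdims=True)
    - diff.mean(axis=0, keepdims=True)
    + diff.mean()
)
print("max |residual|:", np.abs(resid).max())
```

Under these assumptions, the two log2 matrices differ only by a per-gene constant (the gene length term) plus a per-sample constant (the difference between the two sample scaling factors), which is why the double-centered residual is numerically zero. The per-gene constant does not change how a gene varies across samples, whereas the per-sample term does. If a pseudocount is used instead, e.g. log2(x + 1), this additive decomposition no longer holds exactly.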