I would like to investigate the relationships between genes in RNA-Seq data, i.e. build gene networks.

Can I use a **linear** regression model after log-transforming the data, i.e. log(read_count+1)?
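For concreteness, here is a minimal sketch of the kind of fit being asked about, assuming NumPy and two simulated genes. The names `gene_a` and `gene_b`, the sample size, and the power-law link between them are all made up for illustration and do not come from real data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical read counts for two genes across 50 samples (simulated, not real data).
n_samples = 50
gene_a = rng.poisson(lam=rng.uniform(5, 500, size=n_samples))
# Assumption for illustration: gene_b follows a power law of gene_a, plus count noise.
gene_b = rng.poisson(lam=2.0 * (gene_a + 1) ** 0.8)

# The transformation from the question: log(read_count + 1).
x = np.log(gene_a + 1)
y = np.log(gene_b + 1)

# Ordinary least-squares line on the log scale.
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (slope * x + intercept)
r = np.corrcoef(x, y)[0, 1]

print(f"slope={slope:.2f}, intercept={intercept:.2f}, Pearson r={r:.2f}")
```

Plotting `residuals` against `x` is one simple way to see whether a straight line is a reasonable description on the log scale; a visible curve or funnel shape would suggest it is not.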

Actually, I have done it and the results appear meaningful, but I have doubts about the process.

I would appreciate any thoughts.

Agreed, but note that when you talk about exponential gene expression, it's really an exponential read-count relationship. We don't know the real relationship between read count and expression. Some people think log(reads) is already proportional to expression.

Yes, I think so as well (that log(reads) is proportional to expression). This is because the original RNA molecules are amplified exponentially (PCR) before they are measured by microarray, RNA-Seq, TaqMan, or any other assay that measures PCR-amplified RNA.

I think the log-transformed count does maintain the original proportion, no? However, for the purpose of the linear model, would the log-transformed count introduce non-linearity?

Would log-transformed values violate the linearity assumption required for linear regression? Are the independent and dependent variables still linearly correlated after transformation?
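One way to think about this: the linearity assumption applies to the scale on which the model is fitted, so a log transformation changes which relationships count as linear. A multiplicative (power-law) relationship becomes a straight line after taking logs, whereas a relationship that is a straight line with a non-zero intercept on the raw scale becomes curved. The sketch below illustrates the first case with noise-free, made-up values; the power law `y = 3 * x**2` is an assumption chosen purely for illustration.

```python
import numpy as np

# Hypothetical noise-free values following a power law y = 3 * x**2 (illustrative only).
x = np.logspace(0, 3, num=20)  # 20 values from 1 to 1000
y = 3.0 * x ** 2

# Pearson correlation on the raw scale and on the log scale.
r_raw = np.corrcoef(x, y)[0, 1]
r_log = np.corrcoef(np.log(x), np.log(y))[0, 1]

# On the log scale the relationship is log(y) = log(3) + 2 * log(x),
# so a straight-line fit recovers the exponent as the slope.
slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)

print(f"r_raw={r_raw:.3f}, r_log={r_log:.3f}, slope={slope:.3f}")
```

Whether real read counts behave like this is an empirical question, so inspecting a scatter plot or a residual plot of the transformed values is a reasonable check before trusting the fit.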