Hello everyone! I am a beginner in RNA-seq data analysis. In my recent work I have run into a problem: my RNA-seq data is significantly correlated with the samples' RIN (RNA integrity number), and I want to remove this effect. Some papers indicate that one can use a linear model to adjust for this technical covariate, but I do not know how to do it. Can anybody help me with this? Or are there other methods that can solve this problem? Thank you very much!
You may want to look at the supplements for the GTEx papers. They did a lot of work on this. It also depends a lot on the protocol. If the library is poly-A selected, you will get biases against short genes because you will lose the 5' end. If it is rRNA-depleted or hybrid capture, the bias will be different, so make sure the model you are looking at is applicable to your data. GTEx, I believe, mostly used poly-A kits.
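To make the linear-model idea from the question concrete, here is a minimal sketch, not the GTEx procedure. It assumes you already have a normalized, log-transformed expression matrix (genes x samples, for example log2(CPM + 1)) and one RIN value per sample. The function name `regress_out_covariate` and the simulated data are made up for illustration. For each gene it fits expression ~ intercept + RIN by ordinary least squares and subtracts only the RIN term, so each gene keeps its mean level.

```python
import numpy as np


def regress_out_covariate(log_expr, covariate):
    """Remove the linear effect of one covariate from every gene.

    log_expr  : array, shape (n_genes, n_samples), log-scale expression
    covariate : array, shape (n_samples,), e.g. RIN per sample
    (hypothetical helper, written for this example only)
    """
    log_expr = np.asarray(log_expr, dtype=float)
    covariate = np.asarray(covariate, dtype=float)

    # Center the covariate so that removing its term preserves gene means.
    centered = covariate - covariate.mean()

    # Design matrix with an intercept column and the centered covariate.
    design = np.column_stack([np.ones_like(centered), centered])

    # Fit all genes at once: design @ coef ~ log_expr.T
    coef, *_ = np.linalg.lstsq(design, log_expr.T, rcond=None)
    slopes = coef[1]  # one RIN slope per gene

    # Subtract only the fitted covariate contribution.
    return log_expr - np.outer(slopes, centered)


# Simulated example data (assumption: 100 genes, 20 samples).
rng = np.random.default_rng(0)
rin = rng.uniform(5, 10, size=20)
expr = rng.normal(8, 1, size=(100, 20)) + 0.5 * (rin - rin.mean())

adjusted = regress_out_covariate(expr, rin)
```

After this step the adjusted values are linearly uncorrelated with RIN, which is useful for clustering, PCA, or plotting. If the goal is differential expression, it is usually preferable to include RIN as a covariate in the model design rather than testing on pre-adjusted values, and if RIN is confounded with the condition of interest, regressing it out can also remove real biological signal.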