Here is my question; your input is much appreciated. The experiment has 3 groups (control, treated 1, and treated 2), and each group has only one sample. Treated 1 and treated 2 have library sizes of 18 and 20 million reads, respectively, while the control has a library size of 70 million reads. These library sizes are raw read counts. In addition, the control sample is from a different sequencing run. Sequencing is Illumina and the organism is human. I have the following questions:
a) Can I use a control from a different run?
b) Since the control has roughly 3.5 to 4 times as many reads as the treated samples (70 million vs. 18 and 20 million), how should I normalize the reads?
c) Do TMM or rlog account for such a large discrepancy in library size?
d) Do I have to include batch in my design matrix (in edgeR)?
Any help is appreciated. Thanks.