I'm working on an RNA-Seq dataset and would like to use the TMM normalization from the edgeR package to normalize the data. I have read the manual and also the paper here.
I have two questions regarding the TMM normalization.
First, in our data we are mostly interested in specific regions of the chromosomes. For that reason, we extracted these regions from the complete mapped BAM files using samtools. Does it make a difference for the TMM normalization if I take only the extracted regions into account when normalizing the data, rather than the whole library?
I know that the values I get at the end will differ, since I have different numbers of reads mapped to the regions of interest. But overall, can I apply the TMM normalization to the extracted subset of the data alone?
Second, can someone please explain the main difference between the scaling method of normalization and normalization by library size?
I don't think I really got it from the paper.
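
To make the second question concrete, here is a minimal sketch of the distinction as I currently understand it. It is plain Python, not edgeR's actual code: the function names are made up for illustration, and the trimmed-mean step is heavily simplified (no precision weights, no trimming on average expression), so please correct me if the idea itself is wrong.

```python
import math

def library_size_factors(counts):
    """Library-size normalization: one factor per sample, equal to its
    total read count relative to the mean total across samples."""
    totals = [sum(sample) for sample in counts]
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]

def trimmed_mean_factor(sample, reference, trim=0.3):
    """Simplified TMM-like scaling factor (hypothetical helper).

    Computes per-gene log2 ratios of library-size-scaled counts against a
    reference sample, drops the most extreme ratios at both ends, and
    returns 2 ** (mean of the remaining log ratios).
    """
    n_s, n_r = sum(sample), sum(reference)
    log_ratios = sorted(
        math.log2((s / n_s) / (r / n_r))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0  # log ratio is undefined for zero counts
    )
    k = int(len(log_ratios) * trim)  # number of genes trimmed per tail
    kept = log_ratios[k:len(log_ratios) - k] if k > 0 else log_ratios
    return 2 ** (sum(kept) / len(kept))

# Toy data: ten genes, identical in both samples except one gene that is
# much more highly expressed in `sample` and doubles its library size.
reference = [100] * 10
sample = [100] * 9 + [1100]

print(library_size_factors([reference, sample]))
print(trimmed_mean_factor(sample, reference))
```

In this toy case, dividing by library size alone makes the nine unchanged genes look half as expressed in `sample`, while the trimmed-mean factor comes out at about 0.5, so the effective library size (2000 × 0.5) matches the reference and those nine genes come out equal. Is that the right intuition for what the scaling method adds?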