I'm new to RNA-seq data analysis and I need help removing a batch effect from my RNA-seq data. My data structure is a little complex, somewhat similar to the example in section 3.5 of the edgeR manual.
In my case, I have 12 RNA samples collected from 6 subjects. Three subjects received one treatment (Treatment1) and the other three received a different treatment (Treatment2); in both groups, RNA was extracted before and after treatment. The aim is to test the effect of each treatment relative to the initial state, and the difference between treatments, while accounting for the paired structure of the data. The problem is that I have a batch effect caused by using a different RNA extraction technique (both methods gave good-quality samples, with RIN above 8). I can see the batch effect in a multidimensional scaling plot. The sample table looks like this:
Treatment   Patient  Time    Batch
Treatment1  1        before  1
Treatment1  1        after   1
Treatment1  2        before  1
Treatment1  2        after   1
Treatment1  3        before  1
Treatment1  3        after   1
Treatment2  4        before  1
Treatment2  4        after   2
Treatment2  5        before  1
Treatment2  5        after   2
Treatment2  6        before  1
Treatment2  6        after   2
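The problem is that my batch effect is completely unbalanced: batch 2 contains only the Treatment2 after-treatment samples, and all of them. The batch is therefore confounded with the Treatment2 effect, and as far as I can tell it is impossible to apply any standard correction in that case.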
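To make the confounding concrete, here is a minimal sketch in plain Python (not edgeR; the column coding and the helper names `design_row` and `matrix_rank` are my own assumptions for illustration). It builds a paired design with one indicator per patient, a common after-vs-before term and a Treatment2-specific after term, and then checks whether adding a batch column increases the rank of the design matrix.

```python
from fractions import Fraction

# Sample table from the post: (treatment, patient, time, batch)
samples = [
    ("Treatment1", 1, "before", 1), ("Treatment1", 1, "after", 1),
    ("Treatment1", 2, "before", 1), ("Treatment1", 2, "after", 1),
    ("Treatment1", 3, "before", 1), ("Treatment1", 3, "after", 1),
    ("Treatment2", 4, "before", 1), ("Treatment2", 4, "after", 2),
    ("Treatment2", 5, "before", 1), ("Treatment2", 5, "after", 2),
    ("Treatment2", 6, "before", 1), ("Treatment2", 6, "after", 2),
]

def design_row(treatment, patient, time, batch, include_batch):
    """One row of the design matrix (hypothetical coding, for illustration)."""
    row = [1 if patient == p else 0 for p in range(1, 7)]   # patient blocking terms
    row.append(1 if time == "after" else 0)                 # common after-vs-before effect
    row.append(1 if (treatment == "Treatment2" and time == "after") else 0)  # extra effect of Treatment2
    if include_batch:
        row.append(1 if batch == 2 else 0)                  # extraction-method batch
    return row

def matrix_rank(rows):
    """Rank by exact Gaussian elimination (Fractions avoid rounding issues)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it depends on earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][c] != 0:
                f = m[r][c] / m[rank][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

X_no_batch = [design_row(*s, include_batch=False) for s in samples]
X_batch = [design_row(*s, include_batch=True) for s in samples]

rank_no_batch = matrix_rank(X_no_batch)
rank_batch = matrix_rank(X_batch)
print("columns without batch:", len(X_no_batch[0]), "rank:", rank_no_batch)
print("columns with batch:   ", len(X_batch[0]), "rank:", rank_batch)
```

The batch column is identical to the Treatment2-specific after column, so adding it gives one more column but no increase in rank. In other words, the batch coefficient and the Treatment2 coefficient cannot be estimated separately from these 12 samples alone.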
Apart from that, I have one sample that was extracted and sequenced with both extraction methods (a Treatment1 after-treatment sample). Although it comes from the other treatment, if I assume that the batch effect is the same regardless of treatment, I would like to ask for methods or recommendations to correct my Treatment2 samples based on the differences found in this sample obtained with both extraction methods.
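To show what I have in mind, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the counts are made-up toy numbers, the helpers `log_cpm`, `estimate_batch_offset` and `apply_offset` are hypothetical names (not edgeR or limma functions), and the log-CPM is a simple pseudo-count version rather than edgeR's. It estimates a per-gene offset on the log2 scale from the one sample processed with both methods and subtracts it from a batch 2 sample. With a single dual-extracted sample the offset cannot be separated from technical noise, which is part of why I am asking whether this is reasonable at all.

```python
import math

def log_cpm(counts, prior=0.5):
    """Simple log2 counts-per-million with a pseudo-count (not edgeR's cpm())."""
    lib_size = sum(counts) + prior * len(counts)
    return [math.log2((c + prior) / lib_size * 1e6) for c in counts]

def estimate_batch_offset(counts_method1, counts_method2):
    """Per-gene log2 difference (method 2 minus method 1) for the SAME sample."""
    l1, l2 = log_cpm(counts_method1), log_cpm(counts_method2)
    return [b - a for a, b in zip(l1, l2)]

def apply_offset(counts_batch2, offset):
    """Subtract the estimated offset from the log-CPM of a batch 2 sample."""
    return [x - o for x, o in zip(log_cpm(counts_batch2), offset)]

# Toy counts for 4 genes (made up; real data would have thousands of genes).
dual_method1 = [100, 200, 300, 400]   # Treatment1-after sample, extraction method 1
dual_method2 = [200, 200, 300, 300]   # same sample, extraction method 2
t2_after = [180, 150, 350, 320]       # a Treatment2-after sample (batch 2)

offset = estimate_batch_offset(dual_method1, dual_method2)
corrected = apply_offset(t2_after, offset)

for gene, (o, c) in enumerate(zip(offset, corrected), start=1):
    print(f"gene{gene}: offset={o:+.3f}  corrected logCPM={c:.3f}")
```

The corrected values are on the log-CPM scale, so at best they would be usable for plots such as MDS, not as input counts for edgeR. The whole approach also rests on the assumption that the extraction effect is additive on the log scale and identical for both treatments.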
Thanks in advance.