I am analysing a public scRNA-seq dataset to look at differential gene expression between two clusters.
The basic sample information is:
tissue_donor_1_treatment
tissue_donor_2_treatment
tissue_donor_1_control
tissue_donor_1_control
All of them were produced under the same sequencing conditions.
I want to divide them into two groups: treatment and control.
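To make the grouping concrete, here is a minimal sketch in plain Python (not Seurat code) of how I mean to label the samples. The function names `parse_sample` and `group_by_condition` are made up for illustration, and the sketch assumes every sample name follows the `tissue_donor_<n>_<condition>` pattern shown in the list.

```python
from collections import defaultdict

# Sample names copied verbatim from the list in this post
# (the last two entries are identical there and are kept as listed).
SAMPLES = [
    "tissue_donor_1_treatment",
    "tissue_donor_2_treatment",
    "tissue_donor_1_control",
    "tissue_donor_1_control",
]


def parse_sample(name):
    """Split 'tissue_donor_<n>_<condition>' into donor and condition fields.

    Assumes exactly four underscore-separated parts; raises ValueError otherwise.
    """
    _tissue, donor_tag, donor_id, condition = name.split("_")
    return {
        "sample": name,
        "donor": f"{donor_tag}_{donor_id}",
        "condition": condition,
    }


def group_by_condition(names):
    """Collect sample names under their condition label, in input order."""
    groups = defaultdict(list)
    for name in names:
        meta = parse_sample(name)
        groups[meta["condition"]].append(meta["sample"])
    return dict(groups)


groups = group_by_condition(SAMPLES)
for condition, members in groups.items():
    print(condition, members)
```

Keeping the donor as its own field, next to the condition, means the per-cell metadata could carry both labels, whichever of merging or integration ends up being used.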
Following the Seurat v3 tutorial (https://satijalab.org/seurat/archive/v3.1/immune_alignment.html), I ran a similar analysis and got a result.
My first question: given my original purpose (differential gene expression between treatment and control), do I need to run integration to remove batch effects?
I did run it and got some results.
But I also simply merged the samples, skipped the integration step, and ran the same analysis; the resulting clusters were very different from those produced with integration.
My second question: if the choice between merging and integrating depends on the purpose, could somebody suggest a better merge or integration approach for looking at differential gene expression?
For example, at the merge step I merged all samples in one step and then followed the basic analysis workflow. Should I instead merge them by group?
I know there are many detailed tutorials, but I really hope somebody can help me with these questions.
Thank you in advance.
Anyone here?
This is not an on-demand service. People will respond if they feel like answering; it's not appropriate to post comments like that 2h after posting a question.