I have two datasets from two different but related cancer cell lines, derived from the same patient before and after relapse.
I have performed single-cell sequencing on both, with the hypothesis that I will be able to find rare cell populations in either line that are key to relapse.
To do that, I am following two protocols:
1) The Seurat tutorial on integrating stimulated and control PBMC datasets.
2) This second tutorial, which does not involve any integration steps.
However, the two approaches give different results. With 1), most cells from the two lines look similar, and only a few cells from each line are distinct. With 2), I get the opposite (and expected) result: most cells are different between the lines, with only a few in common.
Which one is the correct way to analyze these data? Do the two approaches give different results because they are doing different things, i.e. is 1) finding what the two lines have in common while 2) is finding what distinguishes them?
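
To illustrate what I suspect is happening, here is a toy sketch in plain Python. It is not Seurat and does not use my real data: the expression values are simulated, and per-sample centering is only a crude stand-in I am assuming for an integration step (Seurat's anchor-based integration is more involved). All function names are made up for the illustration. It shows that a sample-wide shift keeps two samples apart when they are simply merged, and disappears once each sample is aligned separately.

```python
import random

random.seed(0)

def simulate_sample(n_cells, n_genes, shift):
    """Hypothetical data: Gaussian noise per gene plus a sample-wide shift."""
    return [[random.gauss(0.0, 1.0) + shift for _ in range(n_genes)]
            for _ in range(n_cells)]

def centroid(cells):
    """Mean expression of each gene across all cells in a sample."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]

def centroid_distance(a, b):
    """Euclidean distance between the centroids of two samples."""
    ca, cb = centroid(a), centroid(b)
    return sum((x - y) ** 2 for x, y in zip(ca, cb)) ** 0.5

def center(cells):
    """Crude stand-in for integration: subtract the sample's own centroid."""
    c = centroid(cells)
    return [[x - m for x, m in zip(cell, c)] for cell in cells]

# "pre" and "post" differ only by a global shift in this toy setup.
pre = simulate_sample(200, 20, shift=0.0)
post = simulate_sample(200, 20, shift=2.0)

# Approach 2 analogue: merge without any alignment.
gap_merged = centroid_distance(pre, post)

# Approach 1 analogue: align each sample before comparing.
gap_centered = centroid_distance(center(pre), center(post))

print(f"centroid gap, merged only:      {gap_merged:.3f}")
print(f"centroid gap, after centering:  {gap_centered:.3e}")
```

In this toy case the merged samples stay far apart while the centered ones overlap almost exactly, which matches the pattern I see. What I cannot tell from this is whether the difference removed by integration in my real data is technical or is the biological signal I am looking for.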