Question: Feature selection for differential analysis
10 months ago by
wo_li10 wrote:

Hello all,

I'm analysing single-cell RNA-seq data and trying to find genes that are differentially expressed between two treatments. My first step is to combine the two datasets from the different treatments, and the second is to compare them.

When I merge the two single-cell RNA-seq datasets, I notice that some genes are present in only one of the two original datasets, and Seurat::merge() retains them. Should I keep these unique genes? Will they affect the downstream differential expression analysis?

Thank you in advance!

modified 10 months ago by kristoffer.vittingseerup3.3k • written 10 months ago by wo_li10

You should not keep the unique genes. If there is only a small number of them, you might be OK just excluding them. If they make up a large proportion of the genes, then what @kristoffer.vittingseerup suggested below is probably wise.
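To judge whether the unique genes are "a small number" or "a huge proportion", it helps to count them. The following is a minimal sketch in plain Python, not Seurat code. The gene names and the helper `split_genes` are hypothetical; in practice the two lists would be the gene (row) names of each dataset.

```python
# Hypothetical gene lists standing in for the row names of each dataset.
genes_a = ["Actb", "Gapdh", "Cd3e", "Il2"]
genes_b = ["Actb", "Gapdh", "Cd3e", "Ifng"]


def split_genes(genes_a, genes_b):
    """Return (shared genes, genes unique to either dataset, fraction unique)."""
    set_a, set_b = set(genes_a), set(genes_b)
    shared = sorted(set_a & set_b)   # present in both datasets
    unique = sorted(set_a ^ set_b)   # present in exactly one dataset
    frac_unique = len(unique) / len(set_a | set_b)
    return shared, unique, frac_unique


shared, unique, frac_unique = split_genes(genes_a, genes_b)
print(f"{len(shared)} shared, {len(unique)} unique ({frac_unique:.0%} of all genes)")
```

If the unique fraction is small, restricting both datasets to `shared` is the simple option; if it is large, that points to a filtering difference worth fixing upstream.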

written 10 months ago by benformatics1.6k

Thank you, and I agree! It's only a small proportion, and their expression is also low in the other dataset.

written 10 months ago by wo_li10
10 months ago by
European Union
kristoffer.vittingseerup3.3k wrote:

There should not be any unique genes. If there are, it is because your filtering step(s) removed them from one of the datasets (meaning they are candidates for being highly expressed in one set and not the other). I would suggest merging the data before you start the Seurat analysis so that QC, filtering, etc. are done jointly.
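A toy illustration of why the order matters, written in plain Python rather than Seurat. The counts, the `MIN_CELLS` threshold, and the helper `detected_genes` are all hypothetical, and the sketch assumes both raw matrices share the same gene annotation before any filtering.

```python
def detected_genes(counts, min_cells):
    """Return the genes with a nonzero count in at least `min_cells` cells."""
    return {g for g, row in counts.items() if sum(c > 0 for c in row) >= min_cells}


# Hypothetical raw counts: gene -> counts per cell, one dict per treatment.
control = {"GeneA": [5, 3, 4], "GeneB": [0, 0, 0], "GeneC": [1, 0, 2]}
treated = {"GeneA": [4, 6, 2], "GeneB": [7, 9, 8], "GeneC": [0, 1, 0]}
MIN_CELLS = 2  # hypothetical threshold

# Filtering each dataset separately leaves genes that exist in only one set.
kept_control = detected_genes(control, MIN_CELLS)
kept_treated = detected_genes(treated, MIN_CELLS)
unique_after_separate = kept_control ^ kept_treated

# Merging first (concatenating cells per gene) and filtering once keeps them.
merged = {g: control[g] + treated[g] for g in control}
kept_joint = detected_genes(merged, MIN_CELLS)

print("unique after separate filtering:", sorted(unique_after_separate))
print("kept after joint filtering:", sorted(kept_joint))
```

Here GeneB is silent in the control but strongly expressed after treatment, which is exactly the kind of gene a differential analysis is looking for. Separate filtering drops it from the control set, while joint filtering keeps it for all cells.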

Please also be aware of batch effects. If you ran the single-cell data in two separate rounds, you might have a large batch effect, in which case you need to use approaches such as Seurat's data integration.

written 10 months ago by kristoffer.vittingseerup3.3k

Thank you, Kristoffer! Yes, these genes were filtered out first because they have no expression under that treatment. As you suggested, I put the datasets together to do the QC.

written 10 months ago by wo_li10
