How to manage count-depth in single cell data to correctly analyze biological information?
3.8 years ago
Firingam ▴ 30

I have to normalize a single-cell RNA-seq (scRNA-seq) dataset with Bioconductor, and I decided to use SCnorm. Before applying it, I need to investigate a few details, in particular the count-depth relationship. To do this, I want to use plotCountDepth. This function provides a set of filters that I want to set, but I'm not sure about their biological significance. The main issue is FilterExpression, which excludes a gene if the median of its expression distribution is below a given threshold. What is a biologically sound way to choose this threshold?

RNA-Seq R genome • 1.0k views

Completely unclear (at least to me) what the issue is. Please try to explain better.


If you have a single-cell data frame, how would you choose which genes (rows) to remove based on their medians? To explain better: you have an n × m matrix where the n rows are genes and the m columns are cells. I want to explore the count-depth relationship in order to choose the normalization procedure afterwards. Genes whose median is below a certain threshold are excluded from the analysis (the median is taken over the gene's expression distribution, where each value is the gene's expression in one cell). I'm uncertain how to choose this threshold. I'll leave you the link to plotCountDepth so you can check the FilterExpression field.
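To make the filter concrete, here is a minimal sketch of a median-based gene filter on a toy matrix. It is plain Python, not SCnorm's code: the function name, the `nonzero_only` switch, and the toy counts are all made up for illustration. Whether the median should be taken over all cells or only over non-zero counts is an assumption you should check against the plotCountDepth documentation.

```python
from statistics import median

def genes_passing_median_filter(counts, threshold, nonzero_only=False):
    """Return the genes whose median expression is >= threshold.

    counts: dict mapping gene name -> list of per-cell counts
            (one row of the genes x cells matrix).
    nonzero_only: if True, take the median over non-zero counts only
                  (hypothetical switch, see the note above).
    """
    kept = []
    for gene, row in counts.items():
        values = [v for v in row if v > 0] if nonzero_only else list(row)
        # A gene with no usable values cannot pass the filter.
        if values and median(values) >= threshold:
            kept.append(gene)
    return kept

# Toy matrix: 3 genes x 5 cells (invented numbers).
toy_counts = {
    "geneA": [0, 0, 0, 1, 0],
    "geneB": [5, 3, 0, 8, 4],
    "geneC": [0, 0, 12, 0, 9],
}

print(genes_passing_median_filter(toy_counts, threshold=2))
print(genes_passing_median_filter(toy_counts, threshold=2, nonzero_only=True))
```

The two calls give different gene sets for the same threshold, which is why the threshold cannot be chosen without first knowing which median is meant.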


So to clarify, you're trying to determine a threshold for removing genes that are not expressed? In general, this is done by removing genes expressed in very few cells (say < 10 or even < 3 if you have few cells and think you may have rare populations). It's rather arbitrary, but removes most of the genes that don't provide any useful info or have much of a biological impact.

I have not seen folks filter on actual expression levels, though I guess you could rank markers by median/average expression to help identify those with more "robust" changes.
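As a rough illustration of both ideas, here is a small stdlib-only Python sketch on a toy matrix: drop genes detected in too few cells, then rank the remaining genes by median expression. The function names, the `min_cells` default, and the counts are invented for the example; this is not taken from any particular package.

```python
from statistics import median

def filter_by_expressing_cells(counts, min_cells=3):
    """Keep genes with a non-zero count in at least min_cells cells."""
    return {
        gene: row
        for gene, row in counts.items()
        if sum(1 for v in row if v > 0) >= min_cells
    }

def rank_by_median(counts):
    """Return gene names sorted from highest to lowest median expression."""
    return sorted(counts, key=lambda gene: median(counts[gene]), reverse=True)

# Toy matrix: 4 genes x 5 cells (invented numbers).
toy_counts = {
    "geneA": [0, 0, 0, 1, 0],   # detected in 1 cell
    "geneB": [5, 3, 0, 8, 4],   # detected in 4 cells
    "geneC": [0, 0, 12, 0, 9],  # detected in 2 cells
    "geneD": [2, 2, 1, 0, 3],   # detected in 4 cells
}

kept = filter_by_expressing_cells(toy_counts, min_cells=3)
print(sorted(kept))
print(rank_by_median(kept))
```

On a real dataset, `min_cells` would be something like the 3 or 10 mentioned above, chosen with the total number of cells and the expected rarity of populations in mind.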


I applied another strategy based on the batch and the cell types. I found an article that explains the ratio and the tissue from which the cells were taken. I normalized and filtered by tissue, on the assumption that this preserves the biological identity.
