STARsolo --EmptyDrops_CR parameters
2.8 years ago
PianoEntropy ▴ 70

I'm having some issues with the alignment of 10x scRNAseq data to a custom genome and I thought it would be good to do a stricter filtering on the cells so more cells get filtered out as empty droplets. Apparently STARsolo has several filtering algorithms, of which the EmptyDrops one is most similar to the CellRanger filtering. The manual lists 10 different parameters, but I can't make sense of what they mean and there's no further explanation.

In STARsolo, this filtering can be activated by: --soloCellFilter EmptyDrops_CR. It can be followed by 10 numeric parameters: nExpectedCells (3000), maxPercentile (0.99), maxMinRatio (10), indMin (45000), indMax (90000), umiMin (500), umiMinFracMedian (0.01), candMaxN (20000), FDR (0.01), simN (10000).
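As a rough illustration of how these ten values might be passed, here is a small Python sketch that assembles the option string. It assumes the values are given positionally (space-separated, no `name=value` pairs) in the order listed above; the helper `solo_cell_filter_args` is a hypothetical name made up for this example and is not part of STAR. Check the result against the manual for your STAR version before relying on it.

```python
# Parameter names and defaults exactly as listed in the post, in that order.
EMPTYDROPS_CR_DEFAULTS = {
    "nExpectedCells": 3000,
    "maxPercentile": 0.99,
    "maxMinRatio": 10,
    "indMin": 45000,
    "indMax": 90000,
    "umiMin": 500,
    "umiMinFracMedian": 0.01,
    "candMaxN": 20000,
    "FDR": 0.01,
    "simN": 10000,
}


def solo_cell_filter_args(**overrides):
    """Return the option tokens with all ten values in positional order.

    Hypothetical helper: it only builds strings and does not run STAR.
    """
    unknown = set(overrides) - set(EMPTYDROPS_CR_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    # Overriding existing keys keeps the original key order.
    values = {**EMPTYDROPS_CR_DEFAULTS, **overrides}
    return ["--soloCellFilter", "EmptyDrops_CR"] + [str(v) for v in values.values()]


# Example: change only umiMin (the sixth value), keep everything else at default.
print(" ".join(solo_cell_filter_args(umiMin=1000)))
```

Because the values are positional under this assumption, changing a later parameter such as umiMin means writing out all the values that precede it as well.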

Now if I just want a stricter threshold on e.g. UMIs/cell for the filtering (see picture), should I increase umiMin or also set anything else? If someone can provide a full explanation of the parameters that would also be great.

I just want to increase the cutoff.

RNA-seq alignment STAR • 2.1k views
2.8 years ago
Rob 6.5k

The EmptyDrops_CR filter is, indeed, very similar to what is done by Cell Ranger. Alex did a fantastic job reverse-engineering what Cell Ranger is doing internally. Unfortunately, Cell Ranger's defaults (and hence those inherited by solo) are not particularly well described, and it's also unclear for which range of experiments these defaults are reasonable and for which they should be changed.

Recently, as part of a much larger project, a student of mine re-re-implemented this methodology in R. This link to the pull request on the DropletUtils repository includes function documentation that describes the effects of the different parameters as we understood them when re-implementing the approach.


What should the command look like, Rob? I tried

--soloCellFilterType=EmptyDrops_CR nExpectedCells=10000

and I am getting the error

EXITING because of fatal PARAMETERS error: unrecognized option in --soloCellFilterType=EmptyDrops_CR nExpectedCells=10000

May I ask how that went? I'm having the same problem. I was working with a single-nucleus dataset of four teleost samples and thought stricter filtering would be better, but the results with --soloCellFilter EmptyDrops_CR differ greatly from my previous alignment with --soloCellFilter TopCells 25000. TopCells identified around 25,000 cells in each sample, while EmptyDrops_CR calls about 7,000 cells in one of my samples and around 25,000 in each of the other three. Should I change any of the numeric parameters for EmptyDrops_CR, or could it just be a problem with that specific sample?
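One reason the two filters can disagree is that a fixed-count filter returns the requested number of barcodes by construction, while a data-driven filter depends on the shape of the UMI distribution. The toy Python sketch below contrasts the two ideas. It is a deliberately simplified threshold rule built from the nExpectedCells, maxPercentile and maxMinRatio parameters named above, not STARsolo's or Cell Ranger's actual code, and it leaves out the EmptyDrops step that tests lower-count barcodes against an ambient profile. The function names and the toy counts are made up for illustration.

```python
def top_cells(umi_counts, n):
    """Fixed-count call: keep the n barcodes with the highest UMI counts."""
    return sorted(umi_counts, reverse=True)[:n]


def simple_threshold_cells(umi_counts, n_expected, max_percentile=0.99, max_min_ratio=10):
    """Simplified threshold call (an assumption, not the real implementation).

    Take a robust maximum near the top of the expected cells, divide it by
    max_min_ratio, and keep every barcode at or above that cutoff.
    """
    ranked = sorted(umi_counts, reverse=True)
    idx = min(len(ranked) - 1, round(n_expected * (1 - max_percentile)))
    robust_max = ranked[idx]
    cutoff = robust_max / max_min_ratio
    return [c for c in ranked if c >= cutoff]


# Toy sample: 50 barcodes with many UMIs, 200 with few.
counts = [5000] * 50 + [100] * 200

print(len(top_cells(counts, 100)))               # always returns the requested 100
print(len(simple_threshold_cells(counts, 100)))  # only the 50 high-count barcodes pass
```

In this toy case the fixed-count filter pads the result with low-count barcodes, whereas the threshold rule stops at the 50 barcodes that clear the cutoff. A sample where a data-driven filter calls far fewer cells than the others may therefore simply have a different UMI distribution, which is worth inspecting before changing parameters.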
