Random seed in scanpy
3 months ago
bioinfo ▴ 160

Hello,

I am working with Scanpy to analyze some single-cell RNA-seq data. I was wondering whether I should set random.seed(0) at the beginning of my Jupyter notebook. Would that keep the results reproducible? Would it cause any issues?

Thank you

scanpy scRNAseq single-cell • 864 views

It looks like a random seed is used in https://scanpy.readthedocs.io/en/stable/generated/scanpy.tl.umap.html and https://scanpy.readthedocs.io/en/stable/generated/scanpy.pp.sample.html#scanpy.pp.sample. You may need to set the seed for each tool, though, since each one seems to take it as its own option.
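
A minimal sketch of the per-tool pattern, using only the Python standard library. `toy_embedding` and its `random_state` argument are hypothetical stand-ins for a tool that accepts its own seed; this is not Scanpy code. It illustrates that a function which builds a private generator from its seed argument returns the same output no matter what `random.seed` did globally.

```python
import random


def toy_embedding(n_points, random_state=0):
    """Hypothetical stand-in for a tool that accepts its own seed.

    It builds a private generator from `random_state`, so the global
    `random` module state has no effect on its output.
    """
    rng = random.Random(random_state)
    return [(rng.random(), rng.random()) for _ in range(n_points)]


random.seed(0)
first = toy_embedding(5, random_state=42)

random.seed(12345)  # change the global seed between calls
second = toy_embedding(5, random_state=42)

other_seed = toy_embedding(5, random_state=7)

print(first == second)      # same per-tool seed, same result
print(first == other_seed)  # different per-tool seed, different result
```

If a tool behaves like this, a `random.seed(0)` at the top of the notebook neither helps nor hurts it; whether a given Scanpy function works this way should be checked against its documentation.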


I realized that I have a few notebooks where I set random.seed(0) at the beginning of the Jupyter file but not in the commands you mentioned. Is there any chance that this caused issues that would affect the results (besides reproducibility)?

3 months ago
ATpoint 89k

Generally, you need fixed seeds if you want to make an analysis with a random element reproducible. I cannot speak for Scanpy and Python, but in R (for single-cell work, and generally) this applies to UMAP/PCA (and most dimensionality reductions), k-means and some other clustering approaches, subsampling procedures, and more. If there is an option to set a fixed seed, I would always use it (in fact, I do). In R you set a seed before calling the function:

set.seed(1)
doSomething()

...and then the seed is used up. Setting it once at the top of your script is not enough to pin each step: the first function with a random element advances the generator state, so every later call starts from whatever state the previous calls left behind. It needs to be set before every such function. Python might be different. I would check whether running the analysis several times gives precisely the same results. If not, it could be a seed problem.
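
For the Python side, the same behaviour can be checked with the standard library's `random` module. This is a sketch of the general mechanism, not of Scanpy internals; `noisy_step` is a hypothetical stand-in for any analysis step that draws random numbers from the global generator.

```python
import random


def noisy_step(n=3):
    """Hypothetical analysis step that draws from the global generator."""
    return [random.random() for _ in range(n)]


# Seed once: the second call continues from where the first one stopped.
random.seed(1)
a1 = noisy_step()
a2 = noisy_step()

# Re-seed before each call: both calls start from the same state.
random.seed(1)
b1 = noisy_step()
random.seed(1)
b2 = noisy_step()

# Rerun the seed-once sequence from the top, in the same order.
random.seed(1)
c1 = noisy_step()
c2 = noisy_step()

print(a1 == a2)              # False: the state advanced between calls
print(b1 == b2)              # True: same seed right before each call
print((a1, a2) == (c1, c2))  # True: same seed, same call order
```

Seeding once reproduces a full top-to-bottom rerun as long as the calls happen in the same order. Re-seeding before each step additionally keeps a step reproducible when it is run on its own, for example when a single notebook cell is re-executed.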
