5.9 years ago
Jeremy Leipzig
If you were going to reproduce a paper from just a high-level methods section, what would contribute to the "degrees of freedom"?
- Preprocessing tools
- Analysis packages
- Parameters
- Normalization
- Feature selection
- Decomposition
- Statistical models
- Computing infrastructure
- Program versions
- Random seeds
Can you think of others? Perhaps more importantly, how would you organize these into logical categories?
right data
Authors don't always submit data in the most accessible format (e.g. a single unaligned BAM file for 10x data covering multiple samples, with cell indexes in optional strings at the end of each SAM record). They may also not submit truly raw data, and neglect to mention that fact or to describe the pre-processing.
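To illustrate what "cell indexes in optional strings" means in practice, here is a rough sketch that pulls a barcode out of the optional fields of a SAM text line. The tag name `CB` is an assumption (a submission might use a different tag, such as a raw-barcode tag), the helper `cell_barcode` is hypothetical, and the records are made-up toy lines, not real data.

```python
from collections import Counter


def cell_barcode(sam_line, tag="CB"):
    """Return the value of an optional TAG:TYPE:VALUE field, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 SAM columns are mandatory; optional fields follow them.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[0] == tag:
            return parts[2]
    return None


# Toy unaligned records (flag 4, no reference); barcodes are invented.
records = [
    "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF\tCB:Z:AAACCTGA-1\tUB:Z:TTGCA",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCA\tFFFF\tCB:Z:AAACCTGA-1",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tTTAC\tFFFF\tCB:Z:CCGTTAGC-2",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\tCATG\tFFFF",  # no barcode tag at all
]

# Count reads per barcode; reads without the tag are grouped under None.
reads_per_cell = Counter(cell_barcode(line) for line in records)
print(reads_per_cell)
```

Which barcodes belong to which sample still has to come from the authors; nothing in the file itself is guaranteed to say so.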
I think you can divide those into "choices" (e.g. parameters) and "circumstances" (versions, seeds, infrastructure, operating system, ...).
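One way to make that split concrete is to write both halves into a small manifest next to the results. This is only a sketch: the function names, the manifest layout, and the example parameter values are all hypothetical, and the package list is whatever your own analysis imports.

```python
import json
import platform
from importlib import metadata


def capture_circumstances(packages, seed):
    """Record things you did not choose on purpose: versions, seed, platform."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": platform.python_version(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "seed": seed,
        "packages": versions,
    }


def build_manifest(choices, packages, seed):
    """Keep deliberate choices and incidental circumstances side by side."""
    return {
        "choices": choices,
        "circumstances": capture_circumstances(packages, seed),
    }


manifest = build_manifest(
    choices={"normalization": "log1p", "n_components": 50},  # made-up values
    packages=["numpy", "hypothetical-missing-package"],
    seed=42,
)
print(json.dumps(manifest, indent=2))
```

The seed is only recorded here; the same value would still need to be passed to whatever random number generator the analysis uses.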
I think there was a publication about this recently, about scRNAseq if I'm not mistaken.