Cell Type Marker Databases?
10.4 years ago

Hi, I've been looking for a public database or resource of cell type-specific markers. These would include genes for proteins that can uniquely identify a cell population using techniques such as FACS. I've tried a variety of search terms, but haven't come up with much. I figured I would check here to make sure I'm not missing something before resorting to literature mining and meta-analysis of GEO data. So far the best I have found are lists of sequences used in commercial cell type validation PCR arrays, for which no citations are given...

Thank you

Hi Ashwini, I wonder whether your cell marker database has signatures of different human immune cells. We are trying to build an assay that can do scRNA-seq of immune cells from PBMC and other tissues. How can we get access to the human immune marker dataset? Best, Alex Chenchik, Cellecta

Hi, please do not add answers unless you're answering the top-level question. Use Add Comment or Add Reply instead.

Hi Alex,

Apologies for the late reply. We have marker gene sets for human immune cells in CellKb, available by subscription. We also have a database of mouse immune cell markers, CellKb Immune, which is free for academic users. Please let me know if you would like to know more.

Thank you, Ashwini

4.0 years ago
Here is one such marker database: https://panglaodb.se/markers.html

2.2 years ago

Try CellKb.

It contains about 19,000 cell type signatures manually collected from the literature and extensively curated.

Searching hematopoietic cell types is free for academic users through CellKb Immune.

Wow, a commercial database for this. I have no idea how this would work for a research project. Say you use this database in a paper: how could anyone else reproduce the results? Also, this doesn't answer the question. The question was about a public database because, like all of us, the asker wants to build algorithms and run them on their own dataset. With a closed-access commercial database, none of this would work. And without any comparison of the commercial database against other resources, why would anyone buy it?

CellKb is a collection of marker gene sets published in the literature. It uses a rank-based overlap method to identify gene sets that match the query gene list in your search. You can download the marker gene sets that match your query list from CellKb, along with their source publications. This will allow others to reproduce the results.
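To make the general idea of a rank-based overlap concrete, here is a minimal sketch in Python. It is not CellKb's actual algorithm, which is not described in this thread. It assumes a simple rank-weighted Jaccard score, where each gene gets weight 1/rank and shared genes near the top of both lists count the most. The function names and the tiny marker sets are hypothetical and for illustration only.

```python
def rank_weights(genes):
    """Map each gene to a weight that decays with rank (top gene = 1.0)."""
    weights = {}
    for rank, gene in enumerate(genes, start=1):
        # Keep the best (first) rank if a gene is listed more than once.
        weights.setdefault(gene.upper(), 1.0 / rank)
    return weights


def rank_overlap_score(query, reference):
    """Rank-weighted Jaccard similarity between two ranked gene lists.

    Assumption: this scoring scheme is an illustrative choice,
    not the method used by any particular database.
    """
    q = rank_weights(query)
    r = rank_weights(reference)
    if not q or not r:
        return 0.0
    all_genes = q.keys() | r.keys()
    matched = sum(min(q.get(g, 0.0), r.get(g, 0.0)) for g in all_genes)
    total = sum(max(q.get(g, 0.0), r.get(g, 0.0)) for g in all_genes)
    return matched / total


def match_signatures(query, signatures, top_n=3):
    """Return the top_n (name, score) pairs, best match first."""
    scored = [(name, rank_overlap_score(query, genes))
              for name, genes in signatures.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy marker sets (illustrative only, ranked most specific first).
TOY_SIGNATURES = {
    "T cell": ["CD3D", "CD3E", "CD2"],
    "B cell": ["CD79A", "MS4A1", "CD19"],
    "Monocyte": ["CD14", "LYZ", "FCGR3A"],
}

if __name__ == "__main__":
    cluster_markers = ["CD3E", "CD3D", "IL7R"]  # e.g. top genes of one cluster
    for name, score in match_signatures(cluster_markers, TOY_SIGNATURES):
        print(f"{name}\t{score:.3f}")
```

Saving the matched gene sets together with their source publications, as described above, is what would let someone else rerun the same comparison.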

CellKb tries to address several issues with existing public databases. Please see my review on Medium for more details: https://medium.com/@ashwini_21267/simplifying-cell-type-identification-in-single-cell-experiments-ca7af7382e51.

Unfortunately, CellKb is not a public database because we have no academic or government funding. But it is very reasonably priced, so students and academics can afford a license. We are also happy to collaborate with labs on specific research projects and provide them with access to the data.

Please let me know if you or your lab are interested in collaborating with us. I will be happy to give you a demo of CellKb and its features. We are releasing CellKb 2.0 with additional cell type marker sets and features by the end of August 2020!

I do not know of a successful biological database that is sold commercially, for the reasons outlined above. I don't think there is a business model for biological databases. The only one I know of that is making money is HGMD, but they provide a free academic version (always last year's, with reduced information). No matter how good your database is, academics will not buy it. I would recommend making a free version available and finding a way to make it downloadable from pipelines with a wget command (and maybe a token or login or something).
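As a rough sketch of what a token-based download from a pipeline could look like, here is a Python equivalent of such a wget call. The URL, the token, and the bearer-token header scheme are all assumptions for illustration. No such endpoint is described in this thread, and a real service might use a different authentication scheme.

```python
import shutil
import urllib.request


def build_download_request(url, token):
    """Build (but do not send) a GET request carrying an access token.

    Assumption: the server accepts an "Authorization: Bearer <token>" header.
    """
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


def download(url, token, dest_path, timeout=30):
    """Stream the response body to dest_path and return the path."""
    request = build_download_request(url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response, \
            open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest_path


# Hypothetical usage inside a pipeline step (placeholder URL and token):
# download("https://example.org/markers/free_version.tsv",
#          token="YOUR_TOKEN", dest_path="markers.tsv")
```

A per-user token like this would also let the provider see which account fetched which release.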

Those are all very pertinent points and excellent suggestions. As an ex-academic myself, I actually agree with you. Ideally, I would like CellKb to be freely accessible to as many academics as possible and I am trying to figure out a model that will make it sustainable over the long term.

As for examples of cases where this has worked, you can look at the licenses of OMIM, UCSC Genome Browser, Genomenom and HGMD. There may be others; this list is not comprehensive. BIND used to be one, as was TRANSFAC, but I'm not sure about their status now. Usually there is either a fully public version with a warning that it can't be used for commercial research, a click-through license, or a limited version (stripped or outdated) for academic researchers. OMIM generates a unique download link to the files, so they always know who generated the link originally. Commercial publishers also leave hidden watermarks in the PDFs they generate, so if they find a copy being used commercially, they know which user account it was downloaded from.

Thanks for the examples. That helps!

10.4 years ago

For immune (and other) cell types, you could start with resources on human clusters of differentiation (CD), which are used in immunophenotyping. Wikipedia has a decent list of CD molecules. This CD Antigens Table is a fairly rich resource.

There are also specific projects that seek to characterize the distinguishing genes of each tissue. For example the Human Protein Atlas, and the Mouse Atlas of Gene Expression.

11 months ago
hongbo919 ▴ 30

Try CellMarker http://biocc.hrbmu.edu.cn/CellMarker/ if you have no marker list in hand.

10.3 years ago

If you are looking at humans, I would consult Human Proteinpedia (see this page for data categorized by annotation and experimental platform) and the Human Protein Atlas before starting a de novo literature curation or meta-analysis.

11 months ago
igor 13k

There is a meta-database of cell type markers, clustermole: https://cran.r-project.org/web/packages/clustermole/vignettes/clustermole-intro.html