Question: Identifying tumor cells in single cell RNAseq data
8 months ago by
jrleary130
Lineberger Comprehensive Cancer Center
jrleary130 wrote:

My lab does a lot of single cell RNAseq of samples that include tumor cells. I have a pipeline in place that performs dimensionality reduction, clusters cells, and automatically assigns cell type labels to those clusters (mostly using Seurat and SingleR). However, I do not currently have a way to differentiate tumor cells from normal cells, which would allow me to perform other downstream analyses such as copy number variation analysis with inferCNV. Are there any tools in R/Python/Bash etc. that would allow me to differentiate between those two cell types, or is doing so manually using biomarkers the best option? I'd like the process to be as objective and reproducible as possible.

cancer scrnaseq R • 375 views
modified 8 months ago by jared.andrews077.5k • written 8 months ago by jrleary130

It will depend on your tumor type. Some are easier to differentiate than others.

You don't necessarily need to identify tumor cells to call CNVs. In fact, you could call CNVs to identify tumor cells.

written 8 months ago by igor11k

I think I'll be giving CONICSmat a go as a method of identifying CNVs / tumor cells. Thanks for the input.

written 8 months ago by jrleary130

If you are interested, there are a few other alternatives discussed in this thread: "Detecting copy number alterations based on RNA-seq data"

written 8 months ago by igor11k

That thread of yours is where I found CONICSmat in the first place :) With regard to using CNVs to identify tumor cells, are there any best-practice documents or guides floating around? I come from a pure stats background, so I'm less familiar with some of the (to me) more complicated biology concepts.

written 8 months ago by jrleary130

Usually, only tumor cells should have copy number abnormalities. In panel B below (from Patel et al), you can see the topmost cluster has a flat copy number profile and contains the normal cells.

[Image not shown: copy number profile figure from Patel et al.]
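
To make the idea concrete, here is a minimal Python sketch of how a per-cell "CNV signal" can be derived from expression alone. It is not from the thread and is not how inferCNV or CONICSmat are implemented. It assumes log-normalized expression with genes already sorted by genomic position, ignores chromosome boundaries, and takes a hypothetical set of presumed-normal reference cells. All names and the toy data are made up for illustration.

```python
from statistics import mean


def smooth(values, window):
    """Moving average over genes that are ordered by genomic position."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(mean(values[lo:hi]))
    return out


def cnv_scores(expr, reference_cells, window=5):
    """Score each cell by how far its smoothed profile departs from the reference.

    expr: dict mapping cell id -> list of log-normalized expression values,
          with genes in the same positional order for every cell.
    reference_cells: ids of cells assumed to be non-malignant.
    Returns dict mapping cell id -> mean squared smoothed deviation.
    """
    n_genes = len(next(iter(expr.values())))
    # Baseline: per-gene mean expression across the presumed-normal cells.
    baseline = [mean(expr[c][g] for c in reference_cells) for g in range(n_genes)]
    scores = {}
    for cell, values in expr.items():
        relative = [v - b for v, b in zip(values, baseline)]
        profile = smooth(relative, window)
        scores[cell] = mean(x * x for x in profile)
    return scores


# Toy data (hypothetical): one query cell has a block of elevated expression.
expr = {
    "normal_1": [1.0] * 10,
    "normal_2": [1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.0, 1.1, 0.9, 1.0],
    "query_flat": [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 0.9, 1.1, 1.0, 1.0],
    "query_gain": [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.1, 1.9, 2.0, 2.0],
}
scores = cnv_scores(expr, reference_cells=["normal_1", "normal_2"], window=3)
for cell, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(cell, round(score, 4))
```

Cells with a flat profile score near zero, while a cell with a contiguous block of shifted expression scores higher. Real tools add many refinements (per-chromosome windows, denoising, statistical models), so treat this only as an illustration of the principle.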

modified 8 months ago • written 8 months ago by igor11k

This is generally true, but does depend somewhat on the tumor type. Certain leukemias have "progenitor" or "poised" populations that may still harbor significant genetic variation despite not being truly malignant. This is where your biological expertise is going to have to come into play.

written 8 months ago by jared.andrews077.5k

Thank you both. I had been using the Patel paper as a reference but it looks like I'm going to have to do a much deeper dive research-wise before I start analyzing anything. I don't want to be lacking in domain knowledge.

written 8 months ago by jrleary130

It really helps if you know what to look for. If you have any clinical karyotype data, it can make your life a lot easier. scRNA CNV calling is coarse - you aren't going to pick up many focal changes (< 1 Mb). If you have a clinical collaborator who provided the samples, bug them to give you any information they might have available. If your cancer of interest has highly recurrent copy number alterations, that can also help, but there are always variations. Speaking from experience, prior information makes the process much, much easier.

written 8 months ago by jared.andrews077.5k

I'll see what I can do, but I believe at the moment we only have scRNA data, maybe some paired bulk RNAseq data. With those resources, do you think trying to estimate CNVs is worth the time or would it be too noisy?

written 8 months ago by jrleary130

Oh, it can totally be valuable. I'm just not sure it's the best tool to differentiate malignant and normal cells, but again, that's highly cancer-type dependent.

It's also not terribly difficult to do, so I'd say the upside is strong - just trying to make sure you're aware of some of the caveats.

written 8 months ago by jared.andrews077.5k

The data we're analyzing is from PDAC, so I'll be doing some pancreas-specific research. Are there any other extant computational methods you'd recommend for differentiating between malignant and normal?

written 8 months ago by jrleary130

I'm not familiar with that cancer type, so I'm afraid you're on your own there. The suggestions in my answer might be helpful, but I don't know enough about the data/cancer to say which is your best bet.

written 8 months ago by jared.andrews077.5k

Just an update for future readers: I've had decent success replicating CNV analyses with CONICSmat on publicly available PDAC scRNA-seq data. Processing, filtering, and normalization methods will obviously differ between labs, but I've been able to see the strongest amplifications and deletions fairly clearly in my results.

modified 7 months ago • written 7 months ago by jrleary130

8 months ago by
jared.andrews077.5k
Memphis, TN
jared.andrews077.5k wrote:

As igor said, it really depends on the type. For immune cells, the easiest way to do this is usually to perform single cell VDJ sequencing on the same cells, which yields clonality information (most immune cancers are highly clonal).
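
As a rough illustration of the clonality idea (a sketch, not part of the original answer), the snippet below computes, for each cluster, the share of receptor-bearing cells that belong to the single most common clonotype. The input dictionaries, cell ids, and clonotype ids are hypothetical; in practice they would come from your clustering and your VDJ pipeline.

```python
from collections import Counter, defaultdict


def dominant_clone_fraction(cell_clusters, cell_clonotypes):
    """Summarize clonality per cluster.

    cell_clusters: dict mapping cell id -> cluster label.
    cell_clonotypes: dict mapping cell id -> clonotype id; cells without a
                     detected receptor are simply absent from this dict.
    Returns dict mapping cluster -> (top clonotype, fraction of the
    clonotyped cells in that cluster that carry it).
    """
    per_cluster = defaultdict(Counter)
    for cell, clonotype in cell_clonotypes.items():
        if cell in cell_clusters:
            per_cluster[cell_clusters[cell]][clonotype] += 1
    summary = {}
    for cluster, counts in per_cluster.items():
        clonotype, n = counts.most_common(1)[0]
        summary[cluster] = (clonotype, n / sum(counts.values()))
    return summary


# Toy data (hypothetical): cluster A is dominated by one clone, B is diverse.
cell_clusters = {"a1": "A", "a2": "A", "a3": "A", "a4": "A",
                 "b1": "B", "b2": "B", "b3": "B"}
cell_clonotypes = {"a1": "c1", "a2": "c1", "a3": "c1", "a4": "c2",
                   "b1": "c3", "b2": "c4", "b3": "c5"}
summary = dominant_clone_fraction(cell_clusters, cell_clonotypes)
for cluster, (clonotype, fraction) in sorted(summary.items()):
    print(cluster, clonotype, round(fraction, 2))
```

A cluster where one clonotype accounts for most cells is a candidate malignant population; what counts as "most" is a judgment call that depends on the disease.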

If your cancer has a heavy genetic component, you can try to use mutation information if you know the malignant populations harbor a given mutation (for example, if you performed bulk exome-seq or WGS). vartrix is a pretty easy tool to use for this. You can then identify malignant clusters pretty easily by enrichment of the mutation.
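
One way to put a number on "enrichment of the mutation" is a one-sided hypergeometric test per cluster, sketched below. This is an assumption about how one might do it, not something prescribed in the answer. It assumes the per-cell allele counts (for instance from vartrix output) have already been reduced to a yes/no flag per covered cell; the function names and toy data are hypothetical. It needs Python 3.8+ for `math.comb`.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when n cells are drawn from N, of which K carry the variant."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def mutation_enrichment(cluster_of, has_alt):
    """Test each cluster for an excess of variant-carrying cells.

    cluster_of: dict mapping cell id -> cluster label.
    has_alt: dict mapping cell id -> bool, True if the cell has at least one
             read supporting the variant. Include only cells with coverage
             at the site, since uncovered cells are uninformative.
    Returns dict mapping cluster -> (variant cells, covered cells, p-value).
    """
    covered = [c for c in has_alt if c in cluster_of]
    N = len(covered)
    K = sum(has_alt[c] for c in covered)
    results = {}
    for cluster in sorted({cluster_of[c] for c in covered}):
        members = [c for c in covered if cluster_of[c] == cluster]
        k = sum(has_alt[c] for c in members)
        p = hypergeom_enrichment_p(k, len(members), K, N)
        results[cluster] = (k, len(members), p)
    return results


# Toy data (hypothetical): the variant is seen almost only in one cluster.
cluster_of = {f"t{i}": "tumor_like" for i in range(5)}
cluster_of.update({f"n{i}": "normal_like" for i in range(5)})
has_alt = {f"t{i}": i < 4 for i in range(5)}
has_alt.update({f"n{i}": False for i in range(5)})
results = mutation_enrichment(cluster_of, has_alt)
for cluster, (k, n, p) in results.items():
    print(cluster, k, n, round(p, 4))
```

With sparse scRNA-seq coverage, many cells will have no reads at the site, so absence of the variant in a cell is weak evidence; testing whole clusters rather than single cells is what makes this usable.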

Lastly, if you've used SingleR, you have a pretty good idea of which clusters contain which cell types. Pick a cluster of cells that shouldn't be malignant to use as your controls, for example as the normal reference when calling CNVs.

written 8 months ago by jared.andrews077.5k