I have a question about interpreting inferCNV output (https://github.com/broadinstitute/infercnv). I have single-nucleus RNA sequencing data from both the early and late disease stages of one patient, and I wanted to use inferCNV to check which copy number variants arise during disease progression.
To do so, I set all early-stage cells as the "reference" and all late-stage cells as the "observations". Attached is the image of the final output I got (after noise filtering).
It seems that this patient does not acquire any very obvious CNVs during progression. But I have doubts about chromosome 6, as I see both red and blue signal in my reference cells. Are these outliers acceptable? Would it be better to set ref_group_names=NULL and use the average signal as the baseline?
Any insights and comments are appreciated. Thanks!!
I don't know why you call those cells in your reference outliers. They're a small population of cells with the chromosome 6 amplification, some of which also carry the chromosome 1 amplification that is apparent in the later sample. I don't know how pure your samples are, so those could be the malignant cells in your early sample while most of the others are "normal" cells. Or they could all be disease cells, with that population being a clone that carries those genetic perturbations.
Regardless, it's clear that those cells have taken over by your later sample, either through clonal outgrowth with an advantage conferred, probably at least in part, by those CNAs, or because the later sample simply has a much greater proportion of malignant cells.
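To put rough numbers on this, here is a toy sketch in plain Python. It is not inferCNV code, and every value and name in it (the per-cell signals, GAIN_THRESHOLD, the helper functions) is made up for illustration rather than read off your plot. It shows two things under those assumptions: how a small early subpopulation compares with a late sample dominated by the same gain, and what happens to the apparent signal if the baseline is the average of all cells (the ref_group_names=NULL idea from the question) instead of the early cells only.

```python
from statistics import mean

# Hypothetical per-cell signal for one chr6 region (log2-like scale).
# These numbers are invented for illustration, not taken from the plot.
early_cells = [0.0] * 90 + [0.5] * 10   # small subpopulation with the gain
late_cells = [0.5] * 100                # gain present in every late cell

GAIN_THRESHOLD = 0.25  # arbitrary cutoff, an assumption of this sketch


def fraction_with_gain(cells, threshold=GAIN_THRESHOLD):
    """Fraction of cells whose signal exceeds the cutoff."""
    return sum(x > threshold for x in cells) / len(cells)


def centered(value, baseline):
    """Signal relative to a baseline, rounded for display."""
    return round(value - baseline, 3)


early_fraction = fraction_with_gain(early_cells)
late_fraction = fraction_with_gain(late_cells)

# Baseline from the early (reference) cells only vs. from all cells pooled.
ref_baseline = mean(early_cells)
all_baseline = mean(early_cells + late_cells)

late_vs_ref = centered(0.5, ref_baseline)
late_vs_all = centered(0.5, all_baseline)
unaffected_vs_all = centered(0.0, all_baseline)

print(early_fraction, late_fraction)
print(late_vs_ref, late_vs_all, unaffected_vs_all)
```

In this toy setup the small early subpopulation barely shifts the reference baseline, so the late gain stays clearly visible (about 0.45). Pooling all cells pulls the baseline toward the gained state, so the same gain reads about half as strong (about 0.225) and the unaffected cells start to look like a loss (about -0.275). Real inferCNV output involves smoothing and denoising steps that this sketch ignores.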