My samples are patient-derived xenografts that contain both human epithelial cancer cells and mouse fibroblasts and blood cells. I run cellranger count with the combined human and mouse reference genome downloaded from 10x Genomics (https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-and-mm10-2020-A.tar.gz). After Leiden clustering, the human and mouse clusters can be told apart by the marker genes found with scanpy.tl.rank_genes_groups, so the data can be split into human cells and mouse cells. I then remove the mouse genes from the human cells and re-cluster the human cells starting from their raw counts.
I am wondering whether this is the correct practice for this analysis.
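
For reference, here is a minimal sketch of the species-separation step. It is not exactly the cluster-marker approach described above: it labels each cell by the fraction of its counts that come from human genes. It assumes that the gene names in the combined reference carry a species prefix beginning with GRCh38_ or mm10_ (please check adata.var_names in your own data). The function name assign_species, the 0.9 cutoff, and the toy matrix are made up for illustration.

```python
import numpy as np

def assign_species(counts, gene_names, human_prefix="GRCh38_",
                   mouse_prefix="mm10_", min_fraction=0.9):
    """Label each cell 'human', 'mouse', or 'ambiguous'.

    counts: cells x genes matrix of raw counts (dense).
    gene_names: one name per column; the species prefixes are an
    assumption about how the combined reference names its genes.
    """
    counts = np.asarray(counts, dtype=float)
    names = np.asarray(gene_names, dtype=str)
    is_human = np.char.startswith(names, human_prefix)
    is_mouse = np.char.startswith(names, mouse_prefix)

    human = counts[:, is_human].sum(axis=1)
    mouse = counts[:, is_mouse].sum(axis=1)
    total = human + mouse
    # Fraction of each cell's counts that are human; NaN for empty cells.
    frac = np.divide(human, total,
                     out=np.full_like(human, np.nan), where=total > 0)

    labels = np.full(counts.shape[0], "ambiguous", dtype=object)
    labels[frac >= min_fraction] = "human"
    labels[frac <= 1 - min_fraction] = "mouse"
    return labels, is_human

# Toy example: 3 cells x 4 genes (invented numbers, not real data).
genes = ["GRCh38_EPCAM", "GRCh38_KRT8", "mm10___Col1a1", "mm10___Ptprc"]
counts = np.array([[50, 40, 1, 0],
                   [0, 2, 30, 60],
                   [20, 20, 25, 15]])
labels, is_human = assign_species(counts, genes)

# Keep human cells and only their human genes, ready for re-clustering.
human_counts = counts[labels == "human"][:, is_human]
```

With an AnnData object, the same function could be called on a dense copy of adata.X together with adata.var_names. Cells labelled ambiguous are likely doublets or carry a lot of ambient RNA, so they are probably better dropped than forced into one species.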
Is it possible to extract the human reads from the FASTQ data and run cellranger count using only the human reference genome?
One option is to use Sargasso (https://github.com/biomedicalinformaticsgroup/Sargasso) to split the raw FASTQ reads by species. I have used this program before, but I am unsure whether it is suitable for scRNA-seq data.