Question

What is the best practice for the analysis of single-cell RNA-seq of mouse and human mixed cells?

0

19 months ago

Dan ▴ 180

My samples are patient-derived xenografts that contain both human epithelial cancer cells and mouse fibroblast/blood cells. I ran cellranger count using the combined human and mouse reference genome, which I downloaded from 10x Genomics (https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-and-mm10-2020-A.tar.gz). After Leiden clustering, the human and mouse cells can be separated by the marker genes found with scanpy.tl.rank_genes_groups. I then split the data into human cells and mouse cells, removed the mouse genes from the human cells, and re-clustered the raw data of the human cells.
I am wondering whether this is the correct practice for the analysis.

Is it possible to extract the reads of the human cells from the FASTQ data and run cellranger count using only the human reference genome?

Thanks.

single-cell RNA-seq • 1.7k views

2

One option is to use Sargasso (https://github.com/biomedicalinformaticsgroup/Sargasso) in order to split the raw FASTQ reads based on their respective species. I have used this program before, but I am unsure of its usage in scRNA-seq.

ADD REPLY • link 19 months ago by Kevin Blighe 87k

score 2 · Accepted Answer · 2022-09-05

2

19 months ago

dsull ★ 5.8k

One thing I like doing is mapping the raw FASTQs to a combined human+mouse reference to figure out which cell barcodes are human and which are mouse (e.g. if >75% of the reads/UMIs belonging to a given barcode are of human origin, it's probably human and not mouse).

Then, I map the original raw FASTQs to a human-only reference and throw away the cell barcodes that were of non-human origin in the previous step.

For scRNA-seq, I like this approach of deciding at the cell barcode-level what's human and what's mouse.
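
The barcode-level call described above can be sketched roughly like this. It is a minimal illustration, not a definitive implementation: the function names are made up for the example, the input is assumed to be a plain dict of per-barcode UMI counts keyed by gene name, and the gene prefixes GRCh38_ and mm10___ are an assumption about how the combined reference labels its genes (check your own features file). The 0.75 cutoff is just the example threshold mentioned above.

```python
from typing import Dict, Optional


def human_fraction(
    gene_counts: Dict[str, int],
    human_prefix: str = "GRCh38_",   # assumed prefix for human genes
    mouse_prefix: str = "mm10___",   # assumed prefix for mouse genes
) -> Optional[float]:
    """Return the fraction of a barcode's UMIs that come from human genes."""
    human = sum(n for gene, n in gene_counts.items() if gene.startswith(human_prefix))
    mouse = sum(n for gene, n in gene_counts.items() if gene.startswith(mouse_prefix))
    total = human + mouse
    # A barcode with no species-assigned UMIs cannot be called either way.
    return human / total if total > 0 else None


def classify_barcodes(
    counts: Dict[str, Dict[str, int]],
    threshold: float = 0.75,
) -> Dict[str, str]:
    """Label each barcode 'human', 'mouse', or 'ambiguous' by UMI fraction."""
    labels = {}
    for barcode, gene_counts in counts.items():
        frac = human_fraction(gene_counts)
        if frac is None:
            labels[barcode] = "ambiguous"
        elif frac > threshold:
            labels[barcode] = "human"
        elif 1.0 - frac > threshold:
            labels[barcode] = "mouse"
        else:
            # Neither species dominates: possibly a human/mouse doublet.
            labels[barcode] = "ambiguous"
    return labels


if __name__ == "__main__":
    # Toy counts, invented purely for illustration.
    toy = {
        "AAACCTGAGAAACCAT-1": {"GRCh38_EPCAM": 90, "mm10___Col1a1": 10},
        "AAACCTGAGAAACCGC-1": {"GRCh38_EPCAM": 5, "mm10___Col1a1": 95},
        "AAACCTGAGAAACCTA-1": {"GRCh38_EPCAM": 50, "mm10___Col1a1": 50},
    }
    for barcode, label in classify_barcodes(toy).items():
        print(barcode, label)
```

Barcodes labelled "ambiguous" are worth dropping rather than forcing into either species, since a near-even split often points to a mixed-species doublet.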

0

Thanks for your suggestion. Can you please let me know how to split the fastq files based on the cell barcodes? Thanks

ADD REPLY • link 19 months ago by Dan ▴ 180

score 2 · Accepted Answer · 2022-09-08

2

19 months ago

Dan ▴ 180

To split the FASTQ files by cell barcode, I ran subset-bam on the BAM file from cellranger count:

https://github.com/10XGenomics/subset-bam

then converted the subset BAM back to FASTQ with bamtofastq:

https://github.com/10XGenomics/bamtofastq

and finally mapped the resulting FASTQ files to the human-only and mouse-only references.
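
For anyone who wants to script these two steps, here is a rough sketch that only builds the command lines. It assumes both tools are installed on PATH under the names subset-bam and bamtofastq (the release binaries may be named differently), and all file names are placeholders. The helper functions are hypothetical and exist only for this example; the flags are the ones I understand the tools to accept, so check each tool's --help before relying on them.

```python
import shlex
from pathlib import Path
from typing import Iterable, List, Union

PathLike = Union[str, Path]


def write_barcodes(barcodes: Iterable[str], path: PathLike) -> Path:
    """Write one cell barcode per line (e.g. AAACCTGAGAAACCAT-1)."""
    path = Path(path)
    path.write_text("".join(f"{bc}\n" for bc in barcodes))
    return path


def subset_bam_cmd(bam: PathLike, barcodes_file: PathLike,
                   out_bam: PathLike, cores: int = 4) -> List[str]:
    """Build the subset-bam call that keeps only reads from the listed barcodes."""
    return [
        "subset-bam",                      # assumed executable name
        "--bam", str(bam),
        "--cell-barcodes", str(barcodes_file),
        "--out-bam", str(out_bam),
        "--cores", str(cores),
    ]


def bamtofastq_cmd(bam: PathLike, out_dir: PathLike, threads: int = 4) -> List[str]:
    """Build the bamtofastq call that converts the subset BAM back to FASTQ."""
    return [
        "bamtofastq",                      # assumed executable name
        f"--nthreads={threads}",
        str(bam),
        str(out_dir),
    ]


if __name__ == "__main__":
    # Placeholder paths; nothing is executed here, the commands are only printed.
    cmds = [
        subset_bam_cmd("possorted_genome_bam.bam", "human_barcodes.txt", "human.bam"),
        bamtofastq_cmd("human.bam", "human_fastq"),
    ]
    for cmd in cmds:
        print(shlex.join(cmd))
```

To actually run a command, pass the list to subprocess.run(cmd, check=True). The same two calls with a mouse barcode list would give the mouse FASTQ files.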
