Hi, I am new to scRNA-seq and need help with this. I have a rat sample containing a human xenograft (around 5% of total cells are human), and I am only interested in the human cells. I talked to 10x reps, and they recommended this approach:
1) I created a combined rat+human genome and mapped the 10x output both to the rat+human genome and to the human genome only.
2) From the human-only output, I should then subtract the cells that mapped to rat in the combined run, based on their barcodes.
I am not quite sure how to do this, i.e. how to eliminate cells based on their barcode. Any help would be highly appreciated!
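For reference, the subtraction in step 2 could look roughly like the Python sketch below. It assumes the combined rat+human run produced a per-barcode classification CSV (as far as I know, Cell Ranger writes `analysis/gem_classification.csv` for multi-genome references) with a `barcode` column and a `call` column. The column names and the labels `human`, `rat` and `Multiplet` are assumptions that depend on how the reference was built, so check the header of your own file. The function names are made up for illustration.

```python
import csv
import gzip


def read_barcodes(path):
    """Read one barcode per line from a barcodes.tsv file (gzipped or plain)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return [line.strip() for line in handle if line.strip()]


def read_species_calls(path, barcode_col="barcode", call_col="call"):
    """Map barcode -> species call from a per-barcode classification CSV.

    The default column names are assumptions; adjust them to your file.
    """
    with open(path, newline="") as handle:
        return {row[barcode_col]: row[call_col] for row in csv.DictReader(handle)}


def subtract_barcodes(human_barcodes, calls, drop_labels):
    """Return the human-run barcodes whose combined-run call is not in drop_labels.

    Barcodes that do not appear in the combined run at all are kept,
    because this is a plain subtraction.
    """
    to_drop = {bc for bc, call in calls.items() if call in drop_labels}
    return [bc for bc in human_barcodes if bc not in to_drop]
```

Typical use would be to call `read_barcodes` on the `barcodes.tsv.gz` of the human-only run, call `read_species_calls` on the classification file of the combined run, and pass both to `subtract_barcodes` with something like `{"rat", "Multiplet"}` as `drop_labels`. The resulting barcode list can then be used to subset the human-only count matrix in whatever downstream tool you use.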
There is a piece of software called XenoCell, developed by the same team that fixed Xenome (well, one version of "fixed") and produced v1.0.1r; this version was included in the initial release of XenoCell. The tool follows almost the same approach as the one described below, except that it uses seqtk to pick the reads to include rather than BBSplit to pick the reads to exclude. I've taken over maintaining the tool since the original team has moved on, and I am working on new features and enhancements. For example, I've switched to cancerit's Xenome, which generates a summary once it has finished running, whereas 1.0.1r does not.
Original answer:
You're going to need to dig deep into the output files from CellRanger count to understand how to exclude cells by barcode. I also work on 10X platforms + xenografts but I follow a different approach. It kinda intuitively makes sense to me but I'd also love others' takes on it.
Use software like Xenome or BBSplit to split the biological reads (R2 for scRNAseq, R1 & R3 for scATACseq etc.) into reads that fall into the human, mouse and other ambiguous categories (if applicable).
Using BBMap's filterbyname.sh, extract the non-biological reads (I1, I2, R1, R2 etc., as applicable) whose names match the split biological reads above, so that you have matching I and R read files for each organism.
Map each read set generated above to the corresponding organism-specific 10X index.
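To make the read-matching step more concrete, here is a minimal Python sketch of the idea behind it: collect the read names from the species-specific biological reads, then keep only the records with those names from the other read files. This is an illustration of the logic, not a replacement for filterbyname.sh, which is the practical choice for real data since this sketch holds all read names in memory. The function names are hypothetical, and it assumes standard 4-line FASTQ records with Illumina-style headers where the read name is everything before the first space.

```python
import gzip
from itertools import islice


def open_text(path, mode="rt"):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def read_id(header):
    """Return the read name: the header without '@', up to the first whitespace."""
    return header[1:].split()[0]


def fastq_records(handle):
    """Yield FASTQ records as lists of 4 lines; stop at an incomplete record."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        yield record


def collect_read_ids(fastq_path):
    """Collect the names of all reads in a (species-specific) FASTQ file."""
    with open_text(fastq_path) as handle:
        return {read_id(rec[0]) for rec in fastq_records(handle)}


def filter_by_name(in_path, out_path, keep_ids):
    """Write the records of in_path whose read name is in keep_ids; return the count."""
    kept = 0
    with open_text(in_path) as src, open_text(out_path, "wt") as dst:
        for rec in fastq_records(src):
            if read_id(rec[0]) in keep_ids:
                dst.writelines(rec)
                kept += 1
    return kept
```

For scRNAseq, one would call `collect_read_ids` on the human R2 file produced by the split, then run `filter_by_name` on the original R1 (and I1/I2, if present) with that set, once per organism.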