Question

5' scRNA (GEX, cell surface, TCR) FASTQ demultiplexing and subsequent analysis

3.1 years ago

theodore ▴ 90

Hi all,

I would like to analyze a 5' scRNA-seq project (10x Genomics) in which we used the TotalSeq-C hashtags C0251, C0252, C0253 and C0254 (from BioLegend) to multiplex four samples into one, and then labeled the cells with the TotalSeq-C Human Universal Cocktail. The provided sequences for the multiplexing hashtags (C0251, C0252, C0253, C0254) are:

GTCAACTCTTTAGCG
TGATGGCCTATTGGG
TTCCGCCTCTCTTTG
AGTAAGTTCAGCGTA

So we ended up with three types of FASTQ files, from three libraries (GEX, cell surface and TCR). What is the best way to get the appropriate tables for analysis in Seurat? Will I need to demultiplex the samples further in Seurat? The first demultiplexing step is based on the 10x Genomics indexes.

Also, I was wondering how to extract the necessary tables or h5 objects (if 10x Genomics Cell Ranger is used) and then load them into Seurat. I assume that I will need three tables: a GEX table, an HTO table and one for the cell surface proteins. Finally, how do I load the files into Seurat? Is there a proper guide for the above?

Thank you all in advance

RNA-Seq hashtag seurat R demultiplexing • 2.0k views

ADD COMMENT • link updated 3.0 years ago by nux ▴ 20 • written 3.1 years ago by theodore ▴ 90

https://satijalab.org/seurat/articles/get_started.html

ADD REPLY • link 3.0 years ago by nux ▴ 20

Hi there, I am familiar with this page, but I cannot find a description of how to produce the initial matrices to feed into the Seurat pipeline. Also, please note that this is a multiplexed 5' scRNA hashtag library with feature barcoding (to easily identify the subpopulations) and V(D)J on top.

ADD REPLY • link 3.0 years ago by theodore ▴ 90

score 1 · Answer 1 · 2021-12-24

This is a very good point: the TCR library and contig file won't have the hashing information. The only way I have been able to "recover" the TCR is to build a single-cell object with the RNA data (which will get you the sample-level information) and then use the barcodes and sample-level information to filter specific TCRs. As long as you are doing this for a single run, there should be no issues with duplicate barcodes. If you are using a single-cell RNA object that combines multiple runs, it could be very difficult, because the same barcode can appear in more than one run.
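To make the barcode-matching idea concrete, here is a minimal, language-neutral sketch in plain Python (not the scRepertoire implementation). It assumes you have already exported a barcode-to-sample table from the demultiplexed RNA object and that the contig table has `barcode`, `chain` and `cdr3` columns, as in Cell Ranger's `filtered_contig_annotations.csv`. The barcodes, CDR3 strings and the helper name `split_contigs_by_sample` are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def split_contigs_by_sample(contig_rows, barcode_to_sample):
    """Group TCR contig rows by the hashtag-derived sample of their cell barcode."""
    by_sample = defaultdict(list)
    for row in contig_rows:
        sample = barcode_to_sample.get(row["barcode"])
        if sample is not None:  # skip barcodes without a sample assignment
            by_sample[sample].append(row)
    return dict(by_sample)


# Hypothetical assignments exported from the RNA object after hashtag
# demultiplexing (singlets only); the barcodes are toy values.
barcode_to_sample = {
    "AAACCTGAGAAACCAT-1": "C0251",
    "AAACCTGAGAAACCGC-1": "C0252",
}

# Toy stand-in for a contig annotation file from a single run.
contig_csv = """barcode,chain,cdr3
AAACCTGAGAAACCAT-1,TRA,CAVSDRGSTLGRLYF
AAACCTGAGAAACCAT-1,TRB,CASSLGQGNTEAFF
AAACCTGAGAAACCGC-1,TRB,CASSPGTGGYEQYF
AAACCTGAGAAAGTGG-1,TRB,CASSQDRGNEQFF
"""

contigs = list(csv.DictReader(io.StringIO(contig_csv)))
by_sample = split_contigs_by_sample(contigs, barcode_to_sample)

for sample, rows in sorted(by_sample.items()):
    print(sample, [(r["chain"], r["cdr3"]) for r in rows])
```

The last contig is dropped because its barcode has no sample assignment (for example a doublet or negative cell). For an object that combines several runs, one option is to key the lookup on a (run, barcode) pair instead of the barcode alone, so identical barcodes from different runs stay distinct.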

I have written a function called createHTOContigList() for scRepertoire that will do this automatically. Let me know if you have any other questions or suggestions.

Nick