Question

Demultiplexing Conceptual Issue

6.7 years ago

asyndeton17 ▴ 40

Hi,

I recently received Drop-seq data back, but I'm having a little trouble interpreting the raw data format. I have two reads. One is 20 bp, which I assume is the cell barcode concatenated with the molecular barcode (UMI). The other is 62 bp, and that is the one I'm having trouble understanding. Are the first 12 bp the cell barcode again? Also, what does it mean to demultiplex? Isn't each cell supposed to have a unique barcode? If so, why does demultiplexing rely on a given barcode for each condition?

Thanks

scRNA-seq RNA-Seq barcode index • 1.7k views

score 0 · Answer 1 · 2017-08-03

6.7 years ago

GenoMax 141k

Take a look at the core computational flow image on the McCarroll lab website. As long as you are using the standard protocol, the layout is: cell barcode + UMI + 50 bp cDNA.
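
To make the barcode read layout concrete, here is a minimal Python sketch that splits a 20 bp barcode read into its cell barcode and UMI. The 12 bp + 8 bp split is an assumption based on the lengths mentioned in the question (a 20 bp read with a 12 bp cell barcode), and `split_barcode_read` is a hypothetical helper name, not part of any Drop-seq tool. Check the lengths against your own protocol before relying on it.

```python
# Assumed lengths: 12 bp cell barcode followed by an 8 bp UMI (20 bp total).
# These come from the read lengths discussed in this thread, not from your data.
CELL_BARCODE_LEN = 12
UMI_LEN = 8


def split_barcode_read(seq):
    """Split a barcode read into (cell_barcode, umi).

    Raises ValueError if the read is not the expected length, so that
    an unexpected layout is noticed instead of silently mis-parsed.
    """
    expected = CELL_BARCODE_LEN + UMI_LEN
    if len(seq) != expected:
        raise ValueError(
            "expected a %d bp barcode read, got %d bp" % (expected, len(seq))
        )
    return seq[:CELL_BARCODE_LEN], seq[CELL_BARCODE_LEN:]


# Made-up example read: 12 A's (cell barcode) followed by 8 C's (UMI).
cell_barcode, umi = split_barcode_read("AAAAAAAAAAAACCCCCCCC")
```

Failing loudly on an unexpected length is a deliberate choice here: a read that is not 20 bp suggests the files are not laid out the way this sketch assumes.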

I am using the standard protocol. Your solution suggests a 70 bp read, but I have 62 bp. I am more concerned with demultiplexing, however.

ADD REPLY • link 6.7 years ago by asyndeton17 ▴ 40

Do you have just two files? Can you post a few example fastq headers/reads from the files? It seems to me that your first file may have the barcode + UMI and the second file the actual sequence data + UMI.

ADD REPLY • link 6.7 years ago by GenoMax 141k

Yes, only two files per condition. Looking back at them, I think you're right. Does that mean these are already demultiplexed?

ADD REPLY • link 6.7 years ago by asyndeton17 ▴ 40

Possibly not. It is puzzling that they are split like that. If you look at that image again, once you group the reads by cell barcode, you then need to count the unique UMIs for each gene. It may be best to confirm with whoever gave you the data.
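
The grouping and counting step described above can be sketched in Python as follows. This is an illustration only: it assumes each read has already been reduced to a (cell barcode, UMI, gene) tuple, with the gene coming from aligning the cDNA read, and it does not correct sequencing errors in barcodes or UMIs. The function name `count_unique_umis` and the example records are hypothetical.

```python
from collections import defaultdict


def count_unique_umis(records):
    """Count distinct UMIs per (cell_barcode, gene) pair.

    `records` is an iterable of (cell_barcode, umi, gene) tuples.
    Reads that share the same cell barcode, gene, and UMI are counted once.
    """
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in records:
        umis_seen[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Made-up records for illustration; real input would come from parsed
# and aligned reads.
example_records = [
    ("AAAAAAAAAAAA", "CCCCCCCC", "GeneA"),
    ("AAAAAAAAAAAA", "CCCCCCCC", "GeneA"),  # same UMI again, counted once
    ("AAAAAAAAAAAA", "GGGGGGGG", "GeneA"),
    ("TTTTTTTTTTTT", "CCCCCCCC", "GeneB"),
]
counts = count_unique_umis(example_records)
```

Using a set per (cell barcode, gene) pair means repeated reads of the same molecule collapse to a single count, which is the point of the UMI.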

ADD REPLY • link 6.7 years ago by GenoMax 141k