Forum: Differences between SMART-seq2, SMART-seq3, and 10x
14 days ago
hamarillo ▴ 10

Hi everyone,

I've recently started analyzing single-cell RNA-seq data (with FASTQ files as a starting point), and so far I have used 10x Genomics data from their website.

Now I'm interested in using data generated by other protocols, specifically SMART-seq, because it is the most widely used full-length protocol (the two main paradigms being tag-based, like 10x, and full-length). However, I'm having trouble understanding the raw data, so I figured it would be worth discussing the differences between FASTQ files from 10x and SMART-seq. Both methods are sequenced on Illumina sequencers, which, depending on the model, yield a different number of files, but with 10x it's always one set of files. What about SMART-seq? Is that the protocol where there's one set of files for each cell?

To further complicate matters, I understand that full-length protocols such as SMART-seq2, unlike tag-based protocols, do not support UMIs, but SMART-seq3 does use UMIs. I also had the idea (I read it in some paper) that when you are sequencing full-length transcripts having UMIs is really not a factor that changes anything. So how does the analysis change between SMART-seq2 and SMART-seq3 to account for this?

Thank you!

14 days ago
dsull ★ 2.3k

Smart-seq data, unlike 10X data, is often deposited in a demultiplexed format, meaning each cell gets its own set of FASTQ files. With 10X data, yes, there's just one set of FASTQ files, but each read pair carries a cell barcode sequence (in read 1) that lets you assign it to an individual cell.
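To make the "barcode inside the FASTQ" idea concrete, here is a minimal sketch of how reads from a single 10X file set can be assigned to cells. It assumes the 10x 3' v3 read layout (read 1 = 16 bp cell barcode followed by a 12 bp UMI); other chemistries use different lengths, so treat `BARCODE_LEN` and `UMI_LEN` as parameters to check against your kit. The function names are hypothetical, and real pipelines also correct barcodes against a whitelist, which is skipped here.

```python
from collections import defaultdict

# Assumed layout (10x 3' v3 chemistry): R1 = 16 bp cell barcode + 12 bp UMI.
# Other chemistries differ, so these are parameters, not constants of nature.
BARCODE_LEN = 16
UMI_LEN = 12


def split_r1(seq, barcode_len=BARCODE_LEN, umi_len=UMI_LEN):
    """Return (cell_barcode, umi) sliced from a read 1 sequence."""
    if len(seq) < barcode_len + umi_len:
        raise ValueError("R1 read is shorter than barcode + UMI")
    return seq[:barcode_len], seq[barcode_len:barcode_len + umi_len]


def reads_per_cell(r1_sequences):
    """Count reads per cell barcode (no whitelist correction in this sketch)."""
    counts = defaultdict(int)
    for seq in r1_sequences:
        barcode, _umi = split_r1(seq)
        counts[barcode] += 1
    return dict(counts)


# Toy R1 sequences: two reads from one cell, one read from another.
toy_r1 = [
    "A" * 16 + "C" * 12,
    "A" * 16 + "G" * 12,
    "T" * 16 + "C" * 12,
]
cell_counts = reads_per_cell(toy_r1)
```

With demultiplexed Smart-seq2 data this step is unnecessary, because the file name (one file set per cell) already tells you which cell each read came from.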

The advantage of Smart-seq is that you get better coverage across transcripts (with 10X, you're only sequencing one end of the transcript, the 3' end in the standard kit, which makes isoform-level analysis difficult in many cases). Also, Smart-seq experiments typically sequence fewer cells, so each cell can get higher sequencing depth (i.e. more reads per cell). The advantage of 10X is, as you noted, the UMIs.

Smart-seq3 is a newer version of Smart-seq that does use UMIs. Typically, you're going to have one set of FASTQ files (not demultiplexed), and you use barcodes to resolve individual cells. Some of the reads (those from the 5' end of the transcript) will contain UMIs and the others will not. The non-UMI-containing reads give you better coverage across transcripts (at the expense of not having UMIs).
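As a rough illustration of how the two read types can be told apart, the sketch below classifies a read by whether it starts with a fixed tag sequence followed by a UMI. The tag `ATTGCGCAATG` and the 8 bp UMI length are assumptions based on the published Smart-seq3 design, so verify them against your own library prep; `classify_read` is a hypothetical helper, and real tools additionally tolerate mismatches and trim linker or adapter bases, which this sketch ignores.

```python
# Assumptions (verify for your data): UMI-containing reads begin with an
# 11 bp tag sequence, immediately followed by an 8 bp UMI, then cDNA.
TAG = "ATTGCGCAATG"
UMI_LEN = 8


def classify_read(seq, tag=TAG, umi_len=UMI_LEN):
    """Return ("umi", umi, cdna) for tagged reads, else ("internal", None, seq).

    Exact-match only; no mismatch tolerance or linker trimming.
    """
    if seq.startswith(tag) and len(seq) >= len(tag) + umi_len:
        start = len(tag)
        umi = seq[start:start + umi_len]
        cdna = seq[start + umi_len:]
        return "umi", umi, cdna
    return "internal", None, seq


# Toy reads: one tagged 5' read and one internal read.
tagged = classify_read("ATTGCGCAATG" + "ACGTACGT" + "GGGTTTT")
internal = classify_read("CCCCGGGGTTTT")
```

Reads classified as `"umi"` can then go into a UMI-aware counting workflow, while `"internal"` reads are only usable as plain read counts or for coverage along the transcript.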

"when you are sequencing full-length transcripts having UMIs is really not a factor that changes anything"

This is not really true. The purpose of UMIs is to account for amplification bias. In bulk RNA-seq, amplification bias is much less of a concern, but in single-cell RNA-seq it matters because you start with very little material, which requires many rounds of PCR, and that's where amplification bias comes in.

In any case, for Smart-seq3, you're going to have your UMI-containing reads (where you can collapse UMIs and analyze them like you would 10X data) and your non-UMI-containing reads (where you can't collapse UMIs and instead have to proceed with your analysis starting from raw read counts). Again, the two types of reads give you different kinds of information (one has better coverage along the transcript and the other accounts for amplification bias better).
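Here is a toy sketch of what "collapsing UMIs" means compared with raw read counts. It assumes reads have already been assigned to a cell and a gene, and it treats UMIs as identical only on exact match (real tools usually also merge UMIs that differ by a sequencing error). The function names and the toy data are made up for illustration.

```python
from collections import Counter


def raw_read_counts(reads):
    """Count every read per (cell, gene); PCR duplicates are counted too."""
    return Counter((cell, gene) for cell, gene, _umi in reads)


def umi_counts(reads):
    """Count distinct UMIs per (cell, gene); duplicates collapse to one."""
    unique = {(cell, gene, umi) for cell, gene, umi in reads}
    return Counter((cell, gene) for cell, gene, _umi in unique)


# Toy data: (cell, gene, UMI). GeneA has one molecule amplified three times
# plus a second molecule; GeneB has a single molecule.
toy_reads = [
    ("cell1", "GeneA", "AAAA"),
    ("cell1", "GeneA", "AAAA"),
    ("cell1", "GeneA", "AAAA"),
    ("cell1", "GeneA", "CCCC"),
    ("cell1", "GeneB", "GGGG"),
]

raw = raw_read_counts(toy_reads)   # GeneA: 4 reads, GeneB: 1 read
umi = umi_counts(toy_reads)        # GeneA: 2 molecules, GeneB: 1 molecule
```

In this toy case the raw counts suggest a 4:1 ratio between the genes, while the UMI counts suggest 2:1, which is the kind of amplification effect UMIs are meant to remove.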


Thank you, that was a great answer. So before Smart-seq3 the data was inflated, since without UMIs there was no way to correct for PCR duplicates?


Correct, Smart-seq2 doesn't have UMIs, so there was no way to correct for PCR bias. This was why Smart-seq3 was developed.

As for how big a difference PCR bias makes, that's a whole other discussion. All RNA-seq library preps introduce many sources of technical bias (PCR, length, coverage, capture bias, sequence-specific biases, etc.), and how these biases affect downstream analyses is an entire field of research on its own!
