The Difference Between Directional And Nondirectional Sequencing
10.5 years ago
Assa Yeroslaviz ★ 1.7k

Hi,

Can someone please point me to a good summary or explanation of the differences between directional and non-directional sequencing? I would like to understand the methods better, and also what the reads resulting from these experiments are actually showing me.

I always thought I knew the difference, which seemed quite straightforward, but lately I was asked to explain it. Somehow I ran into trouble explaining whether the reads are a copy of the sense or the anti-sense strand.

The problem began with an experiment of non-directional single-end sequencing. We wanted to try and find those reads which match the plus strand - is it possible?

If someone can explain it better than I can, I would appreciate the help.

Assa

10.5 years ago
Ryan Dale 5.0k

I would argue that this is an important bioinformatics question, since you need to understand the source of the data in a FASTQ or BAM file before you can effectively analyze it.

I never fully understood it until I sat down to work it all out. An ASCII example . . .

## Unstranded protocol

Here's the original fragment (say, a 200bp long piece of mRNA):

# 5' ----------- 3'


Adapters are the same on both 5' and 3' sides for the unstranded protocol. Here's my notation for adapters:

#            5' adapter: ====
# 5' adapter, rev. comp: oooo


So the fragment with adapters ligated looks like this:

# adapter        adapter
# 5' ====-----------oooo 3'


And here's the cDNA; you might imagine them anchored to a flow cell on the left-hand side:

# 5' ====-----------oooo 3'
# 3' oooo-----------==== 5'


Sequencing primer (SP->) is also the 3' end of the 5' adapter. So it will sit down on the revcomped 5' adapter, "oooo":

#     3' ----------- 5'   <- sequenced read; reported in FASTQ as 5' to 3';
#                            reverse complement of the original
#                   <-SP
# 5' ====-----------oooo 3'  <- was RNA from the (+) strand
# 3' oooo-----------==== 5'  <- complement
#    SP->
#      5'----------- 3' <- sequenced read; reported in FASTQ as 5' to 3';
#                          original sequence


Since the sequencing primer can anneal at either end (both cDNA strands carry an "oooo" site), a read can be either the original sequence or its reverse complement, and there is no way to tell which is which.
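To make that ambiguity concrete, here is a minimal Python sketch. The fragment sequence and the `unstranded_reads` helper are made up for illustration; the model simply assumes, as in the diagram above, that either cDNA strand can be sequenced.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def unstranded_reads(fragment):
    """Possible single-end reads from one fragment in the unstranded model.

    Assumption: the primer can anneal to either cDNA strand, so both the
    original sequence and its reverse complement can show up as reads.
    """
    return {fragment, reverse_complement(fragment)}

# Hypothetical fragment of a transcript from the (+) strand.
sense_fragment = "ATGGCCTTAA"
# An antisense transcript from the same locus has the reverse-complement sequence.
antisense_fragment = reverse_complement(sense_fragment)

# Both transcripts yield the same set of possible reads.
print(unstranded_reads(sense_fragment) == unstranded_reads(antisense_fragment))  # True
```

So under this model a read from a sense transcript and a read from an antisense transcript at the same position look identical in the FASTQ file.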

## Stranded protocol

Different adapters are used for either end in the stranded (directional) protocol:

#               5' adapter: ====
# Complement to 5' adapter: oooo
#               3' adapter: ++++
# Complement to 3' adapter: ::::


The key here is that you're putting a different adapter on each side:

# 5' adapter    3' adapter
# 5' ====-----------++++ 3'


cDNA:

# 5'====-----------++++3'  <- was RNA from the (+) strand
# 3'oooo-----------::::5'  <- complement


Sequencing primer is still the 3' end of the 5' adapter, so the only place it will sit down is on the "oooo". And the only place this occurs is on the 3' end of the complementary strand, opposite the 5' end of the original fragment:

# 5'====-----------++++3'  <- was RNA from the (+) strand
# 3'oooo-----------::::5'  <- complement
#   SP->
#    5' ----------- 3' <- sequenced read, reported in FASTQ as 5' to 3'.
#                         This sequence is the same as the original RNA
#                         sequence.


Since there's only one place for the sequencing primer to start, you know what strand the final read came from; the 5'-to-3' sequence reported in the FASTQ file matches the 5'-to-3' sequence of the original fragment.
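The same reasoning can be played through in a few lines of Python. This is only a toy model of the diagrams: the adapter and fragment sequences are invented (not real adapter sequences), the whole 5' adapter is used as the primer for simplicity, and `single_end_reads` is a hypothetical helper that lets the primer anneal only where a strand ends in the primer's reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def single_end_reads(top_strand, primer, read_length):
    """Reads obtained from both strands of one double-stranded library molecule.

    The primer anneals only to a strand whose 3' end is the reverse
    complement of the primer ("oooo" in the diagrams). Extension then
    copies that template, so the read is the template's reverse complement.
    """
    site = reverse_complement(primer)
    reads = []
    for template in (top_strand, reverse_complement(top_strand)):
        if template.endswith(site):
            copied = reverse_complement(template[: -len(site)])
            reads.append(copied[:read_length])
    return reads

# Invented sequences, for illustration only.
FRAGMENT = "ATGGCCTTAA"
ADAPTER_5 = "AAAACCCC"   # "====" in the diagrams
ADAPTER_3 = "GAGAGAGA"   # "++++" in the diagrams

# Unstranded: the same adapter ends up on both sides.
unstranded = ADAPTER_5 + FRAGMENT + reverse_complement(ADAPTER_5)
# Stranded: a different adapter on each side.
stranded = ADAPTER_5 + FRAGMENT + ADAPTER_3

print(single_end_reads(unstranded, ADAPTER_5, len(FRAGMENT)))
# ['TTAAGGCCAT', 'ATGGCCTTAA']
print(single_end_reads(stranded, ADAPTER_5, len(FRAGMENT)))
# ['ATGGCCTTAA']
```

With the unstranded construct both orientations come out; with the stranded construct only the original sequence does.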

Thanks Ryan Dale for the very clear explanation. I was under the impression that the FASTQ file before adapter trimming had to be used to work out which adapter each read came from. Your diagrams have cleared up my doubt.

Thanks again!!

10.5 years ago
Ahdf-Lell-Kocks ★ 1.6k

The difference is that with directional sequencing one knows the original direction of the biological material for every read. It is mostly applied to RNA-seq: the RNA extracted from the biological material is sequenced in fragments, and if this is done with a directional protocol, each read in the FASTQ file is in the same 5'->3' direction as the corresponding stretch of the RNA molecule. Non-directional protocols amplify the material during the procedure in a way that loses the directionality of each resulting read.

10.5 years ago

One difference in terms of bioinformatics is that if you have a strand-specific protocol, you will be able to resolve two overlapping transcripts that are transcribed on different strands. This would not be possible with a non-directional protocol.
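As a rough illustration of the overlapping-transcripts case, here is a small Python sketch. The gene names, coordinates and the `candidate_genes` helper are all hypothetical, and it assumes a protocol in which the read has the same orientation as the original RNA (some stranded kits report the opposite orientation, in which case the strand comparison would need to be flipped).

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"

class Read(NamedTuple):
    start: int
    end: int
    strand: str  # strand the read aligned to

def candidate_genes(read, genes, stranded):
    """Names of genes a read could come from.

    Unstranded: any overlapping gene is a candidate.
    Stranded (assumed same-sense protocol): the strands must also match.
    """
    hits = []
    for gene in genes:
        overlaps = read.start <= gene.end and gene.start <= read.end
        if overlaps and (not stranded or read.strand == gene.strand):
            hits.append(gene.name)
    return hits

# Two made-up genes that overlap between positions 400 and 500 on opposite strands.
genes = [Gene("geneA", 100, 500, "+"), Gene("geneB", 400, 800, "-")]
read = Read(420, 470, "-")

print(candidate_genes(read, genes, stranded=False))  # ['geneA', 'geneB']
print(candidate_genes(read, genes, stranded=True))   # ['geneB']
```

Without strand information the read in the overlap is ambiguous; with it, the read can be assigned to a single gene.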

I have not worked much on analyzing directional sequence data, but I did look at one strand-specific SOLiD RNA-seq data set some time back. There, the BioScope software used for the secondary analysis had written the alignments for the reads to two separate files: one for the '+' strand and one for the '-' strand. This made it very easy to, e.g., visualize the alignments on the genome in different colors depending on the direction.

EDIT: And you would only be able to answer your question (which reads map to the + strand) using a strand-specific protocol.
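If the alignments are in SAM format rather than pre-split files, a similar per-strand split can be sketched in plain Python using the FLAG field (bit 0x10 marks a read aligned to the reverse strand, bit 0x4 an unmapped read). The example records and the `split_by_strand` helper are made up. Note that this only gives the strand each read aligned to; it reflects the transcribed strand only if the library was strand-specific.

```python
def split_by_strand(sam_lines):
    """Group mapped read names by alignment strand using the SAM FLAG field."""
    by_strand = {"+": [], "-": []}
    for line in sam_lines:
        if line.startswith("@"):      # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x4:                # read is unmapped
            continue
        strand = "-" if flag & 0x10 else "+"
        by_strand[strand].append(name)
    return by_strand

# Invented single-end records, for illustration only.
sam_lines = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tATGGCCTTAA\tIIIIIIIIII",
    "r2\t16\tchr1\t420\t60\t10M\t*\t0\t0\tTTAAGGCCAT\tIIIIIIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

print(split_by_strand(sam_lines))  # {'+': ['r1'], '-': ['r2']}
```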