Sequencing And Strands
3
3
10.3 years ago

I originally asked: Are sequencing methods consistent in how they report which is the "forward" strand and which is the "reverse" strand, or is this an arbitrary choice?

It has been pointed out that this question is unclear, since it could mean either (i) NGS sequencing and the strand from which a given read comes or (ii) the designation of one strand as 'forward' and one strand as 'reverse' in the description of a double-stranded DNA. I meant to ask about (ii), although, in retrospect, I can see that (i) is the likely interpretation given how I phrased the question.

Now I actually found two other biostar questions which are similar to this: 1) http://www.biostars.org/post/show/3423/forward-and-reverse-strand-conventions/ and 2) http://www.biostars.org/post/show/3908/conventions-for-designating-forward-and-reverse-strands/

I am nevertheless still a little confused. A highly upvoted answer to 1) says "The designation is arbitrary", while 2) seems to indicate that there is indeed a convention for human chromosomes. I am still wondering, though, about the circular bacterial chromosome. Is the designation arbitrary in that case?

sequencing strand dna • 11k views
1

I don't believe that bacterial chromosomes have any biologically significant "polarizing" features that would give any kind of inherent directionality to the origin of replication. As far as I'm aware, replication begins at the origin of replication and proceeds symmetrically in both directions along the chromosome. So any kind of strand designation would necessarily be arbitrary, because there is no inherent strandedness to a circular chromosome. For linear chromosomes, the choice also seems to be arbitrary. In theory things could be chosen so that the plus strand always starts closest to the centromere, and this is the case in all human chromosomes, but as you have noticed, it is not the case for other organisms.

0

There is no contradiction between saying that there is a convention and that the conventional designation is arbitrary.

10
10.3 years ago
Ryan Thompson ★ 3.6k

Do you remember the concept of "frames of reference" from physics? In physics class, when you say something has a velocity of 5 meters per second, you really need to specify relative to what. (Luckily, there's usually a conveniently-located planet in the vicinity of the physics experiment that can be used as a frame of reference).

Here, when you talk about things like plus/minus strand, sense/antisense strand, forward/reverse strand, etc., you must pay attention to the precise meanings in terms of what the strandedness is relative to. For example, with Illumina paired-end sequencing, you get two reads, commonly called "forward" and "reverse" reads, or even just "Read 1" and "Read 2". In this case, the terms "forward" and "reverse" refer only to the orientation of each read relative to the other, and do not in themselves imply anything about their orientation relative to, say, the coding strand of the original RNA molecule in an RNA-seq experiment. When you use a strand-specific protocol, these reads will have an assigned strandedness relative to the original RNA molecule, and which read gets which orientation depends on the protocol. For example, a particular protocol might be designed so that the reverse read represents the actual sequence of the RNA, while the forward read represents the reverse-complement sequence.
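To make the last example concrete, here is a minimal Python sketch of that hypothetical protocol. The fragment sequence, the read length, and the helper name `reverse_complement` are all made up for illustration; no real library kit is being described.

```python
# Complement table for unambiguous DNA bases (N maps to itself).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical fragment, written 5'->3' in the same sense as the original
# RNA (with T in place of U). The sequence is invented for this example.
fragment = "ATGGCCTTAGACCGT"
read_len = 5  # assumed read length, far shorter than a real read

# Hypothetical protocol from the text: the "reverse" read (Read 2) reports
# the RNA sequence itself, read from the 5' end of the fragment ...
read2 = fragment[:read_len]
# ... while the "forward" read (Read 1) is read from the other end, on the
# opposite strand, so it is the reverse complement of the fragment's 3' end.
read1 = reverse_complement(fragment)[:read_len]

print("Read 1:", read1)
print("Read 2:", read2)
```

Neither read is "forward" in any absolute sense: each only has an orientation relative to the other read and, because this imagined protocol is strand-specific, relative to the RNA.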

So, to summarize, the pair of terms forward/reverse is quite generic and its meaning is highly dependent on context. The pairs plus/minus and sense/antisense do have well-defined biological meanings, but you should be aware that they are often abused to mean the same thing as forward/reverse (frequently, but not always, by non-biologists), so regardless of the terms used you should probably ask the person you are working with to draw a diagram. If they can't do it, then they are just as lost as you are with respect to the strandedness of your data.

And I haven't even gotten to the relative positioning of reads, or circularization-based mate-pair sequencing.

1
10.3 years ago

[Off topic given the new details on this question]

In order to get strand information, you will have to generate a library with a strand-specific protocol. Most RNA-Seq protocols are not strand-specific. Here is a paper comparing different strand-specific protocols: http://www.nature.com/nmeth/journal/v7/n9/abs/nmeth.1491.html

0
10.3 years ago
Vikas Bansal ★ 2.4k

[After final editing by OP, this answer is also off topic.]

As I misunderstood the question-

I just want to add that you need to prepare the library using a strand-specific protocol (as Leonor said) if you want strand information. Related to this, I found this post on biology.stackexchange.

"I am not sure if I understood your question fully, but I will try. The choice is not arbitrary. When you sequence DNA, you will get reads (maybe 100 bp, 150 bp, or longer in the case of a 454 sequencer), and then you will map these reads to a reference genome (e.g. using BWA or Bowtie), and the mapping will tell you the strand on which each read mapped.

Note: I have not mentioned de novo assembly because I am not sure about that."
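As a rough illustration of "the mapper tells you the strand": aligners such as BWA and Bowtie write SAM output, where the FLAG field (column 2) has bit 0x10 set when the read aligned as its reverse complement and bit 0x4 set when it did not align at all. The sketch below only decodes that field; the function name `mapped_strand` and the example FLAG values are chosen here for illustration.

```python
from typing import Optional

# Bit values defined in the SAM format specification.
FLAG_UNMAPPED = 0x4   # read did not align
FLAG_REVERSE = 0x10   # read aligned as its reverse complement


def mapped_strand(flag: int) -> Optional[str]:
    """Return '+' or '-' for a mapped read, or None if the read is unmapped."""
    if flag & FLAG_UNMAPPED:
        return None
    return "-" if flag & FLAG_REVERSE else "+"


# A few example FLAG values as they might appear in column 2 of a SAM file.
for flag in (0, 16, 99, 147, 4):
    print(flag, mapped_strand(flag))
```

This reports only the reference strand to which the read aligned. Whether that says anything about the strand of the original molecule depends on the library protocol.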

0

If you don't have a strand-specific protocol, the mapping direction does not give you any information about the 'real' direction!

0

So does that mean that if, let's say, I have sequenced a human genome and mapped my reads to the reference genome, I cannot say on which strand my read mapped?

0

If you do not have a strand-specific library, no, you cannot. In other words, you will have a read, which of course maps to one strand or the other, but this read comes from the sequencing of an RNA library which has lost its strand information, i.e. both strands of an RNA are present. In another question (http://biostars.org/post/show/46757/learn-sequencing-practically) you say you are interested in learning sequencing, and several people suggested you read about library preparation. I could not agree more; this is where the real methodological choices are made.
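The distinction can be sketched in Python. The library-type labels below are borrowed from the TopHat/Cufflinks `--library-type` option (`fr-unstranded`, `fr-firststrand`, `fr-secondstrand`); the function name `transcript_strand` and the assumption that reads arrive as plain SAM FLAG integers are mine, for illustration only. With an unstranded library the function deliberately returns `None`: the mapped strand is known, but the strand of the original RNA is not.

```python
from typing import Optional

# Bit values defined in the SAM format specification.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_REVERSE = 0x10
FLAG_READ2 = 0x80


def transcript_strand(flag: int, library_type: str) -> Optional[str]:
    """Infer the strand of the original RNA from one aligned read.

    Returns '+' or '-', or None when the strand cannot be inferred
    (unmapped read, or a library that did not preserve strand).
    """
    if flag & FLAG_UNMAPPED or library_type == "fr-unstranded":
        return None
    if library_type not in ("fr-firststrand", "fr-secondstrand"):
        raise ValueError(f"unknown library type: {library_type}")

    mapped_minus = bool(flag & FLAG_REVERSE)
    is_read2 = bool(flag & FLAG_PAIRED) and bool(flag & FLAG_READ2)

    # fr-secondstrand: read 1 has the same sense as the RNA, read 2 the opposite.
    # fr-firststrand (e.g. dUTP-style): read 1 is antisense, read 2 is sense.
    read_is_sense = (library_type == "fr-secondstrand") != is_read2

    rna_minus = mapped_minus if read_is_sense else not mapped_minus
    return "-" if rna_minus else "+"


# Same alignment (FLAG 99: read 1 of a pair, mapped to '+'), three libraries.
for lib in ("fr-unstranded", "fr-firststrand", "fr-secondstrand"):
    print(lib, transcript_strand(99, lib))
```

The same FLAG value yields three different answers depending on how the library was made, which is why the protocol has to be known before any strand can be assigned.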

1

Ahh! I misunderstood the question. I was thinking about the strand to which our read mapped, but I think you mean the actual strand from which the read originally came?

0

Yes, this is exactly what I meant. My apologies for formulating an unclear question. I found http://www.biostars.org/post/show/3423/forward-and-reverse-strand-conventions/, which says that the choice of strand is arbitrary, but I was unsure whether this is really the case.

1

Could you edit your question to make it more precise? I don't understand whether you are talking about (i) NGS sequencing and the strand from which a given read comes or (ii) the designation of one strand as 'forward' and one strand as 'reverse' in the description of a double-stranded DNA.

0

Edited my answer. And yes, if you are not using a strand-specific library, then you cannot tell the original strand of a read (the strand from which the read originally comes).