Question

RNA-seq: How to know which paired reads come from the same original fragment?

1

7.1 years ago

psm ▴ 170

In paired-end RNA sequencing, can it definitively be known which two paired reads come from the same fragment, or is it inferred based on the distance between the reads?

RNA-Seq Assembly • 7.2k views

ADD COMMENT • link updated 4.0 years ago by wang-yanfang • 0 • written 7.1 years ago by psm ▴ 170

score 2 · Answer 1 · 2018-05-29

2

7.1 years ago

Tm ★ 1.1k

When you prepare a paired-end library from RNA fragments and sequence it, you get two files: one holding the reads from forward sequencing (usually denoted R1) and one holding the reads from reverse sequencing (usually denoted R2).

That means that for each fragment generated during library preparation, there is one forward sequence in R1 and its corresponding reverse sequence in R2. So the reads come off the sequencer already paired, and the two mates of a pair share the same read name/header.

Here you can see two reads, one from the R1 file and one from the R2 file. They represent the same RNA fragment, so they have the same read name except for the part highlighted in red and blue, which starts with 1 and 2 respectively, indicating that the first read is from the R1 file and the second is from the R2 file.
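To make the read-name matching concrete, here is a minimal Python sketch. The two headers are made up for illustration (they are not taken from real data), and `read_id` and `is_pair` are hypothetical helper names. It assumes the common header style in which the shared name is followed by a space and a field starting with 1 or 2, and it also handles the older style where the name ends in /1 or /2.

```python
def read_id(header: str) -> str:
    """Return the part of a FASTQ header that both mates share."""
    # Drop the leading '@' that starts every FASTQ header line.
    name = header[1:] if header.startswith("@") else header
    # Keep only the text before the first whitespace; the field after it
    # (e.g. '1:N:0:ATCACG') is where the mate number 1 or 2 lives.
    name = name.split()[0]
    # Older-style names mark the mate with a /1 or /2 suffix instead.
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def is_pair(header_r1: str, header_r2: str) -> bool:
    """True if the two headers carry the same shared read name."""
    return read_id(header_r1) == read_id(header_r2)


# Made-up example headers (assumption: not from a real run).
r1 = "@SIM:1:FCX:1:15:6329:1045 1:N:0:ATCACG"
r2 = "@SIM:1:FCX:1:15:6329:1045 2:N:0:ATCACG"
other = "@SIM:1:FCX:1:15:6330:1045 2:N:0:ATCACG"

print(is_pair(r1, r2))
print(is_pair(r1, other))
```

In practice the R1 and R2 files are also written in the same order, so the n-th record in one file is the mate of the n-th record in the other; a check like this is mainly useful for confirming that two files have not gone out of sync.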

ADD COMMENT • link 7.1 years ago by Tm ★ 1.1k

2

In addition to that, from a technical side: both the forward and the reverse read are detected at the exact same spot on the flow cell, which is why they are assigned the same name. Check this video for details on the Illumina process.

ADD REPLY • link 7.1 years ago by ATpoint 88k

0

Thank you - this is exactly what I wanted to know.

ADD REPLY • link 6.8 years ago by psm ▴ 170

0

You're very welcome ;-)

ADD REPLY • link 6.8 years ago by ATpoint 88k

0

Hey, may I ask a follow-up question to your answer? Should these two reads be the reverse complement of each other? When I looked, they did not appear to be reverse complementary. Could you help me with this?

ADD REPLY • link 4.0 years ago by wang-yanfang • 0

score 0 · Answer 2 · 2018-05-29

0

7.1 years ago

swbarnes2 15k

You can tell from the shared read name.

ADD COMMENT • link 7.1 years ago by swbarnes2 15k

1

Thanks for the answer. To clarify my (perhaps poorly worded) question: how is it determined which reads are paired? Through a bioinformatics algorithm, or through some feature of the adaptor/flowcell? Is it possible that two reads are incorrectly paired? Apologies for my ignorance - I had trouble finding the answer elsewhere.

ADD REPLY • link 7.1 years ago by psm ▴ 170

1

The instrument knows the xy coordinates of every cluster. Reads from the same xy coordinates are paired. If two clusters overlap such that the software can't distinguish them, it will throw them out. So no, the software can't mix up read pairs. And of course, the software knows the reads are paired long before any mapping coordinates are known.
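If you want to see those coordinates yourself, they are usually visible in the read name. The sketch below is only an illustration: it assumes the common Illumina header layout of seven colon-separated fields (instrument, run number, flowcell ID, lane, tile, x, y), the headers are made up, and `ClusterPos` and `cluster_position` are hypothetical names. Other instruments or renamed reads may not follow this layout.

```python
from typing import NamedTuple


class ClusterPos(NamedTuple):
    """Where on the flow cell a cluster was imaged."""
    lane: int
    tile: int
    x: int
    y: int


def cluster_position(header: str) -> ClusterPos:
    """Pull lane, tile, x and y out of an Illumina-style FASTQ header.

    Assumes the name before the first space has seven ':'-separated
    fields: instrument:run:flowcell:lane:tile:x:y.
    """
    name = header[1:] if header.startswith("@") else header
    fields = name.split()[0].split(":")
    if len(fields) != 7:
        raise ValueError(f"unexpected header layout: {header!r}")
    lane, tile, x, y = (int(value) for value in fields[3:7])
    return ClusterPos(lane, tile, x, y)


# Made-up mate headers (assumption: not from a real run).
r1 = "@SIM:1:FCX:1:15:6329:1045 1:N:0:ATCACG"
r2 = "@SIM:1:FCX:1:15:6329:1045 2:N:0:ATCACG"

# Both mates report the same cluster position, which is why they pair up.
print(cluster_position(r1))
print(cluster_position(r1) == cluster_position(r2))
```

Note that this only reads back what the instrument already recorded; the pairing itself is done on the sequencer, not by parsing names afterwards.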