Duplication in DNA-seq vs RNA-seq
2.4 years ago
samc • 0

Why is duplication a greater concern in DNA-seq than RNA-seq?

Duplication • 1.1k views
2.4 years ago

Because in RNA-seq, it is normal to have what we call biological duplicates. These are reads with exactly the same sequence that originate from the original RNA extract, not from a PCR/cluster artifact. This happens because, in a cell, some RNA molecules are much more abundant than others. They are present in so many copies that the probability of sequencing the same fragment multiple times is quite high. Such natural, biological duplication is less common in whole-genome sequencing experiments (DNA-seq), where each chromosome is present in a single copy (or a few copies, depending on the ploidy) in each cell. Read duplication in such settings usually represents PCR duplicates or optical duplicates, two kinds of technical artifacts that are of greater concern than normal biological duplicates.
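To make the abundance argument concrete, here is a minimal simulation sketch. It assumes a toy model that is not part of the original answer: reads are drawn at random from a pool of distinct fragments, uniformly for the DNA-seq-like case and with a Zipf-like 1/rank weighting for the RNA-seq-like case. The pool size, read count, and weighting are all hypothetical choices for illustration, and no PCR or optical duplicates are modelled at all.

```python
import random


def duplicate_fraction(weights, n_reads, seed=0):
    """Fraction of reads that repeat a fragment already drawn.

    Each read is one random draw of a fragment index, with probability
    proportional to that fragment's weight (its relative abundance).
    """
    rng = random.Random(seed)
    draws = rng.choices(range(len(weights)), weights=weights, k=n_reads)
    return 1 - len(set(draws)) / n_reads


n_fragments = 100_000  # hypothetical number of distinct fragments
n_reads = 20_000       # hypothetical sequencing depth

# DNA-seq-like: every fragment is equally abundant (assumption).
uniform = [1.0] * n_fragments

# RNA-seq-like: a few fragments dominate; 1/rank weights are an assumption.
skewed = [1.0 / (rank + 1) for rank in range(n_fragments)]

dna_like = duplicate_fraction(uniform, n_reads)
rna_like = duplicate_fraction(skewed, n_reads)

print(f"uniform abundance: {dna_like:.1%} duplicate reads")
print(f"skewed abundance:  {rna_like:.1%} duplicate reads")
```

With the same number of reads and the same number of distinct fragments, the skewed case gives a much higher duplicate fraction, and every one of those duplicates is "biological" in the sense used above.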

Note that extra-deep coverage of any sequencing experiment (DNA-seq included) will tend to generate more natural duplicates, simply because of signal saturation.
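The saturation effect can be sketched with a short calculation. This assumes a simplified model that is not from the original answer: reads are sampled uniformly and independently from a library of `library_size` distinct fragments, so the expected number of distinct fragments seen after `n_reads` reads is `library_size * (1 - (1 - 1/library_size) ** n_reads)`. The library size and depths below are hypothetical.

```python
def expected_duplicate_fraction(n_reads, library_size):
    """Expected fraction of duplicate reads under uniform random sampling.

    Assumes every distinct fragment in the library is equally likely to
    be sequenced and that reads are drawn independently.
    """
    p_never_seen = (1 - 1 / library_size) ** n_reads
    expected_unique = library_size * (1 - p_never_seen)
    return 1 - expected_unique / n_reads


library_size = 1_000_000  # hypothetical number of distinct fragments
depths = [100_000, 1_000_000, 10_000_000]
fractions = [expected_duplicate_fraction(n, library_size) for n in depths]

for n, frac in zip(depths, fractions):
    print(f"{n:>10,} reads: {frac:.1%} expected duplicates")
```

Even with no PCR or optical artifacts in the model, the duplicate fraction climbs as the number of reads approaches and then exceeds the number of distinct fragments available.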


I'd say it's not that PCR duplication is less of a concern in RNA-seq than in DNA-seq; it's just harder to identify (that is, to identify duplication that is definitely PCR rather than biological), so there is little we can do about it when it is present.


I agree that optical and PCR duplicates are always concerning regardless of the -seq method, but duplicates in general are less so. For instance, in FastQC, the duplicate sequence metric almost always fails and raises a flag with RNA-seq data. Despite the flag, one should not be concerned, because this is normal.

I interpreted the OP's question as being about duplicate sequences in general, but you are right to clarify.
