Question: High Sequence Duplication levels in FastQC
5.2 years ago
EVR570 wrote:


I am new to NGS analysis. I have RNA-seq samples generated on an Illumina platform. When I loaded the samples into FastQC, I noticed a high sequence duplication level (> 70%) for all samples, although the per base sequence quality is very good for all of them. How should I proceed with these data? Is this normal for Illumina samples? Any guidance would be highly appreciated.

Thanks in advance 


5.2 years ago
New York
Sam3.2k wrote:

For RNA-seq, it is normal to see a high sequence duplication level. That is because RNA-seq inherently measures expression through the depth of coverage of each gene. A highly expressed gene contributes many reads from the same limited set of positions, so you are likely to encounter duplicates.
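To make that concrete, here is a toy simulation (a rough sketch, not FastQC's actual duplication algorithm). It treats a read as a (gene, start position) pair and compares evenly expressed genes with a library where a few genes dominate. The gene counts, weights, and function names are all made up for illustration.

```python
import random

def duplication_rate(reads):
    """Fraction of reads that are exact repeats of an earlier read."""
    return 1 - len(set(reads)) / len(reads)

def simulate_reads(gene_weights, n_reads, positions_per_gene=1000, seed=0):
    """Draw reads as (gene, start) pairs; genes are picked in proportion to weight.

    This is a hypothetical toy model: real reads also differ by length,
    sequencing errors, and fragment ends.
    """
    rng = random.Random(seed)
    genes = rng.choices(range(len(gene_weights)), weights=gene_weights, k=n_reads)
    return [(g, rng.randrange(positions_per_gene)) for g in genes]

n_genes = 1000
n_reads = 100_000

# Every gene equally abundant (closer to what uniform coverage looks like).
uniform = [1.0] * n_genes
# Ten genes expressed 1000x higher than the rest (assumed, RNA-seq-like skew).
skewed = [1000.0 if i < 10 else 1.0 for i in range(n_genes)]

uniform_rate = duplication_rate(simulate_reads(uniform, n_reads))
skewed_rate = duplication_rate(simulate_reads(skewed, n_reads))

print(f"uniform expression: {uniform_rate:.1%} duplicated")
print(f"skewed expression:  {skewed_rate:.1%} duplicated")
```

With the same number of reads, the skewed library shows a far higher duplication level simply because most reads come from a handful of genes, not because anything went wrong in library preparation.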

For exome sequencing, by contrast, you would prefer uniform coverage, so a high duplication level might indicate a problem.

Edit: someone else has asked a similar question before: "High Duplication Rate of Mapped In RNA-seq"

