Question: Sequence Duplication levels
5 weeks ago
nageshprabhu.k

Hello all,

I have a question about the duplication levels shown in a MultiQC report. The raw data were preprocessed with HTStream, and MultiQC was used to check the results. When I ran MultiQC on the 3 lanes of the same sample separately, each lane showed about 40% duplication (L001, L002 and L003 were all at 40-41%), but the same sample showed about 60% duplication when all 3 lanes were combined. Can anyone explain the reason for this?

Any help is appreciated! Thank you.

5 weeks ago
United States
genomax

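As the amount of data increases, the reads added to the pool can lead to the discovery of more duplicates. Copies of the same fragment that were sequenced in separate lanes are not counted as duplicates within any single lane, but they are counted once the lanes are combined.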
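
To illustrate the effect, here is a minimal simulation sketch. It rests on assumptions that are not from this thread: every lane samples uniformly from the same library of unique fragments, and duplication is defined as 1 - (unique reads / total reads), which may differ from what HTStream or MultiQC actually computes. The names `duplication_rate` and `simulate`, and the library and lane sizes, are made up for the example.

```python
import random


def duplication_rate(reads):
    """Fraction of reads that are copies of a fragment already seen."""
    return 1 - len(set(reads)) / len(reads)


def simulate(library_size=100_000, reads_per_lane=100_000, lanes=3, seed=0):
    """Draw reads for each lane from one shared library (hypothetical sizes).

    Each read is represented by the integer ID of the fragment it came from,
    so two reads with the same ID are duplicates of each other.
    """
    rng = random.Random(seed)
    lane_reads = [
        [rng.randrange(library_size) for _ in range(reads_per_lane)]
        for _ in range(lanes)
    ]
    # Duplication measured within each lane separately
    per_lane = [duplication_rate(reads) for reads in lane_reads]
    # Duplication measured after pooling all lanes
    pooled = [read for reads in lane_reads for read in reads]
    combined = duplication_rate(pooled)
    return per_lane, combined


if __name__ == "__main__":
    per_lane, combined = simulate()
    for i, rate in enumerate(per_lane, start=1):
        print(f"lane {i}: {rate:.1%} duplication")
    print(f"combined: {combined:.1%} duplication")
```

Under these assumptions the per-lane rates come out similar to each other, and the combined rate is clearly higher, because a fragment seen once in each of two lanes only becomes a duplicate after pooling. Real libraries are not sampled uniformly, so the exact numbers will not match a real report.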

