Question

Remove duplication reads for RNAseq?


4.4 years ago

ilwook.kim.1982 • 0

Hi everyone, I'm new to bioinformatics. I have a question about how to handle duplicate reads in RNA-seq. Should I remove duplicates (natural and artificial)?

I tried using dupRadar in R to assess duplication. For one sample, it gives the output below:

> fit <- duprateExpFit(DupMat=dm)
> cat("duprate at low read counts: ", fit$intercept, "\n",
+     "progression of the duplication rate: ", fit$slope, fill=TRUE)
duprate at low read counts:  1.159792
 progression of the duplication rate:  2.319875

I don't know exactly what these values mean. Please let me know whether or not I should remove duplicates before further analysis.

Thanks a lot in advance.

RNA-Seq • 981 views

ADD COMMENT • link updated 4.4 years ago by GenoMax 141k • written 4.4 years ago by ilwook.kim.1982 • 0

score 2 · Answer 1 · 2019-11-28


4.4 years ago

GenoMax 141k

Should I remove duplicates (natural and artificial)?

No, you should not remove duplicates. Unless you used UMIs, you can't really determine whether identical reads are PCR duplicates or natural duplicates anyway.

More here: Duplicated Reads In Rna-Seq Experiment
Should We Remove Duplicated Reads In Rna-Seq ?
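
To make the "natural duplicates" point concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the linked threads. It assumes single-end reads whose start positions are uniformly random along a transcript and no PCR duplication at all; the function name and the numbers are made up for illustration.

```python
def expected_duplicate_fraction(n_reads: int, n_positions: int) -> float:
    """Expected fraction of reads that share a start site with an earlier read.

    Assumes n_reads independent fragments, each starting uniformly at random
    on one of n_positions possible start sites (a simplification).
    """
    if n_reads <= 0:
        return 0.0
    # Probability that a given start site receives no read at all.
    p_empty = (1.0 - 1.0 / n_positions) ** n_reads
    # Expected number of start sites hit by at least one read.
    expected_distinct = n_positions * (1.0 - p_empty)
    # Every read beyond the first at a site would be flagged as a duplicate.
    return 1.0 - expected_distinct / n_reads


# Hypothetical transcript with 1000 possible start sites, at three expression levels.
for n_reads in (10, 1_000, 10_000):
    frac = expected_duplicate_fraction(n_reads, n_positions=1_000)
    print(f"{n_reads:>6} reads on 1000 start sites: {frac:.1%} expected duplicates")
```

Under these assumptions a highly expressed gene shows a high duplication rate purely by chance, which is why position-based duplicate removal tends to penalize exactly the genes with the most real signal.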

ADD COMMENT • link 4.4 years ago by GenoMax 141k


Thank you so much for your answer. I'm still a bit confused after reading the discussions in those links, though. It looks like the answer depends on the conditions. Could the dupRadar tool solve the problem? Please let me know.

ADD REPLY • link 4.4 years ago by ilwook.kim.1982 • 0


Even if there is a problem (with PCR duplicates), it is not possible to solve it cleanly unless you used unique molecular indexes (UMIs, which add cost and complexity). UMIs label individual RNA molecules before they are converted to cDNA and amplified. dupRadar itself only assesses the duplication rate as a function of expression level; it does not remove duplicates. If you did remove duplicates without UMIs, you would be throwing away good counts and might skew your downstream analysis.
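
As a minimal illustration of why UMIs make the difference, the Python sketch below counts molecules in a handful of toy reads two ways. The read tuples, UMI sequences, and function names are all invented for this example, and it ignores sequencing errors in the UMI, which real UMI-aware tools have to handle.

```python
# Toy aligned reads: (chromosome, start, strand, UMI). All values are made up.
reads = [
    ("chr1", 100, "+", "AACG"),
    ("chr1", 100, "+", "AACG"),  # same position AND same UMI: PCR duplicate
    ("chr1", 100, "+", "TTGA"),  # same position, different UMI: natural duplicate
    ("chr1", 100, "+", "GCAT"),  # another distinct molecule at the same position
    ("chr1", 250, "+", "AACG"),  # different position, so a different molecule
]


def count_position_only(reads):
    """Molecules kept if duplicates are defined by alignment position alone."""
    return len({(chrom, start, strand) for chrom, start, strand, _umi in reads})


def count_umi_aware(reads):
    """Molecules kept if duplicates must also share the same UMI."""
    return len(set(reads))


print("raw reads:          ", len(reads))
print("position-only dedup:", count_position_only(reads))
print("UMI-aware dedup:    ", count_umi_aware(reads))
```

In this toy case, position-only deduplication collapses three distinct molecules at chr1:100 into one, while the UMI-aware count drops only the single true PCR copy.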