comparing 2 datasets, one with high PCR duplicates

4.1 years ago

wiscoyogi ▴ 40

I have two datasets that I want to compare.

My problem is that one of the datasets had a lot of PCR duplicates, so its number of unique molecules is particularly low and it has fewer counts overall. The resulting count values make it hard to draw biological conclusions when comparing against my other dataset, which did not have PCR duplicates.

Do you have any suggestions for what transformations are available so that I can make a fair comparison between the two datasets?

PCR duplicates data standardization RNA-Seq • 692 views

How many replicates are in these datasets?

ADD REPLY • link 4.1 years ago by ATpoint 82k

I had 16 biological replicates and there were no technical replicates.

ADD REPLY • link 4.1 years ago by wiscoyogi ▴ 40

As said, you typically do not care about duplicates in RNA-seq. I would run the data through the DGE (differential gene expression) pipeline and see whether the results are reasonable. Also check with a PCA whether the samples cluster as expected.
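A minimal sketch of the PCA check suggested above, written for illustration and not taken from any poster's pipeline. It assumes a genes x samples matrix of raw counts, scales each sample by its library size (log2 counts per million) so that the shallower dataset is not separated by depth alone, and then projects the samples onto the first two principal components. The function names `log_cpm` and `pca_scores` and the simulated counts are hypothetical; a real analysis would more likely use the normalization built into a DGE package.

```python
import numpy as np


def log_cpm(counts, prior=1.0):
    """Convert a genes x samples count matrix to log2 counts per million."""
    counts = np.asarray(counts, dtype=float)
    lib_sizes = counts.sum(axis=0)           # total counts per sample
    cpm = counts / lib_sizes * 1e6           # scale every sample to one million counts
    return np.log2(cpm + prior)              # prior avoids log2(0)


def pca_scores(log_expr, n_components=2):
    """Return per-sample PCA scores and the fraction of variance explained."""
    x = log_expr.T                           # samples as rows, genes as columns
    x = x - x.mean(axis=0)                   # center each gene
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]


# Toy data (assumption): 200 genes, 8 samples, the last 4 sequenced
# at roughly one tenth of the effective depth of the first 4.
rng = np.random.default_rng(0)
base = rng.gamma(shape=2.0, scale=50.0, size=200)
depth = np.array([1.0] * 4 + [0.1] * 4)
counts = rng.poisson(base[:, None] * depth[None, :])

scores, explained = pca_scores(log_cpm(counts))
print(scores.shape)
print(explained)
```

If the first component still splits the samples purely by dataset after this kind of scaling, that points to a batch or library-complexity effect that should be accounted for in the DGE design rather than ignored.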

ADD REPLY • link 4.1 years ago by ATpoint 82k

"there were a lot of PCR duplicates"

How did you conclude that? You can't be absolutely certain about PCR duplicates unless you had UMIs (unique molecular identifiers).
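To illustrate why UMIs matter here, the following sketch compares two ways of calling duplicates on a handful of made-up reads. Without UMIs, every read sharing a mapping position is counted as a duplicate, even though highly expressed transcripts produce many independent fragments at the same position. With UMIs, only reads sharing both position and UMI are treated as copies of one molecule. The read tuples and the function `duplication_rate` are hypothetical and only show the idea; this is not how any particular tool implements it.

```python
def duplication_rate(keys):
    """Fraction of reads that are redundant given the chosen identity key."""
    keys = list(keys)
    if not keys:
        return 0.0
    return 1.0 - len(set(keys)) / len(keys)


# Toy reads (assumption): (chromosome, start, strand, UMI)
reads = [
    ("chr1", 100, "+", "AACG"),
    ("chr1", 100, "+", "AACG"),  # same position and UMI: likely a PCR copy
    ("chr1", 100, "+", "TTGC"),  # same position, different UMI: distinct molecule
    ("chr2", 500, "-", "GGAT"),
]

# Position only: all reads at chr1:100 collapse into one.
rate_position = duplication_rate((c, p, s) for c, p, s, _ in reads)

# Position plus UMI: the read with UMI TTGC is kept as its own molecule.
rate_umi = duplication_rate(reads)

print(rate_position, rate_umi)
```

The gap between the two rates is the share of reads that a position-only estimate would wrongly flag as PCR duplicates in this toy case.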