duplicate reads in RNAseq datasets
7 weeks ago
Bioinfonext ▴ 470

Hi, I ran MultiQC on raw RNAseq datasets and it shows a high level of read duplication. Do I need to take any steps for these datasets before proceeding to read quantification? I am using Trimmomatic to remove low-quality reads and adapter sequences, but I am not sure whether any other steps are needed. MultiQC report attached. Many thanks.

RNAseq R • 518 views
7 weeks ago
GenoMax 153k

Is this total RNAseq or mRNAseq data? If it is total RNAseq, the duplication may be mostly rRNA.

Some duplication is expected in RNAseq since there will be multiple copies of transcripts for some of the genes. If the amount of starting material was limited and the person making the libraries went a bit overboard with PCR cycles to generate enough material, that can lead to PCR duplicates. It would be difficult to identify that issue for certain unless the libraries were made with UMIs.
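To make "duplication" concrete, here is a minimal sketch that counts exact-sequence duplicates in a list of reads. It is a simplified measure for illustration only, not the calculation FastQC or MultiQC performs, and the function name `duplication_rate` and the toy reads are made up. Note that it cannot tell PCR duplicates apart from independent reads of a highly expressed transcript, which is exactly the ambiguity described above.

```python
from collections import Counter


def duplication_rate(sequences):
    """Fraction of reads that are exact copies of an earlier read."""
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every read beyond the first copy of each distinct sequence is a duplicate.
    duplicates = total - len(counts)
    return duplicates / total


# Toy reads (hypothetical): one abundant sequence dominates the sample.
reads = ["ACGT", "ACGT", "ACGT", "TTGA", "GGCC", "GGCC"]
print(f"duplication rate: {duplication_rate(reads):.2f}")
```

A sample dominated by a few very abundant sequences (rRNA, for example) will score high on a measure like this even if no PCR over-amplification happened.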

You may want to move forward with the analysis and see how things go.

Thanks for your response. This is total RNAseq data.

Just try aligning the reads and see what happens; you can then check which transcripts are most abundant. You might also consider aligning to rRNA sequences, tossing out the reads that match, and running QC again.
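As a rough sketch of those two checks, assuming you already have per-feature read counts and a list of read IDs that aligned to an rRNA reference (both produced by whatever aligner and counting tool you use), something like this would do. The function names, feature names, and read IDs are hypothetical and only there for illustration.

```python
def top_features(counts, n=3):
    """Return the n features with the most reads as (name, reads, share of all reads)."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, reads, reads / total if total else 0.0) for name, reads in ranked[:n]]


def drop_rrna_reads(read_ids, rrna_read_ids):
    """Keep only reads whose IDs did not align to the rRNA reference."""
    rrna = set(rrna_read_ids)
    return [rid for rid in read_ids if rid not in rrna]


# Toy inputs (made up): counts per feature after alignment.
counts = {"rRNA_28S": 600, "rRNA_18S": 250, "geneA": 100, "geneB": 50}
for name, reads, share in top_features(counts, n=2):
    print(f"{name}\t{reads}\t{share:.0%}")

# Toy read IDs (made up): r2 and r4 are assumed to have hit the rRNA reference.
kept = drop_rrna_reads(["r1", "r2", "r3", "r4"], rrna_read_ids=["r2", "r4"])
print(kept)
```

If a handful of rRNA features take most of the reads, that would be consistent with the high duplication level, and rerunning QC on the remaining reads should show whether anything else is going on.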
