Issue with cDNA PCR artifact in RNA-Seq dataset
5 months ago

I am working on a large RNA-Seq dataset. In half of my samples, approximately 2-5% of reads are duplicates of the sequence "AAGCAGTGGTATCAACGCAGAGTACTTTTTTTTTTTTTTTTTTTTTTTTTTT". As I understand it, this sequence is a primer from the cDNA step of RNA-Seq library preparation. It produces a clear divide between my samples: in MDS space, one axis separates samples by type and the other separates them by whether or not they contain this artifact.

How should I deal with this? Can I completely remove this sequence from my files? How should I go about that?

PCR RNA-seq artifact cDNA
5 months ago

Several tools can remove such sequences, BBDuk for example. However, I do not think the divide arises from the artifact alone. Rather, the affected samples may be low-input or low-complexity samples, which tend to suffer from this artifact the most.
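
Before filtering anything, it may help to measure how much of each sample is affected. Below is a minimal Python sketch, not a replacement for BBDuk or another dedicated trimmer: it takes the primer portion of the sequence quoted in the question (without the poly-T tail), splits FASTQ records into kept and flagged, and reports the flagged fraction. The function names are made up for illustration, and it assumes well-formed four-line FASTQ records, exact matches only, and the forward orientation only (no reverse complement, no mismatches).

```python
from typing import Iterable, Iterator, List, Tuple

# Primer portion of the sequence quoted in the question, without the poly-T tail.
PRIMER = "AAGCAGTGGTATCAACGCAGAGTAC"

# (header, sequence, separator line, qualities)
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from an iterable of lines, assuming 4 lines per record."""
    block: List[str] = []
    for line in lines:
        block.append(line.rstrip("\n"))
        if len(block) == 4:
            yield (block[0], block[1], block[2], block[3])
            block = []


def split_by_primer(
    records: Iterable[FastqRecord], primer: str = PRIMER
) -> Tuple[List[FastqRecord], List[FastqRecord]]:
    """Return (kept, flagged); a read is flagged if it contains the primer exactly."""
    kept: List[FastqRecord] = []
    flagged: List[FastqRecord] = []
    for record in records:
        if primer in record[1]:
            flagged.append(record)
        else:
            kept.append(record)
    return kept, flagged


def artifact_fraction(records: Iterable[FastqRecord], primer: str = PRIMER) -> float:
    """Fraction of reads containing the primer (0.0 for empty input)."""
    kept, flagged = split_by_primer(records, primer)
    total = len(kept) + len(flagged)
    return len(flagged) / total if total else 0.0


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python primer_check.py sample.fastq
    with open(sys.argv[1]) as handle:
        print(f"{artifact_fraction(parse_fastq(handle)):.4f}")
```

Comparing this fraction with per-sample input amount or library complexity would be one way to check whether the MDS split tracks the artifact itself or the underlying sample quality.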
