Removing overrepresented primer sequences

5.0 years ago

Ml6237 • 0

Hello all,

I have paired-end RNA-seq data. The read files I was given were already "cleaned" (adapters and low-quality reads removed). For one pair of files, FastQC raised a warning for an overrepresented RNA-seq primer sequence (0.105% of reads, 96% match over 29 bp), and only in the reverse read file. I assume this needs to be trimmed and would appreciate some advice.

I was planning to trim these with Trimmomatic (paired-end mode) by adding the primer sequence to the adapters FASTA file.

An example adapters FASTA file would look something like this:

>PrefixPE/1
TACACTCT....
>PrefixPE/2
GTGACTG.....
>RNA-Seq_PCR_Primer
CAAGC.....
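
To make the file layout concrete, here is a minimal Python sketch of how such an adapters file and the matching Trimmomatic step could be put together. The sequences are made-up placeholders (not the real adapters or primer), the helper names `format_adapter_fasta` and `illuminaclip_step` are hypothetical, and the 2:30:10 thresholds are only the values commonly shown in Trimmomatic examples, so treat all of them as assumptions to adjust.

```python
def format_adapter_fasta(records):
    """Return FASTA text for a list of (name, sequence) pairs."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")      # header line
        lines.append(seq.upper())     # sequence on a single line
    return "\n".join(lines) + "\n"


def illuminaclip_step(fasta_path, seed_mismatches=2,
                      palindrome_threshold=30, simple_threshold=10):
    """Build an ILLUMINACLIP step string for the Trimmomatic command line.

    The default thresholds are assumptions taken from common examples.
    """
    return (f"ILLUMINACLIP:{fasta_path}:{seed_mismatches}:"
            f"{palindrome_threshold}:{simple_threshold}")


# Placeholder sequences only; substitute the real adapter/primer sequences.
records = [
    ("PrefixPE/1", "ACGTACGTACGT"),
    ("PrefixPE/2", "TGCATGCATGCA"),
    ("RNA-Seq_PCR_Primer", "GATTACAGATTACA"),
]

print(format_adapter_fasta(records), end="")
print(illuminaclip_step("custom_adapters.fa"))
```

The FASTA text would be saved to a file (here assumed to be `custom_adapters.fa`) and the printed step appended to the usual Trimmomatic PE command.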

For the "RNA-Seq PCR Primer" entry, which sequence would you add to the adapters FASTA file:

1) just the overrepresented sequence identified by FastQC, which is 50 bp long?

2) the full primer sequence, which is 100 bp long? The full primer sequence can be found in FastQC's contaminant_list.txt file.
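
One way to compare the two options before trimming is to count how many reads in the reverse file actually contain each candidate sequence (or a stretch of it). The sketch below is only a rough diagnostic under stated assumptions: it expects plain 4-line FASTQ records, uses exact substring matching rather than Trimmomatic's alignment, and runs on a tiny made-up in-memory example. The function name `count_reads_with` and the demo reads are hypothetical.

```python
import io


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def count_reads_with(handle, query):
    """Count FASTQ reads containing `query` or its reverse complement.

    Assumes unwrapped 4-line FASTQ records; matching is exact.
    Returns (hits, total_reads).
    """
    query = query.upper()
    rc = reverse_complement(query)
    hits = total = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record is the sequence
            total += 1
            seq = line.strip().upper()
            if query in seq or rc in seq:
                hits += 1
    return hits, total


# Made-up demo reads; with real data, open the reverse-read FASTQ instead.
demo = io.StringIO(
    "@r1\nAAGATTACAGG\n+\nIIIIIIIIIII\n"
    "@r2\nCCCCCCCCCCC\n+\nIIIIIIIIIII\n"
    "@r3\nCCTGTAATCTT\n+\nIIIIIIIIIII\n"
)
hits, total = count_reads_with(demo, "GATTACA")
print(hits, total)  # 2 3
```

Running this with the 50 bp FastQC sequence and with pieces of the full primer would show whether the extra bases in the full primer ever occur in the reads.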

Thanks in advance.

RNA-Seq • 1.3k views
