8.1 years ago
szc0049 • 0

Hi,

I used Trimmomatic to trim my Illumina HiSeq reads with an adapter list that I downloaded from here.

After assembling the trimmed reads, I tried to upload the assembly to the TSA database at NCBI, but the submission was rejected with an error saying that my sequences are contaminated with primer sequences. Using VecScreen, I found one of the contamination sources: CCCTACACGACGCTCTTCCGATCT. This sequence is actually contained in one of the adapter sequences in the list:

>TruSeq_Universal_Adapter
AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT


So my question is: why was this sequence not trimmed off by Trimmomatic?

Is there any way to remove the contamination from the assembly, so that I don't have to go back and reassemble the sequences?

next-gen-sequencing RNA-Seq • 5.8k views
8.1 years ago

Trimming does not look for every subsequence of the adapter. It only detects an adapter beginning at its start and continuing towards its end (at variable lengths), because that is how adapters normally show up in reads. In your case it seems something more unusual is present.
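To illustrate the point, here is a toy sketch that contrasts a search anchored at the adapter's start with a search over all adapter subsequences. It is not Trimmomatic's actual algorithm (which uses seed matching and alignment scoring); the function names, the flanking bases in the example read, and the 8-base minimum length are all assumptions made for illustration.

```python
# Toy model only: exact matching, no mismatches, no scoring.
ADAPTER = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
FRAGMENT = "CCCTACACGACGCTCTTCCGATCT"  # contaminant reported by the screen


def prefix_anchored_hit(read, adapter, min_len=8):
    """Position in the read where the first min_len adapter bases occur, or -1."""
    return read.find(adapter[:min_len])


def any_substring_hit(read, adapter, min_len=8):
    """Position in the read of the first adapter k-mer (any offset) found, or -1."""
    for start in range(len(adapter) - min_len + 1):
        pos = read.find(adapter[start:start + min_len])
        if pos != -1:
            return pos
    return -1


# Hypothetical read: the contaminant fragment with made-up flanking bases.
read = "ACGTTGCA" + FRAGMENT + "GGATCCAT"

print("fragment is the 3' end of the adapter:", ADAPTER.endswith(FRAGMENT))
print("anchored at adapter start:", prefix_anchored_hit(read, ADAPTER))
print("any adapter subsequence:  ", any_substring_hit(read, ADAPTER))
```

In this toy model the start-anchored search misses the read, because the read contains only the tail of the adapter, while the exhaustive search finds it.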


Is there any script or program to remove the contamination from the assembly?


Your assembly is very likely incorrect.

Adapter contamination shared by two reads gives them a common region, which may lead the assembler to join them into a single contig, potentially connecting regions that are not actually adjacent.

You will need to filter out the contaminated reads: align the reads against the contaminant sequences, remove the reads that align, and reassemble with the remaining reads.
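As a rough sketch of that filtering step, the snippet below drops every FASTQ read that contains a contaminant or its reverse complement. It uses exact substring matching as a simplified stand-in for a real alignment, so it will miss hits with mismatches or partial overlaps. The function names and file names are hypothetical, and it assumes plain four-line FASTQ records. For paired-end data, both mates of a flagged pair would have to be removed together, which this sketch does not handle.

```python
# Simplified stand-in for "align reads against contaminants and drop the hits".
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def filter_reads(in_handle, out_handle, contaminants):
    """Write reads without any contaminant to out_handle; return (kept, dropped)."""
    patterns = set()
    for contaminant in contaminants:
        patterns.add(contaminant.upper())
        patterns.add(reverse_complement(contaminant.upper()))
    kept = dropped = 0
    for header, seq, qual in read_fastq(in_handle):
        if any(p in seq.upper() for p in patterns):
            dropped += 1
            continue
        out_handle.write(f"{header}\n{seq}\n+\n{qual}\n")
        kept += 1
    return kept, dropped


# Example usage (hypothetical file names):
# with open("reads.fastq") as fin, open("reads.clean.fastq", "w") as fout:
#     print(filter_reads(fin, fout, ["CCCTACACGACGCTCTTCCGATCT"]))
```

The cleaned reads would then go back into the assembler; an alignment-based tool is the better choice for real data.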

11 months ago
Jiacheng ▴ 30

Trimming accuracy varies between trimmers. I'd recommend Atria to identify and trim the adapter sequences. It is a recently published trimmer with exceptional precision and speed.

For a concise trimming benchmark, you can click here.

You can also find a more comprehensive trimming benchmark in Atria's paper.


To check whether your trimmer is behaving as expected, it's good to visualise the output. Trimviz can do that if you just feed it your untrimmed and trimmed FASTQ files. https://github.com/MonashBioinformaticsPlatform/trimviz