Should contaminants be trimmed if the assembled sequence contains the same vector sequences as the reference genome?
16 months ago
rajeefa ▴ 10


I have paired-end data (150 bp reads) from E. coli samples, obtained through DNB sequencing. I identified the strain of my samples with StrainSeeker, used that strain's genome as the reference, and assembled the sequence. When I run VecScreen on the assembly, I get many hits to contaminants and vectors, including phage, plasmids, etc. What confuses me is that the reference strain's genome from the NCBI database gives the same VecScreen results. I already tried trimming with SeqClean, and the number of bases dropped drastically (from 4.9M bases to 36,973 bases). So my question is: should I disregard the VecScreen result and move on to annotation and downstream analysis?

I am a beginner in WGS data analysis, so sorry if this is a naive question. Any solution would be really helpful, since my analysis is stuck at this point.

assembly vecscreen trimming WGS contaminants

