Does this adapter content from FastQC warrant trimming prior to alignment?
13 hours ago
curious ▴ 890

Hi, I am looking to align some human short-read WGS data with BWA-MEM prior to variant calling. I have 62 samples, each with 8 pairs of FASTQs (992 FASTQs total). I ran FastQC/MultiQC on all of them and got this result from MultiQC:

[image: MultiQC report screenshot]

For comparison here is an individual fastqc:

[image: individual FastQC report screenshot]

Spot-checking a handful of individual FastQC reports, it seems that the flat universal adapter line hovering around 2% and the poly-A line creeping up over the length of the read are pretty typical. Does this require trimming prior to alignment?

According to this, the threshold for a warning is 5% and for a failure is 10%, but I am just wondering what the general practice is.
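For reference, the quoted thresholds can be written down as a tiny check. This is only a sketch: the function name `adapter_status` is made up for illustration, and it assumes the 5%/10% cutoffs are applied to the highest adapter percentage seen at any position in the read.

```python
def adapter_status(max_adapter_pct, warn=5.0, fail=10.0):
    """Classify an adapter-content percentage against the quoted thresholds.

    max_adapter_pct: highest adapter percentage seen at any read position
    (assumption: this is the value the thresholds are compared against).
    """
    if max_adapter_pct > fail:
        return "fail"
    if max_adapter_pct > warn:
        return "warn"
    return "pass"


# The ~2% adapter line described above would sit below both cutoffs.
print(adapter_status(2.0))
```

So by those numbers the data would pass the module, which is why the question is about general practice rather than the flag itself.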

Thanks!

fastqc bwa-mem multiqc • 839 views

I would trim it. Two percent is not much, but with WGS, say you have 400 million reads per sample or so, it's still millions of reads that could theoretically contaminate variant calls. Do it once, properly, and never worry about it again. That said, we are spoiled with a free-to-use, powerful HPC at our university, so all we do is wait for the job to complete. If you're on a budget, say you need to pay for HPC or a cloud service, you might want to skip it, but I would still feel safer trimming.
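To put a rough number on "millions of reads", here is the back-of-the-envelope arithmetic. The 400 million reads per sample is the illustrative figure from this comment and the 2% is the approximate adapter level from the question, so neither is a measured value; `reads_with_adapter` is just a hypothetical helper name.

```python
def reads_with_adapter(total_reads, adapter_fraction):
    """Estimate how many reads carry adapter sequence."""
    return round(total_reads * adapter_fraction)


# Illustrative inputs only: 400 million reads, ~2% adapter content.
n = reads_with_adapter(400_000_000, 0.02)
print(f"{n:,} reads")  # 8,000,000 reads
```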

11 hours ago
GenoMax 153k

Scanning/trimming is an investment in time, but if you choose to do it, it will ensure that no extraneous sequence is present.

That said, aligners should soft-clip the adapter sequence at the time of alignment, so technically you do not need to trim the data.
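If you want to see how much soft-clipping actually happened after alignment, you can read it off the CIGAR strings in the SAM/BAM output (`S` is the soft-clip operation). Below is a minimal sketch in plain Python. The helper name `soft_clipped_bases` and the example CIGAR strings are made up for illustration, and a soft clip can have causes other than adapter, so treat the counts as a hint only.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_bases(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips, so ignore them here.
    ops = [(n, op) for n, op in ops if op != "H"]
    lead = ops[0][0] if ops and ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return lead, trail


# Hypothetical 150 bp read with 30 bases clipped at the end of the alignment.
print(soft_clipped_bases("120M30S"))  # (0, 30)
```

Note that reads aligned to the reverse strand are stored reverse-complemented in SAM, so sequence from the 3' end of the original read can show up as a leading clip.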

Note: Those sequences that show adapter starting at cycle 1 are likely all primer dimers and have no inserts.
