Hi everyone, I’m improving my adapter-trimming pipeline and want to confirm if my current setup makes sense.
Current command:
cutadapt \
-a AGATCGGAAGAG -a AAAAAAAAAAAA -a GGGGGGGGGGGG \
-A AGATCGGAAGAG -A AAAAAAAAAAAA -A GGGGGGGGGGGG \
-j 128 -m 5 -Q 20 -q 20 -o R1.trim.fq.gz -p R2.trim.fq.gz R1.fq.gz R2.fq.gz
I include poly-A and poly-G because FastQC reports them. My reasoning is that even if they are not real adapters, the 5' ends are left untouched, so paired-end alignment should still recover any useful sequence. I used to think fastp could not remove everything FastQC flags, such as AAAAAAAAAAAA, but I now suspect the difference comes from fastp using a different minimum poly-A length than my 12-base pattern. Also, fastp cannot use more than 16 threads.
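To judge how many reads these two extra patterns would actually touch, a rough count of reads ending in a long homopolymer run seems useful. Below is a minimal sketch using only gzip and awk. The helper name count_tail and the 12-base threshold are my own choices for illustration, not anything taken from cutadapt, fastp, or FastQC, and it only looks at exact runs at the very 3' end (no mismatches, no internal runs), so it will undercount compared with what cutadapt matches.

```shell
# count_tail BASE MINLEN FILE
# Hypothetical helper: prints "hits/total", where hits is the number of reads
# whose sequence ends in at least MINLEN consecutive copies of BASE.
# Assumes a gzip-compressed FASTQ with exactly 4 lines per record.
count_tail() {
    base=$1; minlen=$2; file=$3
    gzip -dc "$file" | awk -v b="$base" -v n="$minlen" '
        NR % 4 == 2 {                  # sequence line of each record
            total++
            len = length($0); run = 0
            # walk backwards from the last base while it still equals BASE
            while (run < len && substr($0, len - run, 1) == b) run++
            if (run >= n) hits++
        }
        END { printf "%d/%d\n", hits, total }'
}
```

Running `count_tail A 12 R1.fq.gz` and `count_tail G 12 R1.fq.gz` (and the same on R2.fq.gz) gives the fraction of reads with an exact 3' poly-A or poly-G tail of 12 or more bases, which can be compared before and after trimming.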
Questions:
- Is it safe to include AAAAAAAAAAAA / GGGGGGGGGGGG, or does that risk over-trimming real poly(A) tails?
- Should I move to fastp with automatic adapter detection? Is there any real need to switch from cutadapt to fastp?
Example fastp call for comparison:
fastp -i R1.fq.gz -I R2.fq.gz -o R1.trim.fq.gz -O R2.trim.fq.gz \
--detect_adapter_for_pe
Thanks for any advice!