After browsing similar questions and trying the "friendly" tools available, I have concluded that adapter removal is not trivial at all for non-expert users, at least not for some datasets. So I have a few questions. If you could help me with any of them, I would really appreciate it.
- How do I know which adapters are present in my reads? (The FastQC report shows several hits for Illumina Multiplexing PCR primer 2.0.1, but clipping its sequence doesn't clean all reads, and the reports keep showing this contamination.) Shouldn't I be able to tell the adapter just from the library prep kit used?
- Why don't all reads have adapters?
- If I use Cutadapt with the first 13 bp of the Illumina universal adapter (AGATCGGAAGAGC), over half of my dataset is lost in clipping (20Gb to 9Gb). FastQC also still shows adapter contamination afterwards. Can I trust this clipping?
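
To make the last question more concrete, one check I could run is to count how many reads contain that 13 bp sequence and where in the read it starts. Below is a minimal sketch of that idea, not a replacement for Cutadapt. It assumes plain 4-line FASTQ records and only counts exact matches, so it will underestimate what Cutadapt finds (Cutadapt tolerates mismatches and partial matches at the 3' end). The function names and the `summarize` helper are hypothetical, made up for illustration.

```python
import gzip
from collections import Counter

# First 13 bp of the Illumina universal adapter, as quoted in the question.
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def iter_fastq_sequences(lines):
    """Yield the sequence line of each record, assuming 4 lines per record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def adapter_positions(sequences, adapter=ADAPTER_PREFIX):
    """Count exact adapter matches by the position where they start.

    Returns (total_reads, Counter mapping start position -> read count).
    Reads without an exact match are counted in total_reads only.
    """
    total = 0
    positions = Counter()
    for seq in sequences:
        total += 1
        pos = seq.find(adapter)  # -1 when the adapter is absent
        if pos != -1:
            positions[pos] += 1
    return total, positions


def summarize(path):
    """Hypothetical helper: summarize one FASTQ file (plain or gzipped)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return adapter_positions(iter_fastq_sequences(handle))
```

With something like `total, positions = summarize("reads.fastq.gz")` (the file name is a placeholder), `sum(positions.values()) / total` gives the fraction of reads with an exact hit. If many hits start at or near position 0, those reads are mostly adapter with little or no insert, which might explain why so much data disappears after clipping.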