I have single-end small RNA-seq data and need to trim the adapter (adapter sequence: AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC). After trimming the adapter with the bbduk command given further down, I checked the reads and they still contain overhangs of the adapter.
The adapter overhangs can be seen with:
cat bbduk_trimmed_small_RNA_001.fastq | grep TCGCAGGGAAATCATCTGATTA
or can be seen here: https://postimg.cc/image/knuvtyv0d/
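To quantify the leftover overhangs rather than spot-check them with grep, a minimal Python sketch could look like the one below. It assumes a plain, uncompressed FASTQ file with exactly four lines per record and only counts exact matches (no mismatches) between the 3' end of a read and a prefix of the adapter. The function names are hypothetical and not part of BBTools.

```python
from collections import Counter

# Adapter sequence quoted in the question.
ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"


def adapter_overhang_length(read, adapter=ADAPTER, min_len=1):
    """Length of the longest adapter prefix that the read ends with (0 if none)."""
    max_len = min(len(read), len(adapter))
    # Try the longest possible overhang first, then shorter ones.
    for n in range(max_len, min_len - 1, -1):
        if read.endswith(adapter[:n]):
            return n
    return 0


def overhang_histogram(fastq_path, min_len=1):
    """Count reads per overhang length; assumes 4-line FASTQ records."""
    counts = Counter()
    with open(fastq_path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of each record is the sequence
                counts[adapter_overhang_length(line.strip(), min_len=min_len)] += 1
    return counts


# Example call (file name taken from the question):
# print(overhang_histogram("bbduk_trimmed_small_RNA_001.fastq"))
```

Very short overhangs (1 to 3 bases) will show up often by chance alone, so the histogram is mainly useful for seeing whether longer overhangs survive the trimming.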
The trimming was done with bbduk.sh (part of BBTools: https://jgi.doe.gov/data-and-tools/bbtools/bb-tools-user-guide/bbduk-guide/). The NEB-SE.fa file contains the adapter sequence given above.
bbduk.sh -Xmx1g in=small_RNA_001.fastq out=bbduk_trimmed_small_RNA_001.fastq ref=NEB-SE.fa ktrim=r k=13 mink=6 minlength=18 hdist=0
I could reduce the k-mer size to k=4, but that would risk trimming false positives. How can I completely remove these adapter sequences from my data without trimming false positives?
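
To put the false-positive concern in rough numbers, here is a back-of-the-envelope sketch. It assumes uniformly random, independent bases and considers only the chance that the last k bases of a read happen to equal the first k bases of the adapter. It does not model how bbduk itself evaluates matches, and the function names are hypothetical.

```python
def random_match_probability(k):
    """Chance that k random bases equal one fixed k-mer (uniform A/C/G/T assumed)."""
    return 0.25 ** k


def expected_chance_matches(n_reads, k):
    """Expected number of reads whose last k bases match the adapter start by chance."""
    return n_reads * random_match_probability(k)


# Compare the k-mer lengths discussed in the question for one million reads.
for k in (4, 6, 13):
    print(k, random_match_probability(k), expected_chance_matches(1_000_000, k))
```

Under these assumptions the chance is 1 in 256 for k=4 versus 1 in 4096 for k=6. A tool that also matches adapter k-mers inside the read, not just at its end, would see more chance hits than this estimate.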