I'm trying to trim and filter low-quality reads from paired-end exome-seq data using BBDuk.
I used the command:
for ea in $files; do
    R1="$ea"
    R2=$(echo $R1 | sed "s/R1/R2/")
    /home/shared/programs/bbmap/bbduk.sh -Xmx1g in1=$R1 in2=$R2 \
        out1="$(echo $ea | sed s/.fastq.gz/_trimmed_filtered.fastq.gz/)" \
        out2="$(echo $(echo $ea | sed s/R1/R2/) | sed s/.fastq.gz/_trimmed_filtered.fastq.gz/)" \
        ref=/home/shared/programs/bbmap/resources/adapters.fa \
        t=10 ktrim=r k=23 kmin=11 hdist=1 maq=10 minlen=60 tpe tbo
done;
After running FastQC on the output, I see that the R2 files still contain some reads with low quality scores (see the per-sequence quality score plot), as well as the overrepresented sequence "NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN". Shouldn't quality filtering (maq=10) have removed these reads?
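To check this independently of FastQC, here is a minimal sketch that counts the reads in a gzipped FASTQ file whose sequence consists only of N. It assumes standard 4-line FASTQ records (no wrapped sequence lines); the function name count_all_n_reads and the example file name are hypothetical, not part of BBDuk or FastQC.

```shell
# Count reads whose sequence line is made up entirely of N.
# Assumes 4-line FASTQ records, so the sequence is line 2 of each record.
count_all_n_reads() {
    gzip -dc "$1" | awk 'NR % 4 == 2 && /^N+$/ { n++ } END { print n + 0 }'
}

# Example call (hypothetical file name following the output naming used in the loop):
# count_all_n_reads sample_R2_trimmed_filtered.fastq.gz
```

Running this on both the R1 and R2 outputs of the same sample should show whether the all-N reads are confined to R2, and how many there are.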
Any help here would be much appreciated.