Hi, apologies for this basic question, as I am new to the field. I have been checking my NGS data with FastQC, and it fails the 'Kmer Content' check. The sequence TAGATCGGAA appears at position 90-100 bp in the reads and is enriched about 12-fold (Obs/Exp Max). However, nothing shows up in the 'Overrepresented sequences' or 'Adapter Content' sections of the report (the adapter plot is a completely flat line). The sequencing was done at BGI and I do not know what primers or adapters were used. My question is: should I be trying to remove this k-mer sequence?
If I use grep to look for the sequence in the first million reads of my FASTQ file, I only find 36 matching lines (about 0.0036% of the reads), which seems like quite a low number:
$ gunzip -c data1.fq.gz | head -4000000 | grep TAGATCGGAA | wc -l
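One caveat with the command above is that grep matches any line, including quality strings, so a stricter check would look only at the sequence lines. Below is a minimal sketch of that, assuming plain 4-line FASTQ records with no wrapped sequences; `count_kmer_reads` is just a name I made up for illustration, and it only checks the forward orientation of the k-mer.

```shell
# Hypothetical helper: reads FASTQ on stdin and prints
# "<reads containing the k-mer> <reads checked>".
# Assumes strict 4-line FASTQ records (no wrapped sequence lines).
count_kmer_reads() {
    awk -v k="$1" '
        NR % 4 == 2 {                      # line 2 of each record is the sequence
            total++
            if (index($0, k) > 0) hits++   # plain substring match, no regex
        }
        END { print hits + 0, total + 0 }
    '
}

# Usage on the same file as above (skipped if the file is not present).
if [ -f data1.fq.gz ]; then
    gunzip -c data1.fq.gz | head -n 4000000 | count_kmer_reads TAGATCGGAA
fi
```

The two numbers printed are the count of reads with the k-mer and the number of reads examined, so the fraction can be read off directly.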
Thanks for the help.