Multiple adapter sequences in raw FASTQ files?
4.7 years ago

Dear all,

I have 50 RNA-Seq samples for a project. FastQC reports different TruSeq adapter indexes (mostly 2, 7, 5 and 14) for each of them, and some of them may also contain Illumina Single End PCR Primer 1.

The quality scores are very good (>30 for most reads), but there are fluctuations in the K-mer plot between bases 42-50 (as in the attached image). The TruSeq Illumina index sequences are 65 bases long, but I found only 52 of those bases in the reads. I have two questions:

1- Should I treat each file based on its own overrepresented sequences?

2- How can I trim the TruSeq indexes (just those 52 bases)?

Thanks, Vahid

RNA-Seq next-gen
4.7 years ago
GenoMax

Use bbduk.sh from the BBMap suite. BBMap includes all common adapters in an adapters.fa file, which you will find in the resources directory when you download the software. It is OK to scan for all of them at the same time against your data.
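As a rough sketch of what that could look like for many samples, the Python helper below only builds the bbduk.sh command line (it does not run it). The parameter values (ktrim=r, k=23, mink=11, hdist=1, and tpe/tbo for pairs) are the adapter-trimming settings commonly shown in the BBDuk guide, not something stated in this thread, and the file names and the helper name are hypothetical.

```python
def bbduk_adapter_trim_cmd(fastq_in, fastq_out,
                           adapters="resources/adapters.fa",
                           fastq_in2=None, fastq_out2=None):
    """Build (but do not run) a bbduk.sh adapter-trimming command.

    Assumption: 'adapters' points at the adapters.fa file shipped in the
    BBMap resources directory; adjust the path to your installation.
    """
    cmd = [
        "bbduk.sh",
        f"in={fastq_in}",
        f"out={fastq_out}",
        f"ref={adapters}",  # scan all bundled adapters at once
        "ktrim=r",          # trim the matched adapter and everything to its right
        "k=23",             # k-mer length used for matching
        "mink=11",          # allow shorter k-mers at the read end
        "hdist=1",          # tolerate one mismatch per k-mer
    ]
    if fastq_in2 and fastq_out2:
        # Paired-end input: add the mate files and the pair-aware options.
        cmd += [f"in2={fastq_in2}", f"out2={fastq_out2}", "tpe", "tbo"]
    return cmd


# Hypothetical single-end sample name, for illustration only.
print(" ".join(bbduk_adapter_trim_cmd("sample01.fastq.gz",
                                      "sample01.trimmed.fastq.gz")))
```

The returned list can be passed to subprocess.run in a loop over the 50 samples, so every file is scanned against the same adapter set instead of being treated individually.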

Thanks @genomax2. How can I crop the first 13 bases from the start of the reads?

If this is RNAseq data then don't. See this post for an explanation.

If you still want to, then check the ftl= option for bbduk.sh.
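For illustration, here is a small Python sketch of the effect ftl=13 would have on a single read. It assumes the behaviour described in the BBDuk documentation (ftl, short for forcetrimleft, removes the bases to the left of the given 0-based position), so ftl=13 drops the first 13 bases and their quality values. The function name and the toy read are hypothetical; the actual trimming would be done by something like bbduk.sh in=reads.fq out=trimmed.fq ftl=13.

```python
def force_trim_left(sequence, qualities, ftl):
    """Mimic the effect of bbduk.sh ftl=<n> on one read.

    Assumption: ftl removes positions 0 .. ftl-1 (0-based), which is what
    the BBDuk documentation describes for forcetrimleft.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality strings must have equal length")
    return sequence[ftl:], qualities[ftl:]


# Toy 20-base read with uniform quality, for illustration only.
seq = "ACGT" * 5
qual = "I" * 20
trimmed_seq, trimmed_qual = force_trim_left(seq, qual, 13)
print(len(seq), "->", len(trimmed_seq))
```

The sequence and quality strings are always cut together, so the trimmed record stays a valid FASTQ entry.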

