Question

Multiple adapter sequences in raw FASTQ files?


7.0 years ago

mirzaei86.vahid ▴ 50

Dear all,

I have 50 RNA-Seq samples for a project. For each of them, FastQC reports different TruSeq Adapter indexes (mostly Index 2, 7, 5 and 14), and some of them may also contain Illumina Single End PCR Primer 1.

The quality scores are very good (>30 for most reads), but the K-mer content module shows fluctuations between bases 42 and 50 (as in the attached image). The TruSeq Illumina index adapters are 65 bases long; however, I found only 52 of those bases in the reads. I have two questions:

1- Should I treat each file based on its own overrepresented sequences?
2- How can I trim the TruSeq indexes (just those 52 bases)?

Thanks, Vahid

enter image description here

RNA-Seq next-gen • 2.3k views


score 1 · Accepted Answer · 2017-05-01


7.0 years ago

GenoMax 141k

Use bbduk.sh from the BBMap suite. BBMap includes all common adapters in an adapters.fa file, which you will find in the resources directory when you download the software. It is OK to scan for all of them against your data at the same time.
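As a rough illustration of the suggestion above, here is a minimal Python sketch that assembles a bbduk.sh command line for adapter trimming. The file names and the path to adapters.fa are hypothetical, and the `ktrim=r k=23 mink=11 hdist=1` values are commonly used starting points from the BBDuk usage guide rather than settings given in this thread, so adjust them to your data.

```python
import shlex


def bbduk_adapter_cmd(fastq_in, fastq_out, adapters="resources/adapters.fa"):
    """Build a bbduk.sh argument list for right-end adapter trimming.

    All paths are placeholders; the k-mer settings are assumed defaults,
    not values recommended in the thread.
    """
    return [
        "bbduk.sh",
        f"in={fastq_in}",
        f"out={fastq_out}",
        f"ref={adapters}",  # scan all bundled adapters at once
        "ktrim=r",          # trim the matched adapter and everything to its right
        "k=23",             # k-mer length used for adapter matching
        "mink=11",          # allow shorter k-mers at the read end
        "hdist=1",          # tolerate one mismatch per k-mer
    ]


cmd = bbduk_adapter_cmd("sample01.fastq.gz", "sample01.trimmed.fastq.gz")
print(shlex.join(cmd))
# To actually run it (requires BBMap on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Because the same adapters.fa is used for every file, the function could simply be called in a loop over all 50 samples instead of treating each file separately.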



Thanks @genomax2. How can I crop the first 13 bases from the start of the reads?

Reply • 7.0 years ago by mirzaei86.vahid ▴ 50


If this is RNA-seq data, then don't. See this post for an explanation.

If you still want to, check the ftl= option of bbduk.sh.
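For illustration only: assuming ftl= (force-trim-left) takes a 0-based position and removes everything to its left, a call such as `bbduk.sh in=reads.fq out=trimmed.fq ftl=13` (file names hypothetical) would drop the first 13 bases of every read. The small Python sketch below mimics that effect on a single made-up read; it is not BBDuk's implementation, and `force_trim_left` is a name invented here.

```python
def force_trim_left(seq, qual, n):
    """Drop the first n bases of a read together with their quality scores."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings must have equal length")
    return seq[n:], qual[n:]


# A made-up 20-base read with uniform quality characters
seq = "ACGTACGTACGTACGTACGT"
qual = "I" * len(seq)

trimmed_seq, trimmed_qual = force_trim_left(seq, qual, 13)
print(trimmed_seq, len(trimmed_seq))
```

The key point is that the sequence and quality lines are shortened by the same amount, so the FASTQ record stays consistent.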

Reply • 7.0 years ago by GenoMax 141k


Thanks for your help.

Reply • 7.0 years ago by mirzaei86.vahid ▴ 50