Multiple adapter sequences in raw FASTQ files?
7.0 years ago

Dear all,

I have 50 RNA-Seq samples for a project. FastQC reports different TruSeq adapter indexes (mostly 2, 5, 7 and 14) for each of them, and some of them may also contain Illumina Single End PCR Primer 1.

The quality scores look very good (>30 for most reads), but the K-mer content fluctuates between bases 42 and 50 (as in the attached image). TruSeq Illumina index adapters are 65 bases long, but I found only 52 of those bases in the reads. I have two questions:

1- Should I treat each file based on its own overrepresented sequences?
2- How can I trim the TruSeq index adapters (just those 52 bases)?

Thanks, Vahid

[Image: FastQC K-mer content plot]

RNA-Seq next-gen • 2.3k views
7.0 years ago
GenoMax 141k

Use bbduk.sh from the BBMap suite. BBMap ships all common adapters in an adapters.fa file, which you will find in the resources directory of the downloaded software. It is fine to scan your data against all of them at the same time.
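For reference, a minimal sketch of such a bbduk.sh call for one single-end file. The file names and the path to adapters.fa are placeholders, and the k-mer settings (ktrim=r k=23 mink=11 hdist=1) are commonly used adapter-trimming values rather than something stated in this thread, so adjust them to your data. The helper only prints the command so it can be checked before running.

```shell
#!/bin/sh
# Assumed location of the adapter file shipped in BBMap's resources directory.
ADAPTERS="bbmap/resources/adapters.fa"

# Print (do not run) a bbduk.sh adapter-trimming command for one FASTQ file.
# ktrim=r trims the adapter match and everything to its right (3' end).
adapter_trim_cmd() {
    in_fq="$1"
    out_fq="${in_fq%.fastq.gz}.trimmed.fastq.gz"
    echo "bbduk.sh in=$in_fq out=$out_fq ref=$ADAPTERS ktrim=r k=23 mink=11 hdist=1"
}

adapter_trim_cmd sample01.fastq.gz
```

Once the printed line looks right, it can be piped to sh or pasted into the terminal to run it for real.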


Thanks @genomax2. How can I crop the first 13 bases from the start of the reads?


If this is RNA-Seq data, then don't. See this post for an explanation.

If you still want to, check the ftl= option of bbduk.sh.
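If you do decide to crop, here is a minimal sketch with placeholder file names. It assumes ftl (forcetrimleft) is 0-based, as described in the bbduk.sh help text, so ftl=13 removes positions 0 to 12, i.e. the first 13 bases. Please check the help output of your version to confirm. As above, the helper only prints the command.

```shell
#!/bin/sh
# Print (do not run) a bbduk.sh command that force-trims the left end of reads.
# $1 = input FASTQ, $2 = output FASTQ, $3 = number of leading bases to remove
# (assumes ftl is a 0-based position, so ftl=13 drops bases 0..12).
force_trim_cmd() {
    echo "bbduk.sh in=$1 out=$2 ftl=$3"
}

force_trim_cmd trimmed.fastq.gz cropped.fastq.gz 13
```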


Thanks for your help.
