Identify adapter sequences for trimming from Illumina paired end fastq files
5.8 years ago

Hi,

I am working with unaligned Illumina paired-end data. I would like to first identify the adapter sequences present in the data and then trim the reads accordingly. Is there a way to identify the adapter sequences? Please let me know which tools I can use for this.

Thank you, Toufiq

RNA-Seq Adapter trimming QC Fastq • 10k views
5.8 years ago
GenoMax 147k

Use the BBMap suite (reproduced from here):

If you have paired reads, and enough of the reads have inserts shorter than read length, you can identify adapter sequences with BBMerge, like this (they will be printed to adapters.fa):

bbmerge.sh in1=r1.fq in2=r2.fq outa=adapters.fa

If you instead want to check which known adapter sequences are present, you can use the adapters.fa file included with BBMap as the reference. In that case, you can do this:

bbduk.sh in1=r1.fq in2=r2.fq k=23 ref=adapters.fa stats=stats.txt

stats.txt will then list the names of adapter sequences found, and their frequency.
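
To pull the most frequent hits out of stats.txt, something like the following sketch may help. It assumes the stats file consists of '#'-prefixed header lines followed by tab-separated rows of adapter name, read count, and percentage; check your own file first, since the layout can differ between versions. The function name top_adapters is made up for this example.

```shell
# Print the most frequent adapter hits from a BBDuk stats file.
# Assumption: header lines start with '#', data rows are tab-separated
# as: name <TAB> read count <TAB> percentage.
top_adapters() {
    # $1: path to the stats file; $2: number of rows to show (default 10)
    grep -v -e '^#' -e '^$' "$1" \
        | sort -t "$(printf '\t')" -k2,2nr \
        | head -n "${2:-10}"
}
```

For example, `top_adapters stats.txt 5` would list the five adapters with the highest read counts.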


Thank you. I was able to identify the adapters in R1.fq and R2.fq. Now I would like to know whether these are 5' forward/reverse or 3' forward/reverse adapters. Is there a way to identify this?

5.8 years ago

fastp is a new tool that is almost as fast as bbduk and can automatically detect 5' or 3' adapters for both single-end and paired-end data (for paired-end data, detection must be enabled manually with --detect_adapter_for_pe).

The adapters are inferred by analyzing the tails of the first ~1M reads, so this may not be ideal if you have more complicated or multiple adapters.
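
Independently of fastp, a rough way to see where an already identified adapter sits in the reads is to tabulate its start position. This is only a sketch: it assumes an uncompressed, four-line-per-record FASTQ file and exact matches, and the function name adapter_positions is made up for this example. Hits that start partway through the read and run to its end suggest a 3' adapter from inserts shorter than the read length, while hits starting at position 1 would point to the 5' end.

```shell
# Tabulate the 1-based start position of an adapter within the reads.
# Assumption: plain FASTQ with exactly four lines per record, so the
# sequence is every line where (line number mod 4) == 2.
adapter_positions() {
    # $1: FASTQ file; $2: adapter sequence (or its first ~12 bases)
    awk -v ad="$2" 'NR % 4 == 2 { p = index($0, ad); if (p > 0) print p }' "$1" \
        | sort -n \
        | uniq -c
}
```

Each output row is a count followed by a start position. A call such as `adapter_positions r1.fq AGATCGGAAGAGC` uses a common Illumina adapter prefix purely as an example; substitute the sequence found in your own data.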


Thank you. I ran this program; however, it did not report any specific adapter.

./fastp -i <input1> -I <input2> -o R1.fastq.gz -O R2.fastq.gz --disable_adapter_trimming --detect_adapter_for_pe --html Report_sample.html

The .html file only reports the duplication rate, insert size estimation, before/after filtering read quality, before/after filtering base content, and before/after k-mer counting.


If you use "--disable_adapter_trimming" then it does not search for adapters...
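
As a sketch, the same run with adapter detection left enabled could look like this. The wrapper function name is made up for this example; the output and report file names are taken from the command above.

```shell
# Run fastp on a read pair with paired-end adapter detection enabled.
# Compared with the command above, --disable_adapter_trimming is dropped
# so that fastp actually searches for and trims adapters.
fastp_detect_pe() {
    # $1: R1 input FASTQ; $2: R2 input FASTQ
    fastp -i "$1" -I "$2" \
        -o R1.fastq.gz -O R2.fastq.gz \
        --detect_adapter_for_pe \
        --html Report_sample.html
}
```

Call it as `fastp_detect_pe <input1> <input2>`; any detected adapter should then appear in the report.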


Thank you. Another question: is it recommended to trim adapters for Illumina paired-end data with 2 x 150 bp reads?


If they are present, they should be trimmed, especially if you are going to do any de novo work with your data.
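
For trimming with BBDuk, a minimal sketch based on the usual adapter-trimming settings from the BBTools documentation is shown below. The wrapper function name and the output file names are made up for this example, and the parameters are a common starting point, not values tuned for this dataset.

```shell
# Trim adapters from the 3' end of paired reads with BBDuk.
# ktrim=r  trim the adapter match and everything to its right
# k=23     k-mer length used for matching, as in the command above
# mink=11  allow shorter k-mers at read ends
# hdist=1  tolerate one mismatch per k-mer
# tpe      trim both reads of a pair to the same length
# tbo      also trim adapters detected by overlapping the pair
trim_adapters() {
    # $1: R1 input FASTQ; $2: R2 input FASTQ; $3: adapter FASTA
    bbduk.sh in1="$1" in2="$2" \
        out1=trimmed_r1.fq out2=trimmed_r2.fq \
        ref="$3" ktrim=r k=23 mink=11 hdist=1 tpe tbo
}
```

For example, `trim_adapters r1.fq r2.fq adapters.fa` would write the trimmed reads to trimmed_r1.fq and trimmed_r2.fq.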
