How to remove multiple adapter sequences by Cutadapt
3.1 years ago

Currently I have two adapter sequences that need to be removed. I used the Cutadapt --times 2 option to run two rounds of trimming. However, the Cutadapt documentation mentions a drawback in the "Recipes and FAQ" section, under "Remove more than one adapter":

The problem is that it could happen that one adapter is found twice

I came up with an example: assume that we have to remove ADAPTER and NOTNEED from the following raw read.

sequenceAADAPTERDAPTERsequenceNOTNEEDsequence

I used the following command to perform two rounds of trimming:

cutadapt -b ADAPTER -b NOTNEED -n 2 -o trim_sequence.fastq raw_data.fastq

So I believe the first round removes ADAPTER, and the second round removes NOTNEED.

My question is: do the bases shift after ADAPTER is trimmed in the first round? If they do, cutadapt would detect ADAPTER again during the second round, which is meant to trim NOTNEED. How do I safely remove multiple adapters?

RNA-seq cutadapt adapter multipleadapters • 2.4k views

Normally, trimming programs remove the entire sequence to the right of where the adapter is found, so only the first adapter sequence would be needed. At least that is how bbduk.sh from the BBMap suite works in ktrim=r mode. A guide is available. You can also specify multiple adapter sequences on the command line with literal=seq1,seq2,seq3.. when trimming.


It helps to post some example data and the expected output instead of a problem description. The following is dummy data (not a real sequence):

input sequence:

$ cat test.fa
>seq
AATTTT`GTGTGT`ATGCG`GTGTGT`ATGACGTCGAT`GAGAGA`GCTCT

with multiple rounds of trimming:

$ cutadapt -g GTGTGT -a GAGAGA -n 1 test.fa --quiet

>seq
ATGCGGTGTGTATGACGTCGATGAGAGAGCTCT

$ cutadapt -g GTGTGT -a GAGAGA -n 2 test.fa --quiet

>seq
ATGACGTCGATGAGAGAGCTCT

$ cutadapt -g GTGTGT -a GAGAGA -n 3 test.fa --quiet
>seq
ATGACGTCGAT

$ cutadapt -g GTGTGT -a GAGAGA -n 4 test.fa --quiet
>seq
ATGACGTCGAT

-n 4 was not necessary. It is only there to show that any number of rounds beyond the number of adapter occurrences in the read (three here) does not change the output.
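To make the round-by-round behaviour above easier to follow, here is a small toy model in Python. It is only a sketch under simplifying assumptions: exact matches only, the leftmost occurrence is used, and adapters are tried in command-line order. It is not cutadapt's alignment algorithm (no error tolerance, no partial matches at the read ends), and the names trim_once and trim_rounds are made up for this illustration.

```python
def trim_once(read, adapters):
    """Apply the first adapter (in the order given) that occurs in the read.

    adapters is a list of (kind, sequence) pairs, where kind is
    "5p" (like -g) or "3p" (like -a). Returns (new_read, matched).
    Toy model: exact matching only, leftmost occurrence.
    """
    for kind, seq in adapters:
        pos = read.find(seq)  # -1 if the adapter is absent
        if pos == -1:
            continue
        if kind == "5p":
            # 5' adapter: drop the adapter and everything upstream of it
            return read[pos + len(seq):], True
        # 3' adapter: drop the adapter and everything downstream of it
        return read[:pos], True
    return read, False


def trim_rounds(read, adapters, times):
    """Repeat trim_once up to `times` rounds, stopping early if nothing matches."""
    for _ in range(times):
        read, matched = trim_once(read, adapters)
        if not matched:
            break
    return read


# The dummy read from test.fa above, without the highlighting backticks.
READ = "AATTTTGTGTGTATGCGGTGTGTATGACGTCGATGAGAGAGCTCT"
ADAPTERS = [("5p", "GTGTGT"), ("3p", "GAGAGA")]

for n in range(1, 5):
    print(n, trim_rounds(READ, ADAPTERS, n))
```

For this particular read the toy model gives the same sequences as the cutadapt outputs shown above for -n 1 to -n 4; on real data with mismatches or partial adapters the two can differ.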

But you can also use a linked adapter, as follows:

$ cutadapt -a GTGTGT...GAGAGA --times 2 test.fa --quiet

>seq
ATGACGTCGAT

Coming to your queries:

do the bases shift after trimming the ADAPTER in the first run? 

I do not know what you mean by shift here. If a regular 5' adapter is provided, the adapter and all sequence upstream of it are trimmed. If a regular 3' adapter is provided, the adapter and all sequence downstream of it are trimmed. No base repositioning happens; it is only upstream or downstream trimming, depending on the options given.

If it does, the cutadapt would detect ADAPTER again when it processes the second run of trimming NOTNEED.

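In each round (--times), only one adapter is removed. As far as I understand the cutadapt documentation, all given adapters are searched in every round and the best-matching one is removed. In the example above both adapters match equally well, so the order in which the options (-a, -g, etc.) appear decides, and the adapter given first is trimmed first.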

How do I safely remove multiple adapters?

There are multiple ways and multiple tools. Unless you provide example input and expected output, it is difficult to say.

3.1 years ago

Adapter trimming does not mean getting inside the read and deleting bases from it; it means removing the entire right side. It does not matter which adapter comes first. If you have:

---------AAA------BBB>

if you trim the sequence for AAA you will end up with:

-------->

if you trim the sequence for BBB you will end up with:

---------AAA------>

Note how trimming for AAA removed BBB as well.
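The AAA/BBB picture can be written as a few lines of Python. This is only an illustrative sketch of "cut at the adapter and drop everything to its right", with exact matching and a made-up helper name (trim_3prime); it is not how any particular trimmer is implemented.

```python
def trim_3prime(read, adapter):
    """Remove the adapter and everything to its right (exact match, toy model)."""
    pos = read.find(adapter)  # leftmost occurrence, -1 if absent
    return read if pos == -1 else read[:pos]


read = "---------AAA------BBB"

only_aaa = trim_3prime(read, "AAA")
only_bbb = trim_3prime(read, "BBB")
print(only_aaa)
print(only_bbb)

# Trimming for both adapters gives the same read in either order.
aaa_then_bbb = trim_3prime(trim_3prime(read, "AAA"), "BBB")
bbb_then_aaa = trim_3prime(trim_3prime(read, "BBB"), "AAA")
print(aaa_then_bbb == bbb_then_aaa)
```

Trimming for AAA leaves only the leading part of the read, so BBB is already gone, and running both trims in either order gives the same result.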

The rationale for being able to list multiple adapters is a little tricky and has to do with paired adapters (5' and 3'), as the manual explains. It is better to run the tool twice than to misunderstand what it does and use it incorrectly.
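As a rough sketch of what "running the tool twice" could look like, the Python snippet below builds two separate cutadapt command lines, one adapter per pass. The assumptions are mine: the adapters are treated as 3' adapters (-a), the intermediate file name pass1.fastq is hypothetical, and the helper two_pass_commands is made up for this example. Only the -a and -o options that already appear in this thread are used. The commands are printed rather than executed.

```python
import shlex


def two_pass_commands(adapter1, adapter2, raw, out, tmp="pass1.fastq", flag="-a"):
    """Build two cutadapt invocations, each trimming a single adapter.

    tmp is a hypothetical intermediate file; flag is the adapter option
    to use (assumed to be -a, i.e. a regular 3' adapter).
    """
    return [
        ["cutadapt", flag, adapter1, "-o", tmp, raw],   # pass 1: first adapter
        ["cutadapt", flag, adapter2, "-o", out, tmp],   # pass 2: second adapter
    ]


cmds = two_pass_commands("ADAPTER", "NOTNEED", "raw_data.fastq", "trim_sequence.fastq")
for cmd in cmds:
    print(shlex.join(cmd))
    # To actually run it, something like subprocess.run(cmd, check=True)
    # could be used, provided cutadapt is installed.
```

Keeping the passes separate makes it explicit which adapter is removed in which step, at the cost of writing an intermediate file.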
