Complex adaptor trimming
2.0 years ago
linc1464 • 0

Hello everyone,

I'm a bioinformatics novice looking for help. I'm using Galaxy to remove adaptors from sequencing reads, but it's tricky and I would like some advice on the approach. Here's the experiment.

I have 50 bp PE reads. The 5' end of read 1 contains adaptor, then 3x G. The 5' end of read 2 contains 15x T derived from polyadenylation during the library prep. I would like to trim the G's off the 5' end of read 1 and the T's from the 5' end of read 2. In addition, for any inserts shorter than 50 bp, the 3' end of read 2 will contain 3x C (the complement of the 3x G) and the 3' end of read 1 will have 15x A (the complement of the T's). Is there an additional trick to remove these instances too?

Thanks for any help!

sequencing Trimming galore trim • 967 views

It always helps to post data rather than only describing the problem. Please post some example reads and the expected output. This can be done on the command line (CLI), and similar tools (e.g. cutadapt, bbduk) are available in Galaxy.


So, reads will take the following format:

Read 1
5' ADAPTOR - GGG - then the mapped bit I want (size ~20-100 nts) - AAAAAAAAAAAAAAA-ADAPTOR 3'

Read 2
5' ADAPTOR - TTTTTTTTTTTTTTT - then the mapped bit I want (size ~20-100 nts) - CCC - ADAPTOR 3'

I have 50 bp paired-end reads and want to remove the ADAPTOR - GGG from the 5' end and the AAAAAAAAAAAAAAA - ADAPTOR from the 3' end, leaving the bit in the middle. Unfortunately, I'm only able to use Galaxy (I have very limited programming knowledge).


Is this paired-end data? And do you have two FASTQ files (R1 and R2)? In that case you can upload both files to Galaxy and use cutadapt or fastp in paired-end mode. I think people here need that info to be able to give a good answer.


Thank you. Yes, I have two files per sample (read 1 and read 2).


Is there real sequence in your read where you have added the word ADAPTOR above?


You can use bbduk.sh from the BBMap suite in two-pass mode on the command line, like this.

$ more test.fq
@M12345:751:000000000-F345F:1:1101:18044:1642 1:N:0:GATCTATC+ATGAGGCT
CGGTTCATCTCAGAGATCTCATGCTTGGTGTTGCGGAGGTCATCGCCATG
+
ABBAABFFFFFFGGCGGGGGGGHHHHHGFFHGHHGGGGGEGFFHHGGGGE
@M12345:751:000000000-F345F:1:1101:17624:1642 1:N:0:GATCTATC+ATGAGGCT
ACTGACTGACTGGGGCTCCAATTATGCCACCAGCCACCAGGCCACGCAGGCCTACGTTTATCCTAAAAAAAAAAAAA
+
AAAAAAAAAAAAAAAABAABFFFFFFFGGGGGGGGGGHHHGHGHGGGGGGGGHHHGHHHHGHHHHHHHHHHHHHHHH
@M12345:751:000000000-F345F:1:1101:16214:1642 1:N:0:GATCTATC+ATGAGGCT
CCAGCTTTATTGAAACCTATTACAGAAGACAATCCAAATAAAACCACTGT
+
AAAAAFFFFFFFGGGGGGGGGGHHHGHHHHHHHHHHHHHHGHHHGHGHHH
@M12345:751:000000000-F345F:1:1101:15835:1659 1:N:0:GATCTATC+ATGAGGCT
ACTGACTGACTGGGCCTTGGGTGGTTCAGTCAAAGAGGTAAGACCTCCAGCTGGCTCACAAGAGAAAAAAAAAAAA
+
BBBBBBBBBBBBBBBBBBAFA3ADBAGGGGGGGGGGHGGFG4EGHHHGHCHHCHGHHHHHHHGHHHHHHHHHHHHH

With the command

bbduk.sh -Xmx2g in=test.fq out=stdout.fq literal=ACTGACTGACTGGG ktrim=l k=10 | bbduk.sh -Xmx2g in=stdin.fq out=stdout.fq literal=AAAAAAA ktrim=r k=6 int=f

This will produce

@M12345:751:000000000-F345F:1:1101:18044:1642 1:N:0:GATCTATC+ATGAGGCT
CGGTTCATCTCAGAGATCTCATGCTTGGTGTTGCGGAGGTCATCGCCATG
+
ABBAABFFFFFFGGCGGGGGGGHHHHHGFFHGHHGGGGGEGFFHHGGGGE
@M12345:751:000000000-F345F:1:1101:17624:1642 1:N:0:GATCTATC+ATGAGGCT
GCTCCAATTATGCCACCAGCCACCAGGCCACGCAGGCCTACGTTTATCCT
+
AABAABFFFFFFFGGGGGGGGGGHHHGHGHGGGGGGGGHHHGHHHHGHHH
@M12345:751:000000000-F345F:1:1101:16214:1642 1:N:0:GATCTATC+ATGAGGCT
CCAGCTTTATTGAAACCTATTACAGAAGACAATCC
+
AAAAAFFFFFFFGGGGGGGGGGHHHGHHHHHHHHH
@M12345:751:000000000-F345F:1:1101:15835:1659 1:N:0:GATCTATC+ATGAGGCT
CCTTGGGTGGTTCAGTCAAAGAGGTAAGACCTCCAGCTGGCTCACAAGAG
+
BBBBAFA3ADBAGGGGGGGGGGHGGFG4EGHHHGHCHHCHGHHHHHHHGH
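
To make the two passes easier to follow, here is a minimal pure-Python sketch of the same idea: first drop the left-hand literal and everything 5' of it, then drop a terminal run of A's. It is not bbduk's k-mer matching; it uses exact string matching, and the function names and the `min_run=7` threshold are assumptions made for illustration. In particular, it only trims a poly-A run that ends the read, so it would leave the third read untouched, whereas the bbduk output above shortened that read at an internal A-rich stretch.

```python
import re


def trim_left_literal(seq, qual, literal):
    """Drop the first exact occurrence of `literal` and everything 5' of it."""
    idx = seq.find(literal)
    if idx == -1:
        return seq, qual  # literal absent: leave the read unchanged
    cut = idx + len(literal)
    return seq[cut:], qual[cut:]


def trim_right_poly_a(seq, qual, min_run=7):
    """Drop a run of at least `min_run` A's that ends the read."""
    match = re.search(r"A{%d,}$" % min_run, seq)
    if match is None:
        return seq, qual
    return seq[:match.start()], qual[:match.start()]


def two_pass_trim(seq, qual, literal="ACTGACTGACTGGG"):
    """Pass 1: left literal. Pass 2: trailing poly-A. Qualities stay in sync."""
    seq, qual = trim_left_literal(seq, qual, literal)
    return trim_right_poly_a(seq, qual)


# Second read of test.fq from the post above.
seq = "ACTGACTGACTGGGGCTCCAATTATGCCACCAGCCACCAGGCCACGCAGGCCTACGTTTATCCTAAAAAAAAAAAAA"
qual = "AAAAAAAAAAAAAAAABAABFFFFFFFGGGGGGGGGGHHHGHGHGGGGGGGGHHHGHHHHGHHHHHHHHHHHHHHHH"

trimmed_seq, trimmed_qual = two_pass_trim(seq, qual)
print(trimmed_seq)
print(trimmed_qual)
```

For real data, a dedicated trimmer (bbduk, cutadapt, fastp) is the better choice, since it tolerates sequencing errors in the adaptor and handles read pairs together.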
2.0 years ago
gb ★ 2.2k

In short, in Galaxy you just:

  1. Upload both files
  2. Open the cutadapt tool from the tool menu
  3. Select "paired-end" as the first option
  4. For "FASTQ/A file #1", select your read 1 file
  5. For "FASTQ/A file #2", select your read 2 file
  6. Fill in the read 1 and read 2 adapters
  7. Execute the tool
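
For reference, steps 3 to 7 correspond roughly to a single paired-end cutadapt call on the command line. The sketch below only assembles that command as an argument list; it does not run anything. The file names are hypothetical placeholders, and the adapter values must be replaced with the real sequences from your library prep. `A{15}` is cutadapt's repeat notation for 15 A's, used here as an example because read 1 is described above as ending in 15x A; `-a`/`-A` are the 3' adapter options for read 1 and read 2, and `-o`/`-p` name the two output files.

```python
import shlex


def build_cutadapt_command(r1_in, r2_in, r1_out, r2_out,
                           r1_adapter=None, r2_adapter=None):
    """Assemble a paired-end cutadapt command as an argument list."""
    cmd = ["cutadapt"]
    if r1_adapter:
        cmd += ["-a", r1_adapter]  # 3' adapter to remove from read 1
    if r2_adapter:
        cmd += ["-A", r2_adapter]  # 3' adapter to remove from read 2
    cmd += ["-o", r1_out, "-p", r2_out, r1_in, r2_in]
    return cmd


# Hypothetical file names; only the read 1 poly-A tail is filled in here.
cmd = build_cutadapt_command(
    "read1.fastq", "read2.fastq",
    "trimmed_1.fastq", "trimmed_2.fastq",
    r1_adapter="A{15}",
)
print(shlex.join(cmd))
```

The 5' sequences (GGG on read 1, the T's on read 2) are deliberately left out: how best to remove them depends on whether the adaptor itself is present in the reads, which the thread does not settle.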