Need help in using cutadapt to trim paired end fastq reads
7 weeks ago
Akshay • 0

I am trying to use cutadapt for the first time to trim 300 paired-end fastq files generated by 16S rRNA gene amplicon sequencing of the V3-V4 regions. I am running this analysis on a high-performance computing cluster.

The V3-V4 regions of the 16S rRNA gene were amplified using a mixture of the universal bacterial primers 341F1–4 (5′ CCTACGGGNGGCWGCAG 3′) and 785R1–4 (5′ GACTACHVGGGTATCTAATCC 3′) with partial Illumina TruSeq adapter sequences added to the 5′ ends (F1; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, F2; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTgt, F3; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTagag, F4; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTtagtgt and R1; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT, R2; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTa, R3; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtct, R4; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTctgagtg).

So, I followed the cutadapt tutorial ( https://cutadapt.readthedocs.io/en/stable/guide.html#paired-end ) and used the following code from the tutorial to trim paired-end fastq reads:

cutadapt -a ADAPTER_FWD -A ADAPTER_REV -o out.1.fastq -p out.2.fastq reads.1.fastq reads.2.fastq

I made an sbatch file containing one cutadapt command for each of the 300 samples, putting the forward universal bacterial primer in the -a ADAPTER_FWD position and the reverse universal bacterial primer in the -A ADAPTER_REV position. An example of what I did is below:

cutadapt -a CCTACGGGNGGCWGCAG -A GACTACHVGGGTATCTAATCC -o sample1_trimmed_1.fastq -p sample1_trimmed_2.fastq sample1_1.fastq sample1_2.fastq
cutadapt -a CCTACGGGNGGCWGCAG -A GACTACHVGGGTATCTAATCC -o sample2_trimmed_1.fastq -p sample2_trimmed_2.fastq sample2_1.fastq sample2_2.fastq
..
..
..
..
# up to 300 samples.

Question 1) Am I using cutadapt correctly? I couldn't work out where to insert the four partial Illumina TruSeq adapter sequences (F1, F2, F3, F4 and R1, R2, R3, R4) in the above code, so I only used the forward and reverse universal bacterial primers to trim.

Question 2) How do I loop over the sample names in the cutadapt code, so that I don't have to write the cutadapt line 300 times in my sbatch script?

Thank you

fastq primer paired-end cutadapt • 280 views

It looks like you deleted an older question with the exact same content and added a new one. Please don't do that, request a bump from a moderator instead.

Also, please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), and for multi-line code either the code option in the formatting bar or fenced code blocks. Fenced code blocks also enable syntax highlighting. If your code has long lines containing a single command, break them into multiple lines with proper escape sequences so they are easier to read and still run when copy-pasted. I've done it for you this time.

7 weeks ago
GenoMax 141k

You can see that the adapters share a core sequence:

ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTgt
ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTagag

So as long as you use ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, it will cover all variations of the forward adapter. The same applies to the reverse adapter. Once this core sequence is found, scan/trim programs will remove all sequence on the 3' side of it.
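If you want to confirm the shared core for yourself, a quick check along these lines may help. It is only a sketch: the function name `has_core_prefix` is made up for illustration, and the sequences are the four forward variants copied from the question.

```shell
#!/bin/sh
# Core sequence proposed above for the forward adapters.
core=ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT

# Succeeds if the sequence in $2 starts with the core in $1.
# Case is ignored because the spacer bases in the question are lowercase.
# (has_core_prefix is a hypothetical helper, not part of any tool.)
has_core_prefix() {
    core_uc=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    seq_uc=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    case $seq_uc in
        "$core_uc"*) return 0 ;;
        *) return 1 ;;
    esac
}

# F1 to F4 as listed in the question.
for variant in \
    ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT \
    ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTgt \
    ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTagag \
    ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTtagtgt
do
    if has_core_prefix "$core" "$variant"; then
        echo "$variant: starts with the core"
    else
        echo "$variant: does NOT start with the core"
    fi
done
```

The same check can be repeated for R1 to R4 with GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT as the core.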

You had already found the thread where I had provided an example of how you would use a loop to submit multiple jobs.
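For anyone reading without that thread at hand, here is a minimal sketch of such a loop. It is not the code from the linked thread. It assumes the inputs are named `<sample>_1.fastq` and `<sample>_2.fastq` in the current directory, as in the question, and it runs the samples one after another inside a single job. The `-a`/`-A` values are simply copied from the question, so substitute whichever sequences you settle on. `run_cutadapt` and `DRY_RUN` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Sketch only: the file naming (<sample>_1.fastq / <sample>_2.fastq) follows the question.
FWD=CCTACGGGNGGCWGCAG        # value given to -a in the question
REV=GACTACHVGGGTATCTAATCC    # value given to -A in the question

# Run cutadapt for one sample name. If DRY_RUN is set to any non-empty value,
# the command line is printed instead of executed.
run_cutadapt() {
    ${DRY_RUN:+echo} cutadapt -a "$FWD" -A "$REV" \
        -o "${1}_trimmed_1.fastq" -p "${1}_trimmed_2.fastq" \
        "${1}_1.fastq" "${1}_2.fastq"
}

for r1 in *_1.fastq; do
    [ -e "$r1" ] || continue                          # the glob matched nothing
    case $r1 in *_trimmed_1.fastq) continue ;; esac   # skip outputs of earlier runs
    sample=${r1%_1.fastq}                             # strip the suffix to get the sample name
    if [ ! -e "${sample}_2.fastq" ]; then
        echo "skipping $sample: ${sample}_2.fastq not found" >&2
        continue
    fi
    run_cutadapt "$sample"
done
```

Running it once with `DRY_RUN=1` set in the environment prints the 300 command lines so you can inspect them before the real run. To use it under SLURM, the `#SBATCH` directives for your cluster go at the top of the script.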
