I am using cutadapt for the first time to trim 300 paired-end fastq files generated by amplicon sequencing of the V3-V4 regions of the 16S rRNA gene. I am running this analysis on a high-performance computing cluster.
The V3-V4 regions of the 16S rRNA gene were amplified using a mixture of the universal bacterial primers 341F1–4 (5′ CCTACGGGNGGCWGCAG 3′) and 785R1–4 (5′ GACTACHVGGGTATCTAATCC 3′) with partial Illumina TruSeq adapter sequences added to the 5′ ends (F1; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT, F2; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTgt, F3; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTagag, F4; ATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTtagtgt and R1; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT, R2; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTa, R3; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtct, R4; GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTctgagtg).
So, I followed the cutadapt tutorial ( https://cutadapt.readthedocs.io/en/stable/guide.html#paired-end ) and used the following code from the tutorial to trim paired-end fastq reads:
cutadapt -a ADAPTER_FWD -A ADAPTER_REV -o out.1.fastq -p out.2.fastq reads.1.fastq reads.2.fastq
I made an sbatch file with one cutadapt command for each of the 300 samples, putting the forward universal bacterial primer in the -a ADAPTER_FWD section and the reverse universal bacterial primer in the -A ADAPTER_REV section of the command. An example of what I did is below:
cutadapt -a CCTACGGGNGGCWGCAG -A GACTACHVGGGTATCTAATCC -o sample1_trimmed_1.fastq -p sample1_trimmed_2.fastq sample1_1.fastq sample1_2.fastq
cutadapt -a CCTACGGGNGGCWGCAG -A GACTACHVGGGTATCTAATCC -o sample2_trimmed_1.fastq -p sample2_trimmed_2.fastq sample2_1.fastq sample2_2.fastq
...
# up to 300 samples.
Question 1) Am I using cutadapt correctly? I couldn't work out where to put the four partial Illumina TruSeq adapter sequences (F1, F2, F3, F4 and R1, R2, R3, R4) in the above command, so I only used the forward and reverse universal bacterial primers for trimming.
Question 2) How do I loop over the sample names in the cutadapt command, so that I don't have to write the cutadapt line 300 times in my sbatch script?
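For Question 2, one possible shape is a shell loop over the sample numbers. The sketch below is a dry run under stated assumptions: files follow the sampleN_1.fastq / sampleN_2.fastq naming used above, the function name `print_trim_commands` is a made-up placeholder, and the -a/-A primers are simply copied from the example commands (whether those options are the right ones is what Question 1 asks).

```shell
# Hypothetical helper: prints one cutadapt command per sample (dry run).
# Assumes files are named sampleN_1.fastq / sampleN_2.fastq as in the post.
# The -a/-A values are copied from the example commands above and are
# not a confirmed answer to Question 1.
fwd_primer="CCTACGGGNGGCWGCAG"
rev_primer="GACTACHVGGGTATCTAATCC"

print_trim_commands() {
    n_samples=$1
    i=1
    while [ "$i" -le "$n_samples" ]; do
        sample="sample${i}"
        # 'echo' only prints the command; remove it to actually run cutadapt.
        echo cutadapt \
            -a "$fwd_primer" \
            -A "$rev_primer" \
            -o "${sample}_trimmed_1.fastq" \
            -p "${sample}_trimmed_2.fastq" \
            "${sample}_1.fastq" "${sample}_2.fastq"
        i=$((i + 1))
    done
}

# Preview the first three commands; use 300 for the full set.
print_trim_commands 3
```

Printing the commands first makes it easy to check the file names before removing `echo` and submitting the script with sbatch.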
Thank you
It looks like you deleted an older question with the exact same content and added a new one. Please don't do that, request a bump from a moderator instead.
Also, please use the formatting bar (especially the `code` option) to present your post better. You can use backticks for inline code (`text` becomes text), or use one of (a) the option highlighted in the image below / (b) fenced code blocks for multi-line code. Fenced code blocks are useful in syntax highlighting. If your code has long lines with a single command, break those lines into multiple lines with proper escape sequences so they're easier to read and still run when copy-pasted. I've done it for you this time.
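As an illustration of that last point, a long single command can be split by ending each line with a backslash. The sketch below uses the first command from the post. The leading `echo` and the `trim_cmd` variable are additions for illustration only, so the block prints the assembled command rather than running cutadapt.

```shell
# Each trailing backslash continues the command on the next line,
# so the shell still sees one single command.
# 'echo' and trim_cmd are illustrative additions: this is a dry run that
# prints the assembled command. Drop them to run cutadapt itself.
trim_cmd=$(echo cutadapt \
    -a CCTACGGGNGGCWGCAG \
    -A GACTACHVGGGTATCTAATCC \
    -o sample1_trimmed_1.fastq \
    -p sample1_trimmed_2.fastq \
    sample1_1.fastq sample1_2.fastq)
echo "$trim_cmd"
```

Nothing may follow the backslash on a line, not even a space, otherwise the continuation breaks.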