Question

Is it OK to trim assembled paired-end fastq files with cutadapt 3.4?

13 months ago

Bengisu • 0

Hi,

I don't know much about bioinformatics or the R language; I only know how to use some bioinformatics tools in a simple way. That's why I prefer to stick to simple tools to analyze my Illumina sequences as best I can.

I wonder whether it is appropriate to apply QC steps to assembled paired-end fastq files. I'm using cutadapt 3.4, which, as you know, supports trimming of paired-end reads with this basic command-line syntax:

cutadapt -a ADAPTER_FWD -A ADAPTER_REV -o out.1.fastq -p out.2.fastq reads.1.fastq reads.2.fastq

After trimming, I use FastQC to check the quality. So, I prefer to work assembled fastq files into one fastq rather than two. But I have some concerns about this step.

Do you have any recommendations or ideas about this issue? And if you recommend that I work with the two paired fastq files throughout the QC step, do you know how I can check these pairs with FastQC?

Thank you so much for your help!

QC cutadapt Illumina trimming 3.4 • 526 views

updated 13 months ago by GenoMax 141k • written 13 months ago by Bengisu • 0

So, I prefer to work assembled fastq files into one fastq rather than two.

What does this mean? Are you concatenating two sets of files?

Paired-end sequence files need to be processed together: if a read is removed from one file, its mate must also be removed from the other file so that the order of reads stays in sync. You should have two inputs going into the tool (R1 and R2) and two coming out (trimmed). Sometimes tools will also let you collect reads that become singletons (after their mates are discarded). Here is an alternative tool whose guide explains this: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/

13 months ago by GenoMax 141k
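
The point about keeping R1 and R2 in sync can be checked with standard shell tools. The following is only a minimal sketch, assuming plain four-line FASTQ records and read names that differ between mates only by a trailing /1 or /2; the helper names read_ids and pairs_in_sync and the tiny example files are made up for illustration and are not part of cutadapt or FastQC.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print one read ID per record. Only every 4th line
# (the header) is used, because quality lines may also start with '@'.
# The leading '@' and a trailing /1 or /2 mate suffix are dropped.
read_ids() {
  awk 'NR % 4 == 1 { id = substr($1, 2); sub(/\/[12]$/, "", id); print id }' "$1"
}

# Hypothetical helper: succeed only if both files list the same read IDs
# in the same order.
pairs_in_sync() {
  ids1=$(mktemp)
  ids2=$(mktemp)
  read_ids "$1" > "$ids1"
  read_ids "$2" > "$ids2"
  if cmp -s "$ids1" "$ids2"; then status=0; else status=1; fi
  rm -f "$ids1" "$ids2"
  return "$status"
}

# Tiny made-up files so the sketch runs on its own.
tmp=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCC\n+\nIIII\n' > "$tmp/R1.fastq"
printf '@r1/2\nTTAA\n+\nIIII\n@r2/2\nCCGG\n+\nIIII\n' > "$tmp/R2.fastq"

if pairs_in_sync "$tmp/R1.fastq" "$tmp/R2.fastq"; then
  echo "pairs in sync"
else
  echo "pairs out of sync"
fi
```

Running the same check on the two trimmed output files is one way to confirm that a trimming step did not drop a read from only one side of a pair.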

Thank you for sharing. I'm concatenating two sets of files before trimming to remove sequences from both of them. But I don't know whether that is correct. I'll take a look at the link you shared. Thanks.

13 months ago by Bengisu • 0

I'm concatenating two sets of files before trimming to remove sequences from both of them

That only makes sense if it is the same sample run on multiple lanes or flowcells. When you concatenate, it is important to list the individual pieces in the same order for R1 and R2, e.g.

cat Sample_L001_R1.fq.gz Sample_L002_R1.fq.gz > Sample_one_file_R1.fq.gz
cat Sample_L001_R2.fq.gz Sample_L002_R2.fq.gz > Sample_one_file_R2.fq.gz

You can then use the Sample_one_file_R1/R2 files as input for cutadapt, as shown here: https://cutadapt.readthedocs.io/en/stable/guide.html#paired-end

13 months ago by GenoMax 141k
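
The concatenation advice above, followed by paired trimming and a FastQC check on both files, might look roughly like this. It is a sketch under stated assumptions rather than a tested pipeline: the two-lane sample files are generated on the spot with made-up reads, count_reads is a hypothetical helper, and the cutadapt and fastqc calls are left commented out because ADAPTER_FWD and ADAPTER_REV are the placeholders from the question, not real adapter sequences.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: count the reads in a gzipped FASTQ file (4 lines per read).
count_reads() {
  gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Made-up two-lane sample so the sketch runs on its own.
work=$(mktemp -d)
cd "$work"
for lane in L001 L002; do
  for mate in 1 2; do
    printf '@%s_read1/%s\nACGT\n+\nIIII\n' "$lane" "$mate" \
      | gzip > "Sample_${lane}_R${mate}.fq.gz"
  done
done

# Same lane order for R1 and R2, as in the reply above.
cat Sample_L001_R1.fq.gz Sample_L002_R1.fq.gz > Sample_one_file_R1.fq.gz
cat Sample_L001_R2.fq.gz Sample_L002_R2.fq.gz > Sample_one_file_R2.fq.gz

# Both merged files should hold the same number of reads.
n1=$(count_reads Sample_one_file_R1.fq.gz)
n2=$(count_reads Sample_one_file_R2.fq.gz)
echo "R1 reads: $n1, R2 reads: $n2"

# Trim as a pair, then run FastQC on each trimmed file (one report per file).
# ADAPTER_FWD and ADAPTER_REV are placeholders from the question; replace them
# with real adapter sequences before uncommenting.
# cutadapt -a ADAPTER_FWD -A ADAPTER_REV \
#   -o trimmed_R1.fq.gz -p trimmed_R2.fq.gz \
#   Sample_one_file_R1.fq.gz Sample_one_file_R2.fq.gz
# fastqc trimmed_R1.fq.gz trimmed_R2.fq.gz
```

FastQC has no paired mode of its own: it reports on each file separately, so R1 and R2 stay as two files through trimming and are simply both passed to fastqc afterwards.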