Hi,
I don't know much about bioinformatics or the R language; I only know how to use some bioinformatics tools in a simple way. That's why I prefer simple tools to analyze my Illumina sequences as best I can.
I wonder whether it is appropriate to apply QC steps to assembled paired-end FASTQ files. I'm using cutadapt 3.4, which supports trimming of paired-end reads with this basic command-line syntax:
cutadapt -a ADAPTER_FWD -A ADAPTER_REV -o out.1.fastq -p out.2.fastq reads.1.fastq reads.2.fastq
After trimming, I use FastQC to check the quality. I would prefer to work with the reads assembled into one FASTQ file rather than two, but I have some concerns about this step.
Do you have any recommendations or ideas about this issue? And if you recommend working with the two paired FASTQ files throughout the QC steps, do you know how I can check these pairs with FastQC?
Thank you so much for your help!
What does this mean? Are you concatenating two sets of files?
Paired-end sequence files need to be processed together: if you remove a read from one file, its mate has to be removed from the other file so that the order of reads stays in sync. You should have two inputs going into the tool (R1 and R2) and two coming out (trimmed). Some tools also allow you to collect reads that become singletons (after their mate is discarded). Here is an alternative tool whose guide explains this: https://jgi.doe.gov/data-and-tools/software-tools/bbtools/bb-tools-user-guide/bbduk-guide/
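To illustrate what "in sync" means, here is a minimal sketch that compares the read IDs of an R1 and an R2 file record by record. The demo file names and reads are made up for illustration, and the sketch assumes plain 4-line FASTQ records (no wrapped sequence lines) with read names that differ only by a trailing /1 or /2 or by a comment after a space:

```shell
# Hypothetical demo files; names and reads are invented for illustration only
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@read1/1\nACGT\n+\nIIII\n@read2/1\nGGCC\n+\nIIII\n' > demo_R1.fastq
printf '@read1/2\nTGCA\n+\nIIII\n@read2/2\nCCGG\n+\nIIII\n' > demo_R2.fastq

# Print one read ID per record: take the first field of every header line
# (line 1 of each 4-line record), drop the leading "@" and a trailing /1 or /2
read_ids() {
    awk 'NR % 4 == 1 { id = substr($1, 2); sub(/\/[12]$/, "", id); print id }' "$1"
}

read_ids demo_R1.fastq > ids_R1.txt
read_ids demo_R2.fastq > ids_R2.txt

# The files are in sync only if the ID lists are identical, line by line
if cmp -s ids_R1.txt ids_R2.txt; then
    sync_status="in sync"
else
    sync_status="out of sync"
fi
echo "R1/R2 are $sync_status"
```

If a read were dropped from only one of the two files, the ID lists would differ from that record onward and the check would report the files as out of sync.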
Thank you for sharing. I'm concatenating the two sets of files before trimming so that sequences are removed from both of them, but I don't know whether that is correct. I'll take a look at the link you shared. Thanks.
That only makes sense if it is the same sample that was run on multiple lanes or flowcells. When you concatenate, it is important to use the same order for the individual pieces in R1 and in R2, e.g.
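A minimal sketch of such a concatenation, assuming one sample split across two lanes. The lane file names, the .fastq extension and the tiny demo reads are hypothetical; only the Sample_one_file_R1/R2 naming comes from this thread:

```shell
# Hypothetical demo: the same sample sequenced on two lanes (L001, L002)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '@read1/1\nACGT\n+\nIIII\n' > Sample_L001_R1.fastq
printf '@read1/2\nTGCA\n+\nIIII\n' > Sample_L001_R2.fastq
printf '@read2/1\nGGCC\n+\nIIII\n' > Sample_L002_R1.fastq
printf '@read2/2\nCCGG\n+\nIIII\n' > Sample_L002_R2.fastq

# Concatenate the lanes in the SAME order (L001, then L002) for R1 and for R2
cat Sample_L001_R1.fastq Sample_L002_R1.fastq > Sample_one_file_R1.fastq
cat Sample_L001_R2.fastq Sample_L002_R2.fastq > Sample_one_file_R2.fastq

# Sanity check: both merged files should hold the same number of reads
r1_count=$(( $(wc -l < Sample_one_file_R1.fastq) / 4 ))
r2_count=$(( $(wc -l < Sample_one_file_R2.fastq) / 4 ))
echo "R1 reads: $r1_count, R2 reads: $r2_count"
```

R1 and R2 stay in separate files here; only the lanes within each read direction are merged.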
You can then use the Sample_one_file_R1/R2 files as input for cutadapt, as shown here: https://cutadapt.readthedocs.io/en/stable/guide.html#paired-end.