4.5 years ago
Varun Gupta ★ 1.2k

Hi, I am trying to trim my FASTQ files. According to the FastQC results, my reads range from 20 bp to 150 bp. I want all of my reads to have a fixed length of 120 bp: any read longer than 120 bp should be trimmed to 120 bp, and any read shorter than 120 bp should be discarded. How can I do this using cutadapt?

I can use the -m option to throw away reads shorter than 120 bp, but I still get reads longer than 120 bp. How can I trim them to 120 bp?

This is the command I am using

cutadapt -m 120 -o file1.trimmed.fq -p file2.trimmed.fq file1.fq file2.fq

Also, can Trim Galore do that?

Thanks

If your goal is just to trim the reads, why do you want to use cutadapt? You could use something like fastq_trimmer.

I was already using cutadapt, so I thought this would be possible with it. I will look into fastq_trimmer. Thanks

Since no one has asked this I will. Why do you want to do this? Unless the reads have really poor Q-scores (< Q10) or show presence of adapters, there should be little reason to trim them. Far too many people take "failures" of FastQC modules seriously.

Hi genomax2, the reads are fine as far as quality scores are concerned. The reason I want to do this is that I am using rMATS downstream, which as of now requires reads of a fixed length.

4.5 years ago
biofalconch ▴ 470

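Trimmomatic can do this easily. Just use the CROP step, which takes the number of bases you want to keep, counted from the start of the read (CROP:120 here). Following it with MINLEN:120 drops the reads that are shorter than 120 bp.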

I will try this. Thanks

4.5 years ago
Satyajeet Khare ★ 1.6k

Hi Varun,

I do this in two separate steps. I first remove the adapters and run FastQC, and then, depending on the read quality of the new file, I trim the reads to the required size using fastx_trimmer. The reason is that removing adapters changes the sequence quality reported by FastQC, so the trimming parameters change.

Best,

4.2 years ago
elgart ▴ 10

cutadapt -l 120 -m 120 -o file1.trimmed.fq -p file2.trimmed.fq file1.fq file2.fq
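
For anyone who wants to see the rule this command is meant to apply (shorten reads to 120 bp, drop anything shorter), here is a minimal Python sketch. It only illustrates the logic and is not how cutadapt works internally. The names read_fastq, fix_length and fix_pair are made up for this sketch, it assumes plain four-line FASTQ records, and dropping the whole pair when either mate is too short is an assumption about what is wanted for paired-end data.

```python
from typing import Iterator, Optional, Tuple

# A FASTQ record reduced to (header, sequence, quality string).
Record = Tuple[str, str, str]


def read_fastq(path: str) -> Iterator[Record]:
    """Yield records from an uncompressed, four-line-per-record FASTQ file."""
    with open(path) as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            seq = handle.readline().rstrip("\n")
            handle.readline()  # skip the '+' separator line
            qual = handle.readline().rstrip("\n")
            yield header, seq, qual


def fix_length(record: Record, length: int = 120) -> Optional[Record]:
    """Return the record cut to `length` bases, or None if it is too short."""
    header, seq, qual = record
    if len(seq) < length:
        return None  # shorter than the target: discard
    # Keep the first `length` bases and the matching quality values.
    return header, seq[:length], qual[:length]


def fix_pair(r1: Record, r2: Record,
             length: int = 120) -> Optional[Tuple[Record, Record]]:
    """Keep a pair only if both mates reach `length` (assumed policy)."""
    a = fix_length(r1, length)
    b = fix_length(r2, length)
    if a is None or b is None:
        return None
    return a, b
```

For example, fix_length(("@read1", "A" * 150, "I" * 150)) returns a record whose sequence and quality strings are both 120 characters long, while a 100 bp read gives None.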

4.5 years ago

If you want to stick to cutadapt, I guess you could use the -m option (discard trimmed reads that are shorter than LENGTH) together with the -M option (discard trimmed reads that are longer than LENGTH).

The -M option would discard reads longer than 120 bp. I want reads longer than 120 bp to be trimmed down to 120 bp, not discarded.