Trimmomatic running but files containing purged reads are empty
24 days ago
Wilber0x ▴ 50

I am running a script that applies Trimmomatic to paired-end Illumina data to remove adaptor sequences and low-quality reads.

The script is below:

input_dir="/home/reads"
# Define the output directory
output_dir="trimmed"
# Create the output directory if it doesn't exist
mkdir -p "$output_dir"

# Loop through all files in the input directory that match the pattern *_1.fastq.gz
for file1 in "$input_dir"/*_1.fastq.gz; do
    # Extract the base name (two-letter code) from the file name
    base_name=$(basename "$file1" _1.fastq.gz)
    file2="${input_dir}/${base_name}_2.fastq.gz"

    # Check if the corresponding _2.fastq.gz file exists
    if [[ -f "$file2" ]]; then
        # Run the trimmomatic command
        trimmomatic PE "$file1" "$file2" \
            -baseout "${output_dir}/${base_name}.fastq.gz" \
            ILLUMINACLIP:adaptors.fasta:4:30:10 MINLEN:30
    else
        echo "Warning: Corresponding file for $file1 not found. Skipping."
    fi
done

The script runs for several hours on my cluster, but the output files that should contain the reads removed by Trimmomatic are empty. I have also run FastQC on the sequence files before and after Trimmomatic, and the HTML reports are identical. The FastQC output shows that every file fails "Adapter Content" and contains a high percentage of Illumina Universal Adapter sequence.

Here is the sequence I am using for the Illumina Universal Adapter

>IlluminaUniversalAdapterNA
AGATCGGAAGAG

Why is trimmomatic not removing any reads?

24 days ago
GenoMax 143k

Why is trimmomatic not removing any reads?

Your data will not necessarily contain extraneous/adapter sequence. If none is present, no reads will be trimmed or removed. That said, check that you are providing the correct adapter sequences and that the adapter file is readable/accessible.
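One thing worth checking alongside the adapter file is whether the adapter is long enough to reach the simple clip threshold at all. The sketch below assumes the scoring described in the Trimmomatic manual, where each matching base adds log10(4) (about 0.6) to the alignment score; the variable names are illustrative, and the adapter and threshold are taken from the question.

```shell
#!/usr/bin/env bash
# Adapter and simple clip threshold as given in the question
# (ILLUMINACLIP:adaptors.fasta:4:30:10 -> simple clip threshold = 10).
adapter="AGATCGGAAGAG"
simple_clip_threshold=10

# Assumption: each matching base scores log10(4), per the Trimmomatic manual,
# so the best possible score is adapter length * log10(4).
max_score=$(awk -v n="${#adapter}" 'BEGIN { printf "%.2f", n * log(4) / log(10) }')

echo "adapter length: ${#adapter}"
echo "best possible score: ${max_score} (threshold: ${simple_clip_threshold})"

# Compare numerically in awk, since bash has no floating-point arithmetic.
if awk -v s="$max_score" -v t="$simple_clip_threshold" 'BEGIN { exit !(s < t) }'; then
    echo "Even a perfect match cannot reach the simple clip threshold."
fi
```

If that assumption holds, a 12-base adapter tops out just above 7, so a threshold of 10 could never be met; a longer adapter sequence or a lower threshold would be the things to try.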

Run a single pair of files manually to confirm that the job runs correctly with the options you are providing. Check the log files for anything odd.
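After a manual run, counting the reads in each output file is a quick way to see what was kept and what was dropped. This is a minimal sketch: `count_reads` is a hypothetical helper, `AB` is a made-up sample name, and the `_1P`/`_1U`/`_2P`/`_2U` names are what `-baseout` derives from `AB.fastq.gz`.

```shell
#!/usr/bin/env bash
# Count FASTQ records in a gzipped file.
# Assumes standard 4-line records (no wrapped sequence lines).
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# "AB" is a hypothetical sample name. With -baseout trimmed/AB.fastq.gz,
# Trimmomatic writes paired (_1P, _2P) and unpaired (_1U, _2U) outputs.
for f in trimmed/AB_1P.fastq.gz trimmed/AB_1U.fastq.gz \
         trimmed/AB_2P.fastq.gz trimmed/AB_2U.fastq.gz; do
    if [[ -f "$f" ]]; then
        echo "$f: $(count_reads "$f") reads"
    fi
done
```

Comparing these counts with the input files shows whether any reads were dropped. The `_1U` and `_2U` files hold reads that survived while their mate did not, so they stay empty if no read is dropped.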

Thanks for the advice. It seems likely that I have the incorrect adaptor sequences, though I am still surprised that no low-quality reads were removed.

Perhaps there were no low quality reads either.
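Another possibility worth ruling out is that the command never asks for quality trimming in the first place. The sketch below lists the steps from the script in the question and counts how many are quality-based Trimmomatic steps; the `steps` array and the counter are illustrative names.

```shell
#!/usr/bin/env bash
# Steps exactly as passed to trimmomatic in the script from the question.
steps=(ILLUMINACLIP:adaptors.fasta:4:30:10 MINLEN:30)

# Count steps that act on base quality scores.
quality_steps=0
for step in "${steps[@]}"; do
    case "$step" in
        SLIDINGWINDOW:*|LEADING:*|TRAILING:*|MAXINFO:*|AVGQUAL:*)
            quality_steps=$((quality_steps + 1))
            ;;
    esac
done

echo "quality-based steps: ${quality_steps}"
```

With only ILLUMINACLIP and MINLEN, reads are dropped only if clipping leaves them shorter than 30 bases. Adding a step such as `SLIDINGWINDOW:4:15` before `MINLEN:30` would enable quality trimming; those values come from the example in the Trimmomatic documentation and are illustrative, not a recommendation for this data set.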
