Issue while running KneadData.

I was running host decontamination on 15 samples with KneadData using the C57 mouse reference database. All of the output folders contain the paired fastq files needed for taxonomic profiling, except for two samples. No errors appeared, and I made sure there were no typos in the folder names and that the reads in the forward and reverse files matched. Could anyone help me solve this issue? I attached an image of one of these folders after decontamination; it shows no paired fastq files, unlike the other folders. A second image is from a sample that does contain paired fastq files after decontamination.

paired • kneaddata • fastq

Look inside the *.log files to see if you can get additional clues (and post them here if you can't figure things out). Check for lines that say error or warning, or things to that effect.
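For example (just a sketch, with /path/to/kneaddata-output and the file name standing in for wherever your per-sample output folders and logs actually are), something like this would flag any log that mentions an error or warning:

# list every kneaddata log containing an error or warning line (case-insensitive)
grep -ril --include="*.log" -e "error" -e "warning" /path/to/kneaddata-output

# then open any flagged log and read it in full
less /path/to/kneaddata-output/flagged_sample.log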


No error is indicated in the log file, but I noticed that the two folders without paired fastq contain some files that are not present in the folders with paired fastq, and that their decontamination output files are 0 kb.

Image 1: the extra files found in the incomplete folder, which also contains decontamination files of 0 kb. Image 2: a complete folder. Compare 1 and 2 to see the difference.
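To double-check which outputs are affected, something along these lines (the output path is a placeholder) should list every zero-byte file across the sample folders:

# find all empty (0 kb) files under the kneaddata output tree
find /path/to/kneaddata-output -type f -size 0 -exec ls -lh {} \;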


Looking at the odd file names (indicative of temporary files), it is possible that your job is getting killed partway through, though there should be some indication of that in the log, e.g. an abrupt ending without any clear messages.

Hopefully someone who knows KneadData will be along with additional help. Please add the exact command line you are using.
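For instance (a sketch only; the path and log layout are placeholders), printing the tail of each sample's log would show whether any of them stop abruptly:

# show the last few lines of every log to spot abrupt endings
for f in /path/to/kneaddata-output/*/*.log; do
    echo "== $f =="
    tail -n 5 "$f"
done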


The code for running kneaddata (attached as an image).


Please don't use images to post text content. You can copy and paste the text and then format it as code using the 101010 button in the edit window.

#!/bin/bash

# Paths to the raw paired-end fastq files, the KneadData output directory,
# and the C57 mouse reference database used for host decontamination
INPUT_DIR=/mnt/d/metashotgun/fastqc-files/fq.gz
OUTPUT_DIR=/mnt/d/metashotgun/fastqc-files/kneaddata-output
REF_DB=/mnt/d/metashotgun/host_decontamination/kneaddata_c57mouse_db
THREADS=8

mkdir -p "$OUTPUT_DIR/logs"

# Loop over every forward read file and derive the matching reverse read
for file1 in ${INPUT_DIR}/*_1.fq.gz; do
    sample=$(basename "$file1" | sed 's/_1\.fq\.gz//')
    file2="${INPUT_DIR}/${sample}_2.fq.gz"

    echo "Processing $sample ..."
    mkdir -p "$OUTPUT_DIR/$sample"

    # Run KneadData on the paired reads against the mouse reference database
    kneaddata \
        --input1 "$file1" \
        --input2 "$file2" \
        --output "$OUTPUT_DIR/$sample" \
        --reference-db "$REF_DB" \
        --threads "$THREADS"

    echo "Finished $sample (log: logs/${sample}.log)"
    echo "-----------------------------------------"
done

Are you using a job scheduler on a cluster, or are you running this on the command line? If the job ends abruptly, it is most likely running out of memory; that is something you can watch/check for.

You also made a logs directory with mkdir -p "$OUTPUT_DIR/logs". Is there anything in there?
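If you are running this directly on a Linux/WSL machine rather than under a scheduler, one way to check for out-of-memory kills (a sketch; whether dmesg or journalctl is available depends on your setup) is:

# look for OOM-killer messages in the kernel log
dmesg -T | grep -i -E "killed process|out of memory"

# or, on systems using journald
journalctl -k | grep -i -E "killed process|out of memory"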


I checked the memory before executing this command. I made a bash script on Ubuntu to run it over the 15 folders, but the problem happened in folders number 6 and 8 only, not the last two.
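If it helps, I can re-run just those two samples on their own while watching memory in a second terminal; a rough sketch (SAMPLE6 is a placeholder for the real sample name) would be:

# in a second terminal: report memory usage every 5 seconds
free -h -s 5

# re-run one of the failed samples by itself
kneaddata \
    --input1 /mnt/d/metashotgun/fastqc-files/fq.gz/SAMPLE6_1.fq.gz \
    --input2 /mnt/d/metashotgun/fastqc-files/fq.gz/SAMPLE6_2.fq.gz \
    --output /mnt/d/metashotgun/fastqc-files/kneaddata-output/SAMPLE6 \
    --reference-db /mnt/d/metashotgun/host_decontamination/kneaddata_c57mouse_db \
    --threads 8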
