Issue while running KneadData.

I was running host decontamination on 15 samples with KneadData using the C57 mouse reference database. All of the output folders contain the paired fastq files needed for taxonomic profiling, except for two samples. No errors appeared, and I made sure there were no typos in the folder names and that the reads in the forward and reverse files matched. Could anyone help me solve this issue? I attached an image of one of these folders after decontamination; it shows no paired fastq files, unlike the other folders. A second image is from a sample that does contain paired fastq files after decontamination.

paired • kneaddata • fastq

Look inside the *.log files to see if you can get additional clues (and post them here if you can't figure things out). Check for lines that say error or warning, or things to that effect.
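For example (just a sketch, with /path/to/kneaddata-output and the file name standing in for wherever your per-sample output folders and logs actually are), something like this would flag any log that mentions an error or warning:

# list every kneaddata log containing an error or warning line (case-insensitive)
grep -ril --include="*.log" -e "error" -e "warning" /path/to/kneaddata-output

# then open any flagged log and read it in full
less /path/to/kneaddata-output/flagged_sample.log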


No error is indicated in the log file, but I noticed that the two folders without paired fastq contain some files that are not present in the folders with paired fastq, and that their decontamination output files are 0 kb.

Image 1: the extra files found in the incomplete folder, which also contains decontamination files of 0 kb. Image 2: a complete folder. Compare 1 and 2 to see the difference.
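To double-check which outputs are affected, something along these lines (the output path is a placeholder) should list every zero-byte file across the sample folders:

# find all empty (0 kb) files under the kneaddata output tree
find /path/to/kneaddata-output -type f -size 0 -exec ls -lh {} \;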


Looking at the odd file names (indicative of temporary files), it is possible that your job is getting killed partway through, though there should be some indication of that in the log, e.g. an abrupt ending without any clear messages.

Hopefully someone who knows KneadData will be along with additional help. Please add the exact command line you are using.
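For instance (a sketch only; the path and log layout are placeholders), printing the tail of each sample's log would show whether any of them stop abruptly:

# show the last few lines of every log to spot abrupt endings
for f in /path/to/kneaddata-output/*/*.log; do
    echo "== $f =="
    tail -n 5 "$f"
done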


The code for running kneaddata (attached as an image).


Please don't use images to post text content. You can copy and paste the text and then format it as code using the 101010 button in the edit window.

#!/bin/bash

# Paths to the raw paired-end fastq files, the KneadData output directory,
# and the C57 mouse reference database used for host decontamination
INPUT_DIR=/mnt/d/metashotgun/fastqc-files/fq.gz
OUTPUT_DIR=/mnt/d/metashotgun/fastqc-files/kneaddata-output
REF_DB=/mnt/d/metashotgun/host_decontamination/kneaddata_c57mouse_db
THREADS=8

mkdir -p "$OUTPUT_DIR/logs"

# Loop over every forward read file and derive the matching reverse read
for file1 in ${INPUT_DIR}/*_1.fq.gz; do
    sample=$(basename "$file1" | sed 's/_1\.fq\.gz//')
    file2="${INPUT_DIR}/${sample}_2.fq.gz"

    echo "Processing $sample ..."
    mkdir -p "$OUTPUT_DIR/$sample"

    # Run KneadData on the paired reads against the mouse reference database
    kneaddata \
        --input1 "$file1" \
        --input2 "$file2" \
        --output "$OUTPUT_DIR/$sample" \
        --reference-db "$REF_DB" \
        --threads "$THREADS"

    echo "Finished $sample (log: logs/${sample}.log)"
    echo "-----------------------------------------"
done

Are you using a job scheduler on a cluster, or are you running this on the command line? If the job ends abruptly, it is most likely running out of memory; that is something you can watch/check for.

You also made a logs directory with mkdir -p "$OUTPUT_DIR/logs". Is there anything in there?
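If you are running this directly on a Linux/WSL machine rather than under a scheduler, one way to check for out-of-memory kills (a sketch; whether dmesg or journalctl is available depends on your setup) is:

# look for OOM-killer messages in the kernel log
dmesg -T | grep -i -E "killed process|out of memory"

# or, on systems using journald
journalctl -k | grep -i -E "killed process|out of memory"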


I checked the memory before executing this command. I made a bash script on Ubuntu to run it over the 15 folders, but the problem happened in folders number 6 and 8 only, not the last two.
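If it helps, I can re-run just those two samples on their own while watching memory in a second terminal; a rough sketch (SAMPLE6 is a placeholder for the real sample name) would be:

# in a second terminal: report memory usage every 5 seconds
free -h -s 5

# re-run one of the failed samples by itself
kneaddata \
    --input1 /mnt/d/metashotgun/fastqc-files/fq.gz/SAMPLE6_1.fq.gz \
    --input2 /mnt/d/metashotgun/fastqc-files/fq.gz/SAMPLE6_2.fq.gz \
    --output /mnt/d/metashotgun/fastqc-files/kneaddata-output/SAMPLE6 \
    --reference-db /mnt/d/metashotgun/host_decontamination/kneaddata_c57mouse_db \
    --threads 8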
