Running Docker Container Built on GNU Guix on a Mac OS
22 months ago
f.wu • 0

Hello,

I am attempting to run a pipeline (https://github.com/UMCUGenetics/SHARC) through a Docker container on my M1 MacBook and am running into some issues. The packages were created using GNU Guix, and I am wondering whether running them on macOS is the source of the error. While working with the author, we ran the same docker command with the same data and file mounts, which is as follows:

sudo docker run --platform linux/amd64 --mount type=bind,source=/Users/waples.lab/Desktop/bio/reference_data,destination=/tmp/data/ --mount type=bind,source=/Users/waples.lab/Desktop/bio/output,destination=/tmp/output/ -it jaesvi/sharc -f /tmp/data/ -o /tmp/output/ -mr /tmp/data/hg19.fa -mhr 4:0:0

Yet I am not able to get the same output as he does. The only difference in our commands is that I set the --platform flag to linux/amd64, since the M1 MacBook is, to my understanding, an arm64 architecture. I have also tried omitting the flag, but I still get the same error.
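One way to confirm the architecture mismatch is to compare what the host reports with what the image was built for. The sketch below is only illustrative: `arch_to_platform` is a hypothetical helper, and the `docker image inspect` line assumes Docker is installed and the image has already been pulled locally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a `uname -m` string to a Docker platform string.
arch_to_platform() {
    case "$1" in
        x86_64|amd64)  echo "linux/amd64" ;;
        arm64|aarch64) echo "linux/arm64" ;;
        *)             echo "unknown" ;;
    esac
}

# Platform the host runs natively (an M1 Mac reports arm64).
echo "host:  $(arch_to_platform "$(uname -m)")"

# Platform the image was built for; skipped when docker is not installed.
if command -v docker >/dev/null 2>&1; then
    docker image inspect --format 'image: {{.Os}}/{{.Architecture}}' jaesvi/sharc 2>/dev/null \
        || echo "image: could not inspect (not pulled, or daemon not running)"
fi
```

If the two platforms differ, Docker Desktop runs the container under emulation, which is a plausible (though unconfirmed) reason for a binary such as minimap2 to fail inside the container while working on a native Linux host.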

From the logs, it appears that the error occurs in the mapping portion of the pipeline; the code is at this link:

https://github.com/UMCUGenetics/SHARC/blob/master/steps/minimap2.sh

CMD="$MINIMAP2 -t $THREADS $SETTINGS $REF $FASTQ | \
$SAMBAMBA view -h -S --format=bam -t 8 /dev/stdin | \
$SAMBAMBA sort -m9G -t $THREADS --tmpdir=./ /dev/stdin \
-o $OUTPUT"

echo $CMD

eval $CMD

if [ -e $OUTPUT ]; then
    NUMBER_OF_READS_IN_FASTQ=$(awk '{s++}END{print s/4}' $FASTQ)
    NUMBER_OF_READS_IN_BAM=$($SAMBAMBA view $OUTPUT | cut -f 1 | sort | uniq | wc -l)
    if [ "$NUMBER_OF_READS_IN_FASTQ" == "$NUMBER_OF_READS_IN_BAM" ]; then
        touch $OUTPUT.done
    else
        echo "Number of reads in the fastq file ($NUMBER_OF_READS_IN_FASTQ) is different than the number of reads in the bam file ($NUMBER_OF_READS_IN_BAM)" >&2
    fi
fi

I figured that the error occurs somewhere in the above portion of code, where I get the error in my log:

Number of reads in the fastq file (187182) is different than the number of reads in the bam file (0)
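Because the script assembles the whole pipeline into one string and runs it with `eval`, the exit status it sees is that of the last stage only (`sambamba sort`), so a crash in minimap2 can go unnoticed until the read counts are compared. A minimal sketch of how `pipefail` and `PIPESTATUS` surface this in bash, using `false` and `cat` as stand-ins for the real tools (an assumption for illustration, not the SHARC code):

```shell
#!/usr/bin/env bash
# Stand-in pipeline: `false` plays a crashing first stage, `cat` a healthy last stage.
run_pipeline() {
    false | cat
}

# PIPESTATUS holds one exit code per stage of the most recent pipeline.
stage_codes() {
    false | cat
    echo "${PIPESTATUS[*]}"
}

# Default behaviour: only the last stage's status is reported, so the failure is hidden.
if run_pipeline; then echo "default:  pipeline looks fine"; fi
echo "per-stage exit codes: $(stage_codes)"

# With pipefail, a failure in any stage fails the whole pipeline.
set -o pipefail
if ! run_pipeline; then echo "pipefail: pipeline failed"; fi
```

Running the minimap2 and sambamba stages separately inside the container, and checking each exit code, would show which tool actually fails.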

Could anyone shed some light on whether this is a compatibility issue caused by running a Docker image built from Linux binaries on a MacBook Pro, or an issue with the script that makes it unusable on a Mac? The script runs through to the end, but I am left with no output, and I also get an error message at the beginning: "basename: missing operand". Thank you in advance for your help, anything is appreciated!
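For context, `basename: missing operand` is what GNU `basename` prints when it is called with no argument at all, which usually means an unquoted variable expanded to nothing (for example, a file pattern that matched no files). A small sketch reproducing this; `FASTQ_FILE` and `describe_input` are hypothetical names, not taken from the SHARC scripts:

```shell
#!/usr/bin/env bash
# Hypothetical variable standing in for a path the pipeline expected to find.
FASTQ_FILE=""

# Unquoted and empty: the argument disappears entirely, so basename has no operand.
basename $FASTQ_FILE 2>&1 || true

# Guarding the call makes the underlying problem (no input found) explicit.
describe_input() {
    if [ -z "$1" ]; then
        echo "no input file found"
    else
        basename "$1"
    fi
}
describe_input "$FASTQ_FILE"
describe_input "/tmp/data/sample.fastq"
```

Which variable is empty in the actual run is not known from the log shown here; echoing the variables passed to `basename` near the start of the pipeline would narrow it down.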

OS Docker GNU Guix Container Mac Minimap2 • 1.1k views

What is the error that you are getting?


The error is:

Number of reads in the fastq file (187182) is different than the number of reads in the bam file (0)

I am assuming this occurs because minimap2 did not execute properly. What I should have seen in my job log is:

minimap2 output


Is the input data for this test run included in the Docker image, or are you downloading it separately? If the latter, I suggest checking the number of lines in the file or its md5 checksum to make sure the whole file was downloaded.
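A minimal sketch of both checks, assuming a plain (uncompressed) FASTQ file with four lines per record; `count_reads` and `file_md5` are hypothetical helper names, and the demo file is generated on the fly rather than taken from the real data:

```shell
#!/usr/bin/env bash
# Count FASTQ records the same way the pipeline does: total lines divided by four.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Print an MD5 checksum with whichever tool is available (md5sum on Linux, md5 on macOS).
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | cut -d ' ' -f 1
    else
        md5 -q "$1"
    fi
}

# Tiny two-record FASTQ written to a temp file purely for demonstration.
demo=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$demo"
echo "reads: $(count_reads "$demo")"
echo "md5:   $(file_md5 "$demo")"
rm -f "$demo"
```

Running the same two functions on both machines and comparing the results shows whether the files are identical.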


I have downloaded the input data separately and mounted it to the Docker container with bind mounts. The files are downloaded completely and correctly. I have shared the same files (fastq and reference genome) that I mounted to the Docker container with the author of the pipeline, and he was able to run it and get the correct output. This leads me to wonder whether the issue lies in cross-compatibility between operating systems. We ran the same docker command; the only difference is that he is using Linux, whereas my machine runs macOS.
