3.9 years ago
MAPK
I am trying to revert a BAM file to FASTQ, running the conversion inside Docker. My Docker image ran fine on samples from other projects, but on this new project I am getting the errors shown below. Can someone please help me resolve this? Thanks in advance for your help.
The command I am executing is this:
# RevertSam
if [ ! -z "${TIMING}" ]; then TIMING=(/usr/bin/time -v); fi
JAVAOPTS="-Xms2g -Xmx${MEM}g -XX:+UseSerialGC -Dpicard.useLegacyParser=false"
CUR_STEP="RevertSam"
start=$(${DATE}); echo "[$(display_date ${start})] ${CUR_STEP} starting"
"${TIMING[@]}" /usr/bin/java ${JAVAOPTS} -jar "${PICARD}" \
"${CUR_STEP}" \
-I "${BAMFILE}" \
-O /dev/stdout \
-SORT_ORDER queryname \
-COMPRESSION_LEVEL 0 \
-VALIDATION_STRINGENCY SILENT \
| /usr/bin/java ${JAVAOPTS} -jar "${PICARD}" \
SamToFastq \
-I /dev/stdin \
-OUTPUT_PER_RG true \
-RG_TAG ID \
-OUTPUT_DIR "${OUT_DIR}" \
-VALIDATION_STRINGENCY SILENT
This is the error I am getting:
INFO 2020-10-02 19:13:52 RevertSam Reverted 1,116,000,000 records. Elapsed time: 02:40:02s. Time for last 1,000,000: 7s. Last read position: */*
INFO 2020-10-02 19:13:59 RevertSam Reverted 1,117,000,000 records. Elapsed time: 02:40:09s. Time for last 1,000,000: 6s. Last read position: */*
INFO 2020-10-02 19:14:05 RevertSam Reverted 1,118,000,000 records. Elapsed time: 02:40:15s. Time for last 1,000,000: 6s. Last read position: */*
[Fri Oct 02 19:14:21 CDT 2020] picard.sam.SamToFastq done. Elapsed time: 160.53 minutes.
Runtime.totalMemory()=2075918336
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.NullPointerException
at picard.sam.SamToFastq$FastqWriters.access$300(SamToFastq.java:500)
at picard.sam.SamToFastq.handleRecord(SamToFastq.java:314)
at picard.sam.SamToFastq.doWork(SamToFastq.java:206)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
[Fri Oct 02 19:14:21 CDT 2020] picard.sam.RevertSam done. Elapsed time: 160.53 minutes.
Runtime.totalMemory()=2076049408
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.util.RuntimeIOException: Write error; BinaryCodec in writemode; streamed file (filename not available)
at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:222)
at htsjdk.samtools.util.BinaryCodec.writeByteBuffer(BinaryCodec.java:188)
at htsjdk.samtools.util.BinaryCodec.writeByte(BinaryCodec.java:199)
at htsjdk.samtools.util.BinaryCodec.writeByte(BinaryCodec.java:203)
at htsjdk.samtools.util.BlockCompressedOutputStream.writeGzipBlock(BlockCompressedOutputStream.java:434)
at htsjdk.samtools.util.BlockCompressedOutputStream.deflateBlock(BlockCompressedOutputStream.java:409)
at htsjdk.samtools.util.BlockCompressedOutputStream.write(BlockCompressedOutputStream.java:305)
at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:212)
at htsjdk.samtools.BAMRecordCodec.encode(BAMRecordCodec.java:168)
at htsjdk.samtools.BAMFileWriter.writeAlignment(BAMFileWriter.java:144)
at htsjdk.samtools.SAMFileWriterImpl.close(SAMFileWriterImpl.java:210)
at picard.sam.RevertSam$RevertSamWriter.close(RevertSam.java:685)
at picard.sam.RevertSam.doWork(RevertSam.java:318)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:101)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at htsjdk.samtools.util.BinaryCodec.writeBytes(BinaryCodec.java:220)
... 16 more
Command exited with non-zero status 1
Command being timed: "/usr/bin/java -Xms2g -Xmx6g -XX:+UseSerialGC -Dpicard.useLegacyParser=false -jar /usr/bin/picard.jar RevertSam -I /RAW/WGS/test^LP6005117-DNA_G04^test_WGS/LP6005117-DNA_G04.bam -O /dev/stdout -SORT_ORDER queryname -COMPRESSION_LEVEL 0 -VALIDATION_STRINGENCY SILENT"
User time (seconds): 9148.97
System time (seconds): 212.47
Percent of CPU this job got: 97%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:40:34
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 2474540
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 1
Minor (reclaiming a frame) page faults: 13058533
Voluntary context switches: 226054
Involuntary context switches: 108827
Swaps: 0
File system inputs: 1111056
File system outputs: 300051776
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1
Have you tried BEDTools' bamtofastq?
Hi Kevin, thanks for your reply. This step is part of a pipeline I use for many projects, so I can't switch to another tool for it.
Pierre will likely pick this up when he logs in
Write failures are usually either network issues or disk space issues. Is your disk full or disk quota limit reached?
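A quick way to rule the disk-space explanation in or out is to check the free space on the filesystem that holds the output directory before starting the run. Below is a minimal sketch, assuming POSIX df and awk are available inside the container. The function name has_free_space and the threshold in the usage comment are hypothetical and not part of the original pipeline.

```bash
# Return 0 if the filesystem holding directory $1 has at least $2 KiB free.
has_free_space() {
    dir=$1
    need_kib=$2
    # df -P prints one line per filesystem in the POSIX format; -k reports
    # 1024-byte units. Column 4 of the second line is the available space.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge "$need_kib" ]
}

# Usage (the 200 GiB threshold is an arbitrary example value):
#   has_free_space "${OUT_DIR}" $((200 * 1024 * 1024)) \
#       || echo "less than 200 GiB free in ${OUT_DIR}" >&2
```

Note that in the posted command RevertSam writes to /dev/stdout, which feeds a pipe, so a disk check is only relevant to the SamToFastq output directory and to any temporary directory used for sorting.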
You are not really helping us by not including all necessary information in your question. For example, there are a dozen bash variables for which we don't know the values.
If you read the output log really carefully, you will see what I suspect is the cause of the failure. See this snippet:
Look at the input file name, it has ^ characters in a folder name:
I don't think it's because of how the folder is named. We name our folders as SampleName^Barcode^Project. That was certainly not a problem with the numerous other projects we ran through the same pipeline.
This particular BAM may be corrupted. Have you tried regenerating it or obtaining another copy?
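One cheap check, assuming truncation is the kind of corruption in question: a complete BAM ends with a fixed 28-byte empty BGZF block, the EOF marker defined in the SAM/BAM specification. The sketch below compares the last 28 bytes of a file against that marker using only tail, od and tr. The function name has_bgzf_eof is made up for illustration. Where samtools is installed, samtools quickcheck performs this test along with a header check and is the better choice.

```bash
# The 28-byte BGZF EOF marker as hex, in field order:
# gzip magic + flags, MTIME, XFL + OS, XLEN, "BC" subfield header,
# BSIZE, empty deflate block, CRC32 and ISIZE (both zero).
BGZF_EOF_HEX="1f8b0804""00000000""00ff""0600""42430200""1b00""0300""0000000000000000"

# Return 0 if file $1 ends with the BGZF EOF marker, non-zero otherwise.
has_bgzf_eof() {
    # od -v disables the "*" shorthand for repeated lines; tr strips the
    # spaces and newlines od puts between bytes.
    tail_hex=$(tail -c 28 "$1" 2>/dev/null | od -An -v -tx1 | tr -d '[:space:]')
    [ "$tail_hex" = "$BGZF_EOF_HEX" ]
}

# Usage:
#   has_bgzf_eof "${BAMFILE}" || echo "${BAMFILE} looks truncated" >&2
```

Passing this check only shows that the file was written to completion. It says nothing about whether the records inside are valid.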
What does ls -lh ${BAMFILE} show?
Is this an internally developed pipeline?
Yes. It is an internally developed pipeline. We have been using this for thousands of samples without any problem. Just encountered this error with these particular bam files.
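Since the same pipeline works on other samples, it may help to establish which stage fails first. In the log above, SamToFastq throws a NullPointerException and RevertSam reports "Broken pipe" within the same second. That is consistent with the reader exiting first and the writer failing as a consequence, although this is an interpretation and not something the log proves. Below is a minimal bash sketch of that failure mode, with hypothetical producer and consumer functions standing in for the two Picard commands:

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the two stages of the posted pipeline.
producer() { yes "record"; }                      # streams records until its pipe closes
consumer() { head -n 3 > /dev/null; return 1; }   # reads a few records, then fails

producer | consumer
# PIPESTATUS holds one exit status per pipeline stage; copy it immediately,
# because the next command overwrites it.
statuses=("${PIPESTATUS[@]}")

# The consumer fails with its own status (1). The producer fails only because
# its reader went away; in bash this typically shows up as 141 (128 + SIGPIPE).
echo "producer=${statuses[0]} consumer=${statuses[1]}"
```

If the real pipeline behaves the same way, the RevertSam trace is a symptom, and the SamToFastq NullPointerException is the error worth investigating for these particular BAM files.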
This is the output of ls -lh ${BAMFILE}: