[main_samview] fail to read the header from "-".
1
0
Entering edit mode
5 days ago
J • 0

I am attempting to run a file through an algorithm I have been using, HLA*LA. On running the samtools command within the algorithm, I have unfortunately been getting this error. After trying to debug this following other guides, I am seeking assistance here as I have yet to find a solution.

The script is attempting to run the following command:

samtools view -@ 3  IN.cram '*' | awk '{if ($3 == "*") print $0}' | samtools view -bo UNMAPPED.bam -

The specific error that I am getting is the following:

[main_samview] fail to read the header from "-".

When I print out the header using samtools, there is output and the error is not reproduced. When I solely run the first samtools command and the awk command, there is no error or output meaning the issue is with the last samtools command.

Update: Outside of the algorithm, I was able to produce a bam output without issues when using:

samtools -b IN.cram > OUT.bam
HLA-LA samtools • 186 views
ADD COMMENT
1
Entering edit mode

You need to add samtools view -h

 -h, --with-header          Include header in SAM output
ADD REPLY
0
Entering edit mode

I edited the command to:

samtools view -h -@ 3  IN.cram '*' | awk '{if ($3 == "*") print $0}' | samtools view -bo UNMAPPED.bam -

The error occurred again when running the second samtools command. If helpful, I am using samtools version 1.10

ADD REPLY
1
Entering edit mode

Your awk step is filtering most of lines out since you are only printing those which match * in third field. You will either need to let the header lines pass through or filter the file with first two steps. Add the header to the filtered file (you can grab it from original by doing samtools view -H and then create the BAM you need.

ADD REPLY
0
Entering edit mode

The code is meant to identify and snag all the unmatched reads for paired-end data. Would this error just mean that there are no unmatched reads? The file went through a decent amount of QC.

ADD REPLY
1
Entering edit mode
5 days ago
GenoMax 106k

Error is because your awk code is stripping out the header lines (generated by samtools view -h). The header lines are needed when you try to write a valid BAM file with second view command.

You can do this.

Grab the header from your file

samtools view -H your.cram > header.file

Filter your file using your original command

samtools view  -@ 3  IN.cram '*' | awk '{if ($3 == "*") print $0}'  > filtered.sam

Cat the header to top of this filtered file and create your BAM.

cat header.file filtered.sam | samtools view -bo UNMAPPED.bam - 
ADD COMMENT
0
Entering edit mode

Awesome, thanks, that fixed it.

ADD REPLY

Login before adding your answer.

Traffic: 2153 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6