Question: Why does this pipe work?
6 months ago by
nkinney0620
nkinney0620 wrote:

While looking for a way to check whether a BAM file is truncated, I noticed that this does the job very quickly:

samtools view file.bam | 1>/dev/null

The thing is, I typed the pipe by accident. The command doesn't make sense to me, but when you remove the pipe the command takes much longer (I'm not sure it ever finishes). My question is: why does this work, and what is going on?

bash software error • 425 views
ADD COMMENTlink modified 6 months ago by John12k • written 6 months ago by nkinney0620
6 months ago by
United States / Los Angeles / ALAPY.com
Petr Ponomarenko2.4k wrote:

When you run it without the pipe symbol "|":

samtools view file.bam 1>/dev/null

the 1 is interpreted as the file descriptor for stdout, so it is exactly the command we are all used to:

samtools view file.bam > /dev/null

If you replace "/dev/null" with "your.SAM", this converts the BAM file (without its header) into a SAM file. It takes a very long time. /dev/null just discards all the data written to it, but it first has to receive that data from samtools, so the whole file is read anyway.

When you add the "|" pipe symbol the way you mentioned:

samtools view file.bam | 1>/dev/null

Your first command tries to read the whole BAM file without the header and pipe it (as a kind of SAM file) via stdout into a second command, which in your case is "1". Most likely you do not have a program or script called "1" on your system, so the second command will end immediately with an error, and since it is the last command in your pipe before the redirect, it will terminate the whole pipe. What you see on your screen is what the first command had time to write to stderr as an error message, since stdout was redirected into the pipe. When the BAM file is truncated, stderr contains a message saying so; when it is a valid BAM file, stderr is empty, so there is no message. You can test all this by running:

samtools view file.bam | biostars

Since you most likely do not have a program called "biostars" on your system, it will behave almost like your "samtools view file.bam | 1>/dev/null", except that it will clearly tell you that no "biostars" command was found. Then you can run the command

samtools view file.bam 2> err.txt

and terminate it immediately with Control+C, since it will try to print the whole SAM file to your stdout. (Note that there must be no space between 2 and >; with a space, 2 is passed to samtools as an argument and stdout, not stderr, is redirected.) Now take a look at the contents of err.txt. It will be empty for a proper BAM file and will contain an error message for a truncated one. This means that samtools had already raised the error before you terminated it manually, a moment after it started to read the BAM file. So if you do

samtools view file.bam 2> err.txt | wow

you will find out from err.txt whether the BAM file is normal or truncated, provided you do not have a program called "wow" on your system =)

ADD COMMENTlink modified 6 months ago • written 6 months ago by Petr Ponomarenko2.4k
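
The "terminates the whole pipe" part can be tried without samtools. This is only a sketch: `yes` stands in for any program that writes a lot to stdout, and `status_file` and `producer_status` are names made up for the illustration. When the right-hand side of the pipe exits at once, the read end of the pipe is closed, the next write by the left-hand program fails, and that program stops early.

```shell
status_file=$(mktemp)

# `yes` would print forever, but the right-hand side exits immediately and
# closes the pipe, so a later write by `yes` fails and it stops.
# The subshell records the exit status of `yes` in a file.
( yes; echo "$?" >"$status_file" ) | 1>/dev/null

producer_status=$(cat "$status_file")
echo "yes exited with status $producer_status"
rm -f "$status_file"
```

In bash the reported status is typically 141 (128 + 13, i.e. killed by SIGPIPE), although this depends on how signals are set up in your environment.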
samtools view file.bam | 1>/dev/null
  

Your first command tries to read the whole bam file without the header and pipe it (kind of sam file) via stdout into a second command, in your case it is "1". Most likely you do not have program or script called "1" in your system. So the second command will end immediately with an error

Actually, no: 1 is taken as a command name only if a space is inserted after it (unlike in the original example).

$ samtools view file.bam | 1>/dev/null
$ samtools view file.bam | 1 >/dev/null
-bash: 1: command not found
ADD REPLYlink modified 6 months ago • written 6 months ago by Charles Plessy2.3k
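
The difference comes from how the shell splits the line into tokens: a digit written immediately before `>` is read as a file descriptor number, while a digit followed by a space is an ordinary word and therefore a command name. A minimal sketch, assuming there is no executable called `1` on your PATH (the variable names are only for illustration):

```shell
tmp=$(mktemp)

# No space: "1>" means "redirect file descriptor 1"; there is no command word.
echo hello | 1>"$tmp"
status_nospace=$?

# With a space: "1" is a word, so the shell looks for a command named "1".
# 2>/dev/null hides the "command not found" message.
status_space=0
echo hello | 1 >"$tmp" 2>/dev/null || status_space=$?

echo "no space: $status_nospace, with space: $status_space"
rm -f "$tmp"
```

Under that assumption the first pipeline should report status 0 and the second 127, the shell's "command not found" status.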

This is an interesting point. The output is different in appearance, but my understanding was that the metacharacters | > space tab < ; & ( ) are all parsed by bash to separate words, so 1 would be treated as a command in both cases. We probably need someone who understands bash better to explain why stderr is printed in one situation but not the other, and whether 1 is not a command when there is no space between 1 and >.

ADD REPLYlink modified 6 months ago • written 6 months ago by Petr Ponomarenko2.4k

That would be a good question for UNIX StackExchange. I tried

echo "Hello" 1>tmp #cat tmp yields "Hello"
echo "Hello" 2>tmp >&2 #cat tmp yields "Hello", but this is from stderr
echo "Hello" | 1>tmp #cat tmp yields nothing
echo "Hello" | 2>tmp #cat tmp yields nothing
echo "Hello" >&2 | 2>tmp #"Hello" is printed to stderr (console), cat tmp yields nothing since pipe only uses stdout

Nowhere do I see an error, so I'm sure 1> is not being taken as a command anywhere.

ADD REPLYlink written 6 months ago by Ram12k

It's because the stuff after the | isn't being fed to a command, or rather, it's being fed to a null command, the output of which (which is nothing) is sent to tmp. So the 1>tmp bit is working exactly as you expect, it's just that there's an invisible null command between the | and the 1. "true" and "false" are also both null commands, so you can replicate with them:

echo "Hello" | false 1>tmp #cat tmp yields nothing
ADD REPLYlink modified 6 months ago • written 6 months ago by John12k

As I understand it, the command being executed is equivalent to the no-op builtin `:`. Bash continues to amaze me.

ADD REPLYlink modified 6 months ago • written 6 months ago by Ram12k
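
A quick way to compare the two forms side by side. This is a sketch with made-up file names; it only shows that a bare redirection and an explicit `:` leave the same result behind.

```shell
tmp_dir=$(mktemp -d)

echo "Hello" | 1>"$tmp_dir/bare"      # redirection only, no command word
echo "Hello" | : 1>"$tmp_dir/colon"   # explicit no-op builtin

# Both files are created and both are empty: nothing reads the pipe and
# nothing writes to stdout on the right-hand side.
bare_size=$(wc -c <"$tmp_dir/bare" | tr -d '[:space:]')
colon_size=$(wc -c <"$tmp_dir/colon" | tr -d '[:space:]')
echo "bare: $bare_size bytes, colon: $colon_size bytes"
rm -rf "$tmp_dir"
```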
6 months ago by
Charles Plessy2.3k
Japan
Charles Plessy2.3k wrote:

Actually, your command does not work: if you replace /dev/null with a file name, you will see that the resulting file is empty when the pipe symbol is present.

That said, I do not understand why it is not a syntax error.

ADD COMMENTlink written 6 months ago by Charles Plessy2.3k

Why is it not a syntax error? Because >/dev/null is a valid shell expression in itself. The way the bash parser is implemented, it seems to accept something like:

shell-exp -> shell-exp [ | shell-exp]
shell-exp -> redirection   # e.g. >/dev/null
shell-exp -> variable-exp # e.g. $VAR
...
ADD REPLYlink modified 6 months ago • written 6 months ago by Michael Dondrup43k
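
To see that the parser really accepts a redirection as a complete pipeline stage, but not an empty stage, you can compare the two. A small sketch; it runs each line in a child `sh` so that a syntax error does not abort the current shell, and the variable names are only illustrative.

```shell
# A redirection on its own is accepted as the last stage of a pipeline.
redir_status=0
sh -c 'echo hi | >/dev/null' || redir_status=$?

# Nothing at all after the pipe is rejected as a syntax error.
empty_status=0
sh -c 'echo hi | ;' 2>/dev/null || empty_status=$?

echo "redirection only: $redir_status, empty stage: $empty_status"
```

The first status should be 0 and the second non-zero; the exact non-zero value depends on which shell `sh` is.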

Thanks! There is also an interesting link to an answer on StackOverflow, posted by Alex Reynolds in an answer that mysteriously disappeared from this discussion.

ADD REPLYlink modified 6 months ago • written 6 months ago by Charles Plessy2.3k

If you want a shell that is more strict and throws an error in this case, try csh:

% echo hi | > /dev/null
 Invalid null command.
ADD REPLYlink modified 6 months ago • written 6 months ago by Michael Dondrup43k

Why would someone switch to csh from any Bourne-flavor shell?

ADD REPLYlink written 6 months ago by Ram12k
6 months ago by
John12k
Germany
John12k wrote:

To do this without using samtools, you can do something like:
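
One possible way, offered as an assumption and not as the snippet originally posted here: a complete BAM file ends with a fixed 28-byte empty BGZF block, so you can compare the last 28 bytes of the file against it. The function name `check_bam_eof` is made up for this sketch.

```shell
# The 28-byte BAM end-of-file marker (an empty BGZF block), written as hex.
bam_eof_hex=$(printf '%s' "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00 1b 00 03 00 00 00 00 00 00 00 00 00" | tr -d ' ')

# Prints "ok" if the file ends with the marker, "truncated" otherwise.
check_bam_eof() {
    tail_hex=$(tail -c 28 "$1" | od -An -v -tx1 | tr -d '[:space:]')
    if [ "$tail_hex" = "$bam_eof_hex" ]; then
        echo "ok"
    else
        echo "truncated"
    fi
}
```

Usage would be `check_bam_eof file.bam`. It only looks at the tail of the file, so it is fast but says nothing about the rest of the contents.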

Also note that this is a bad way to check file integrity, which is ultimately what a truncation check is about. One should validate the BAM fully and then produce a checksum. Before anything meaningful is calculated from the BAM file, the checksum should be regenerated and compared. I imagine many tools will put an EOF marker on the end of a prematurely finished BAM file. Two files concatenated together will also have the correct EOF marker. For all intents and purposes, EOF markers are a dumb way to infer the integrity of a file.

ADD COMMENTlink modified 6 months ago • written 6 months ago by John12k
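
The checksum workflow described above could look roughly like this. It is a sketch that assumes a `sha256sum` with `-c` support (as in GNU coreutils) is available; the function names and the `.sha256` suffix are made up for the example.

```shell
# Record a checksum once the BAM file has been fully validated.
record_checksum() {
    sha256sum "$1" >"$1.sha256"
}

# Re-check before using the file; succeeds only if the contents are unchanged.
# The path must be given the same way as when the checksum was recorded.
verify_checksum() {
    sha256sum -c "$1.sha256" >/dev/null 2>&1
}
```

For example, run `record_checksum file.bam` right after validation, and later `verify_checksum file.bam || echo "file.bam has changed"` before any analysis.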
Powered by Biostar version 2.3.0