Simple Redirection, I/O Problem With Bedtools
10.0 years ago

Hi guys, just a quick question. It's more of a Bash question than a bioinformatics one, though it involves bedtools.

I mostly pipe bedtools input and output. Here's a typical scenario:

sed 1d fileA.bed | intersectBed -a stdin -b peaks.bed | intersectBed -u -a stdin -b fileB.bed

Now, the problem is that fileB also has a header line, which intersectBed reports as an error (makes sense: non-integer start).

How can I remove the first line (the header) of fileB on the fly in the pipe?

Thanks

bedtools bash pipeline linux intersect
10.0 years ago
Ryan Dale 5.0k

As of 2.13.0 bedtools supports FIFOs, so you can use bash process substitution:

intersectBed -a <(sed 1d fileA.bed) -b peaks.bed | intersectBed -u -a stdin -b <(sed 1d fileB.bed)

If I am right, this is a bash trick that works with any program, irrespective of bedtools and its version. But since it is shell dependent, you cannot use it in the C shell.

That's right, FIFOs are a shell technology, not a bedtools technology. What @Daler is describing is that, owing to a silly mistake, early versions of bedtools were incapable of using FIFOs as input.
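For anyone who wants to see what the shell actually hands to the program, here is a minimal sketch that needs bash but not bedtools. The file name demo.bed and its contents are made up for illustration, and cat stands in for any tool that takes a filename argument.

```shell
#!/usr/bin/env bash
# Hypothetical 3-line BED file with a header, for illustration only
workdir=$(mktemp -d)
printf 'chrom\tstart\tend\nchr1\t10\t20\nchr1\t30\t40\n' > "$workdir/demo.bed"

# The shell replaces <(...) with a path; the program just opens that path
# (commonly something like /dev/fd/N, or a named FIFO on some systems)
path_seen=$(echo <(sed 1d "$workdir/demo.bed"))

# cat stands in for intersectBed: it reads the header-free stream as a file
body=$(cat <(sed 1d "$workdir/demo.bed"))
line_count=$(printf '%s\n' "$body" | wc -l | tr -d ' ')

echo "path handed to the program: $path_seen"
echo "lines left after dropping the header: $line_count"

rm -rf "$workdir"
```

Since the program only ever receives a path, it has to be able to read a non-seekable stream, which is presumably where the older bedtools versions fell over.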

More generally, sed can extract any part of a multiline text file between lines n and m, and you can do a similar trick by piping head and tail; see e.g. http://linux.byexamples.com/archives/130/head-and-tail/
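A small sketch of both approaches, using a made-up five-line file (demo.txt is hypothetical) with n=2 and m=4:

```shell
# Hypothetical 5-line input, for illustration only
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\nline4\nline5\n' > "$workdir/demo.txt"

# sed: -n suppresses default output, '2,4p' prints only lines 2 through 4
with_sed=$(sed -n '2,4p' "$workdir/demo.txt")

# head and tail: keep the first 4 lines, then the last 3 of those (m - n + 1 = 3)
with_head_tail=$(head -n 4 "$workdir/demo.txt" | tail -n 3)

# Dropping only a header is the special case "start output at line 2"
no_header=$(tail -n +2 "$workdir/demo.txt")

printf '%s\n' "$with_sed"

rm -rf "$workdir"
```

For the header case in this thread, sed 1d and tail -n +2 should give the same result.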

10.0 years ago
Fwip ▴ 490

Have you tried using named pipes? Your commands would basically be the same as those listed by Damian, except with an additional mkfifo command at the front.

mkfifo tmp_pipe ;
sed 1d fileB.bed > tmp_pipe &
sed 1d fileA.bed | intersectBed -a stdin -b peaks.bed | intersectBed -u -a stdin -b tmp_pipe ;
rm -f tmp_pipe ;


(I split this into separate lines for ease of reading).

Edit: lh3 is right, you need to write to the pipe in a background process; otherwise the shell stalls on that line, because opening a FIFO for writing blocks until something opens it for reading. The commands have been updated. Thanks for the correction!

One small thing: you need to put the second line in the background, i.e. you need "&" at the end of the 2nd line.
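To try the background-writer pattern without bedtools installed, here is a minimal sketch. The file contents are made up, cat stands in for the intersectBed call that reads tmp_pipe, and the pipe lives in a mktemp directory (my own choice) so it cannot clash with an existing file.

```shell
# Hypothetical fileB.bed with a header line, for illustration only
workdir=$(mktemp -d)
printf 'chrom\tstart\tend\nchr1\t10\t20\nchr1\t30\t40\n' > "$workdir/fileB.bed"

mkfifo "$workdir/tmp_pipe"

# Writer goes to the background: opening the FIFO for writing blocks
# until a reader opens the other end
sed 1d "$workdir/fileB.bed" > "$workdir/tmp_pipe" &

# Reader: cat stands in for "intersectBed ... -b tmp_pipe"
piped=$(cat "$workdir/tmp_pipe")

# Collect the background sed before cleaning up
wait

printf '%s\n' "$piped"

rm -rf "$workdir"
```

Without the "&", the script would sit on the sed line forever, since the reader on the next line would never get a chance to start.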

10.0 years ago

Might be easier to just use several commands separated by semicolons:

sed 1d fileB.bed > fileB.temp.bed; sed 1d fileA.bed | intersectBed -a stdin -b peaks.bed | intersectBed -u -a stdin -b fileB.temp.bed; rm -f fileB.temp.bed
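A slightly more defensive variant of the same idea, as a sketch only: mktemp picks a unique name so an existing fileB.temp.bed is never overwritten, and && keeps the next step from running if sed fails. The input file here is made up, and wc and head stand in for the intersectBed step.

```shell
# Hypothetical fileB.bed with a header line, for illustration only
workdir=$(mktemp -d)
printf 'chrom\tstart\tend\nchr2\t5\t15\n' > "$workdir/fileB.bed"

# Unique temporary file name instead of a fixed fileB.temp.bed
tmp_bed=$(mktemp "$workdir/fileB.XXXXXX")

# && runs the next step only if sed succeeded;
# wc stands in for the intersectBed call that would read "$tmp_bed"
sed 1d "$workdir/fileB.bed" > "$tmp_bed" && kept=$(wc -l < "$tmp_bed" | tr -d ' ')
first=$(head -n 1 "$tmp_bed")

echo "records kept: $kept"

# Clean up regardless of what happened above
rm -rf "$workdir"
```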

