Unzipping bz2 files
8.0 years ago
espop23 ▴ 60

Hello,

I have around 50 files that end with .bz2. What command can I run in the terminal to decompress them all at once? I have tried bzip2 -d NT*.bz2 (the filenames all start with "NT_..."), but it is rather slow.

Thanks

zip terminal • 4.3k views

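ls NT*.bz2 | xargs -I {} echo bzip2 -d {} | parallel -j8
This decompresses 8 files in parallel. Note the -d rather than -dc: with -c, bzip2 writes the decompressed data to stdout instead of creating files. You can also use parallel directly, without xargs: parallel -j8 bzip2 -d ::: NT*.bz2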
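If GNU parallel is not installed, xargs alone can do much the same job. The following is only a sketch: decompress_parallel is a made-up helper name, and it assumes an xargs that supports -0 and -P (GNU and BSD xargs do, but neither flag is required by POSIX).

```shell
# Decompress the given .bz2 files, running up to N bzip2 processes at once.
# Usage: decompress_parallel N file1.bz2 file2.bz2 ...
# (decompress_parallel is a hypothetical helper, not a standard command.)
decompress_parallel() {
    njobs="$1"
    shift
    # NUL-delimited names keep filenames containing spaces intact.
    # -n 1 hands one file to each bzip2 call; -P caps the concurrent calls.
    printf '%s\0' "$@" | xargs -0 -n 1 -P "$njobs" bzip2 -d
}
```

For the files in the question this would be called as decompress_parallel 8 NT*.bz2. As with plain bzip2 -d, each .bz2 file is replaced by its decompressed version.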


Hello espop23!

We believe that this post does not fit the main topic of this site.

Sorry, this isn't a bioinformatics question.

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!


I think people should refrain from closing posts that have bioinformatics relevance, especially after the post has been answered.

Unzipping lots of files is an extremely common task in bioinformatics, and it is annoying that the normal operation the OP would want does not work. Sending them away, especially when most people here can answer the question and it is relevant to day-to-day work in bioinformatics, does not help improve the content of the site.


A difference of opinion, I guess. Having come from the original Stack* sites, where protocol is more rigorously enforced, to me this is hugely off-topic. I don't particularly want Biostars to turn into a command-line helpdesk for people. I felt even better about closing it because there was an accepted answer and a second correct answer; there's no need to add to the thread any further, so why not close it?

Still abiding by the 'we'll be happy to talk about it' ;)

8.0 years ago
GenoMax 141k

Use the following.
bunzip2 *.bz2
If "slow" means that the files are processed serially, you could try starting a few jobs via a for loop and putting them in the background. This would probably still be slow, since you are likely running on a single computer and will saturate the I/O after a certain number of jobs.
If you have access to a cluster, you could start them as 50 independent jobs.
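The for-loop idea could look roughly like this. It is a sketch, not a tuned solution: decompress_batched is a made-up name, and the batch size is something you would have to pick by trial, since past a certain point the disk rather than the CPU is the limit.

```shell
# Decompress the given .bz2 files in batches of N background jobs.
# Usage: decompress_batched N file1.bz2 file2.bz2 ...
# (decompress_batched is a hypothetical helper, not a standard command.)
decompress_batched() {
    max_jobs="$1"
    shift
    running=0
    for f in "$@"; do
        bzip2 -d "$f" &              # start one decompression in the background
        running=$((running + 1))
        if [ "$running" -ge "$max_jobs" ]; then
            wait                     # let the current batch finish first
            running=0
        fi
    done
    wait                             # wait for the last, possibly partial, batch
}
```

Called as decompress_batched 4 NT*.bz2, this runs four bzip2 processes at a time. Each batch waits for its slowest file, so a proper job pool keeps the cores busier when file sizes vary a lot.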

8.0 years ago

You can use lbzip2, which is an improved, parallel implementation of bzip2/bunzip2. This should allow you to decompress the files using multiple processors.
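A minimal sketch of how that might be wrapped, assuming lbzip2 is on your PATH. The helper name decompress_lbzip2 is made up, and the fallback to plain bzip2 is my own addition so the function still works where lbzip2 is missing. The -n option sets the number of worker threads.

```shell
# Decompress the given .bz2 files with lbzip2 using N threads,
# falling back to single-threaded bzip2 when lbzip2 is not installed.
# Usage: decompress_lbzip2 N file1.bz2 file2.bz2 ...
# (decompress_lbzip2 is a hypothetical helper, not a standard command.)
decompress_lbzip2() {
    threads="$1"
    shift
    if command -v lbzip2 >/dev/null 2>&1; then
        lbzip2 -d -n "$threads" "$@"
    else
        echo "lbzip2 not found; falling back to bzip2" >&2
        bzip2 -d "$@"
    fi
}
```

For example: decompress_lbzip2 8 NT*.bz2. Here the parallelism is inside each file rather than across files, so it helps even when there are only a few large archives.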
