Question: Help required to write a one liner to gunzip (and retain gzipped files) for multiple files.
0
3.6 years ago by
lakhujanivijay5.1k
India
lakhujanivijay5.1k wrote:

Hi,

I am trying to learn to write one-liners using shell/awk. To begin with, I want to perform the following operation with a one-liner.

Scenario: A directory containing multiple fastq files with the extension .fq.gz.

Objective: To automate gunzipping all the files while retaining the gzipped files, i.e. my directory will have *.fq.gz as well as *.fq files.

What I tried so far

I broke the problem into smaller pieces like this:

  1. find the files with extension *.fq.gz
  2. loop over the files
  3. pass files one by one to gunzip command
  4. use the option -c to retain the *.fq.gz files
  5. the output files should have the extension *.fq, so I split the file name on "." to get 3 parts from the original file name, i.e.

original file

test.fq.gz

after splitting

test [part 1]
fq [part 2]
gz [part 3]

Then take part 1 and append ".fq". Finally, I came up with the one-liner below:

for i in find -name  "*.fq.gz"; do gunzip -c $i > awk '{split($i,a,"."); print a[1] ".fq"}' ; done

However, it is not working; here are the errors:

gunzip: find.gz: No such file or directory
gunzip: {split($i,a,"."); print a[2] ".fq"}.gz: No such file or directory
gunzip: invalid option -- e

PS: I googled a lot without much success. This may be a very easy task but I am learning to automate using one liners.
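
The three errors above are consistent with how the shell parses that line. Without command substitution, `for i in find -name "*.fq.gz"` loops over the literal words `find`, `-name` and `*.fq.gz`, and `> awk '...'` redirects into a file named `awk` while handing the quoted awk program to gunzip as one more file name. Below is a minimal sketch of the loop that was probably intended, assuming a scratch directory and a made-up file `test.fq.gz` (both hypothetical), with the suffix stripped by the shell rather than by awk:

```shell
# Scratch directory with one small gzipped FASTQ-like file (hypothetical sample data).
workdir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/test.fq.gz"

# Without $(...), the loop iterates over the literal words, not over find's output.
literal_words=$(for i in find -name "*.fq.gz"; do printf '%s\n' "$i"; done)
printf 'the loop saw these words:\n%s\n' "$literal_words"

# Intended behaviour: substitute find's output and write each result next to its source.
# Assumes paths without whitespace, because the output of $(...) is split into words.
for i in $(find "$workdir" -name "*.fq.gz"); do
    gunzip -c "$i" > "${i%.gz}"   # ${i%.gz} strips the trailing .gz
done

ls "$workdir"
```

The original `test.fq.gz` stays in place because `-c` only writes to standard output.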

fastq gunzip • 1.7k views
ADD COMMENTlink modified 3.6 years ago by WouterDeCoster44k • written 3.6 years ago by lakhujanivijay5.1k
2
3.6 years ago by
dyollluap300
USA, California, Bay Area
dyollluap300 wrote:

Try

for file in *.gz; do gunzip -k $file ; done

You shouldn't need awk in this situation. By default, gunzip names the uncompressed file the same as the .gz file, minus the .gz suffix. I think you want the gunzip -k flag to keep the original .fq.gz (see gunzip --help or man gunzip).
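
A quick way to check that behaviour on throwaway data is sketched below. It assumes a gzip release that supports `-k` (older releases do not) and uses the hypothetical file names `a.fq.gz` and `b.fq.gz` in a scratch directory:

```shell
# Scratch directory with two hypothetical gzipped files.
demo=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$demo/a.fq.gz"
printf '@r2\nTTGA\n+\nIIII\n' | gzip -c > "$demo/b.fq.gz"

# -k keeps each .gz file; the output name is the input name minus .gz.
# Quoting "$file" protects names that contain spaces.
for file in "$demo"/*.gz; do
    gunzip -k "$file"
done

ls "$demo"
```

After the loop, the directory should hold each `.fq` file next to its `.fq.gz` source.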

ADD COMMENTlink modified 3.6 years ago by WouterDeCoster44k • written 3.6 years ago by dyollluap300

Oh .. I forgot -k.

ADD REPLYlink written 3.6 years ago by shenwei3565.2k

No -k option. Do I have an older version?

for file in *.gz; do gunzip -k $file ; done
gunzip: invalid option -- k
gunzip 1.3.5
(2002-09-30)
usage: gunzip [-cdfhlLnNrtvV19] [-S suffix] [file ...]
 -c --stdout      write on standard output, keep original files unchanged
 -d --decompress  decompress
 -f --force       force overwrite of output file and compress links
 -h --help        give this help
 -l --list        list compressed file contents
 -L --license     display software license
 -n --no-name     do not save or restore the original name and time stamp
 -N --name        save or restore the original name and time stamp
 -q --quiet       suppress all warnings
 -r --recursive   operate recursively on directories
 -S .suf  --suffix .suf     use suffix .suf on compressed files
 -t --test        test compressed file integrity
 -v --verbose     verbose mode
 -V --version     display version number
 -1 --fast        compress faster
 -9 --best        compress better
    --rsyncable   Make rsync-friendly archive
 file...          files to (de)compress. If none given, use standard input.
Report bugs to <bug-gzip@gnu.org>.
gunzip: invalid option -- k
gunzip 1.3.5
(2002-09-30)
ADD REPLYlink written 3.6 years ago by lakhujanivijay5.1k
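
The usage text above lists `-c` but no `-k`, so on such a release redirecting the output of `-c` is the route that remains. A hedged sketch that probes for `-k` first and otherwise falls back to `-c` follows; the function name `gunzip_keep` and the sample file are made up for illustration, and probing the `--help` text for `--keep` is an assumption about how the installed gzip describes its options:

```shell
# gunzip_keep FILE.gz: decompress to FILE while leaving FILE.gz in place.
gunzip_keep() {
    src=$1
    if gunzip --help 2>&1 | grep -q -e '--keep'; then
        gunzip -k "$src"                  # help text mentions --keep: use -k
    else
        gunzip -c "$src" > "${src%.gz}"   # otherwise write stdout to the stripped name
    fi
}

# Hypothetical sample data in a scratch directory.
sample=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$sample/reads.fq.gz"

for f in "$sample"/*.fq.gz; do
    gunzip_keep "$f"
done

ls "$sample"
```

Either branch leaves `reads.fq.gz` untouched and creates `reads.fq` beside it.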

If you use a gunzip with the -k option, why not simply do gunzip -k *.fq.gz?

ADD REPLYlink written 3.6 years ago by Lars Juhl Jensen11k
0
3.6 years ago by
shenwei3565.2k
China
shenwei3565.2k wrote:

Without find:

 for f in *.fq.gz; do gzip -d -c $f > ${f/\.gz/}; done

With find:

for f in $( find ./ -name "*.fq.gz" ); do gzip -d -c $f > ${f/\.gz/}; done

I tried find -exec but failed.

find ./ -name "*.fq.gz" -exec gzip -d -c {} > ${"{}"/\.gz/} \;
bash: ${"{}"/\.gz/}: bad substitution
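
The `bad substitution` message comes from bash itself: it tries to expand the `${...}` expression and set up the redirection before `find` ever runs, and `"{}"` is not a valid variable name. One common workaround, sketched here with hypothetical sample files, is to let `find` start a small `sh -c` script so that the redirection and the suffix stripping happen once per file inside that inner shell:

```shell
# Hypothetical sample tree, including a directory name with a space.
root=$(mktemp -d)
mkdir -p "$root/run 1"
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "$root/top.fq.gz"
printf '@r2\nTTGA\n+\nIIII\n' | gzip -c > "$root/run 1/nested.fq.gz"

# find passes the matched paths as arguments to the inner shell;
# "for f do" loops over those arguments, and ${f%.gz} drops the trailing .gz.
find "$root" -name "*.fq.gz" -exec sh -c '
    for f do
        gzip -d -c "$f" > "${f%.gz}"
    done
' sh {} +

find "$root" -type f | sort
```

Because the paths arrive as separate arguments instead of being word-split, names with spaces are handled too.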

Ref: Advanced Bash-Scripting Guide: Manipulating Strings

ADD COMMENTlink modified 3.6 years ago • written 3.6 years ago by shenwei3565.2k

Can you please explain this one-liner?

 for f in *.fq.gz; do gzip -d -c $f > ${f/\.gz/}; done
  1. How does gzip help here instead of gunzip?
  2. What does the ${f/\.gz/} part do?
ADD REPLYlink written 3.6 years ago by lakhujanivijay5.1k

gzip can both compress and decompress; gzip -d does the same job as gunzip.

For ${f/\.gz/}, see the reference: it is string replacement in the shell.

ADD REPLYlink modified 3.6 years ago • written 3.6 years ago by shenwei3565.2k
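
To make the `${f/\.gz/}` part concrete: it is bash pattern substitution, which replaces the first match of `.gz` in `$f` with nothing. The sketch below uses made-up file names and sets it next to the POSIX suffix-removal form `${f%.gz}`, which strips `.gz` only at the end of the name. The bash-only form is run through `bash -c` so the snippet also works when started from a plain `sh`:

```shell
f="sample.fq.gz"
g="my.gz.data.fq.gz"   # hypothetical name that contains .gz twice

# Bash pattern substitution: remove the first occurrence of ".gz".
subst_f=$(f="$f" bash -c 'printf "%s" "${f/\.gz/}"')
subst_g=$(f="$g" bash -c 'printf "%s" "${f/\.gz/}"')

# POSIX suffix removal: strip ".gz" only from the end of the name.
strip_f=${f%.gz}
strip_g=${g%.gz}

printf '%s -> substitution: %s | suffix removal: %s\n' "$f" "$subst_f" "$strip_f"
printf '%s -> substitution: %s | suffix removal: %s\n' "$g" "$subst_g" "$strip_g"
```

For `sample.fq.gz` both forms give `sample.fq`. For the second name, the substitution yields `my.data.fq.gz` while suffix removal yields `my.gz.data.fq`, which is why the suffix form is the safer choice for output names.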
0
3.6 years ago by
Belgium
WouterDeCoster44k wrote:

Let me throw in a GNU parallel solution, which isn't so different from the other solutions:

ls *.gz | parallel 'gunzip -k {}'
ADD COMMENTlink written 3.6 years ago by WouterDeCoster44k