Question: bgzip all VCFs in a directory
stevenlang123 (United States) wrote, 5.5 years ago:

Hi guys,

I've been trying to bgzip around 100 VCF files in parallel. The jobs are submitted and files get created, but something is definitely wrong.

So far I've been trying:

for file in *.vcf
do
    bsub /foo/bar/bgzip $file > $file.gz
done

What is the correct way to do this?

Thanks in advance!

sequencing seq • 5.6k views
Pierre Lindenbaum (France/Nantes/Institut du Thorax - INSERM UMR1087) wrote, 5.5 years ago:

Use GNU parallel

parallel bgzip {} ::: *.vcf

or xargs:

ls *.vcf | xargs -P 10 bgzip

When I use ls *.vcf | xargs -P 10 bgzip it only compresses the first file in the folder. Using -n1 (=use at most 1 argument per command line) instead of -P 10 worked for me:

ls *.vcf | xargs -n1 bgzip
written 2.7 years ago by Scleroz30

You can combine both options (xargs -n1 -P0): -n1 dispatches one file (line) per bgzip process, and -P0 lets xargs run as many processes at a time as it can.

written 20 months ago by _r_am32k
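
Putting the two xargs options from the replies above together, a minimal sketch could look like the following. The function name compress_vcfs and the variables JOBS and COMPRESS_CMD are illustrative and not from the thread. The sketch also swaps ls for find -print0 with xargs -0 so that filenames containing spaces survive the pipe, and it assumes bgzip is on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: compress every .vcf file in a directory, one file per process.
# compress_vcfs, JOBS and COMPRESS_CMD are hypothetical names for illustration.
compress_vcfs() {
    local dir="${1:-.}"                  # directory to scan (default: current)
    local jobs="${JOBS:-4}"              # maximum number of parallel processes
    local cmd="${COMPRESS_CMD:-bgzip}"   # compressor to run on each file

    # find -print0 and xargs -0 pass filenames NUL-separated, so spaces are safe.
    # -n 1 hands exactly one file to each invocation; -P caps the parallelism.
    find "$dir" -maxdepth 1 -type f -name '*.vcf' -print0 |
        xargs -0 -n 1 -P "$jobs" "$cmd"
}
```

Calling compress_vcfs /path/to/vcfs would then replace each file.vcf with file.vcf.gz, since bgzip compresses in place when given a filename. Setting JOBS=10 mirrors the -P 10 from the original answer.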

Is there any way to do this using an LSF manager to split the jobs up instead? The problem I'm having is that bgzip requires redirecting its output to a file, so

bsub < bgzip $file > $file.gz

does not work

modified 22 months ago by _r_am32k • written 5.5 years ago by stevenlang123180

Never mind, working great Pierre! Thanks!

written 5.5 years ago by stevenlang123180
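
On the LSF question: in a command like bsub bgzip $file > $file.gz, the local shell applies the redirection to bsub itself, so the .gz file receives bsub's own output rather than compressed data. Likewise, bsub < bgzip makes the shell try to read a file named bgzip as bsub's standard input. One possible workaround, sketched below, is to pass the whole command as a single quoted string so the redirection runs inside the job. This assumes bsub accepts a quoted command string and that bgzip -c writes to standard output; submit_bgzip_jobs and SUBMIT_CMD are hypothetical names, and paths containing single quotes are not handled.

```shell
#!/usr/bin/env bash
# Sketch: submit one LSF job per VCF file.
# submit_bgzip_jobs and SUBMIT_CMD are hypothetical names for illustration;
# SUBMIT_CMD lets you substitute echo for a dry run.
submit_bgzip_jobs() {
    local dir="${1:-.}"                  # directory holding the .vcf files
    local submit="${SUBMIT_CMD:-bsub}"   # job submission command
    local file

    for file in "$dir"/*.vcf; do
        [ -e "$file" ] || continue       # skip when the glob matches nothing
        # The command is one quoted argument, so '>' is interpreted by the
        # job's shell on the execution host, not by the submitting shell.
        "$submit" "bgzip -c '$file' > '$file.gz'"
    done
}
```

Running SUBMIT_CMD=echo submit_bgzip_jobs . prints the commands that would be submitted without contacting the scheduler. Because bgzip already writes file.vcf.gz next to the input when called without -c, a plain bsub bgzip "$file" with no redirection may be enough.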