Question: Using xargs to tabix each line of a bed to a vcf
Asked 13 months ago by jvijai (United States):

I want to write each line (region) of a BED file to its own VCF using tabix.

Here is what I was attempting, but it's not working:

awk '{print $1":"($2+1)"-"$3}' CHR21_RegionsforBeagle.bed | xargs -n1 tabix -fh {} 21.ACANAFCR_sorted.vcf.gz >Chr{}.sorted.vcf

I am not sure I am using the {} in xargs properly.

The error I get:

[E::hts_open_format] Failed to open file {}

Could not read {}

Answer by ATpoint, 13 months ago:

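The syntax of tabix is wrong. It needs to be tabix (options) file.vcf.gz {regions}. Also, xargs only substitutes {} when it is given -I{} (which already implies one input line per call, so -n1 is not needed):

awk '{print $1":"($2+1)"-"$3}' CHR21_RegionsforBeagle.bed | xargs -I{} tabix -fh 21.ACANAFCR_sorted.vcf.gz {}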

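Appending > Chr{}.sorted.vcf to this command will not give one file per region, though, because the shell sets up that redirection once for the whole pipeline before xargs ever runs. What will work is the same with parallel: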

awk '{print $1":"($2+1)"-"$3}' CHR21_RegionsforBeagle.bed | parallel "tabix -fh 21.ACANAFCR_sorted.vcf.gz {} > Chr{}.sorted.vcf"

You can use the parallel parameter -j to limit the number of parallel jobs to something reasonable like maybe 10 to avoid excessive I/O operations on the same file.
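
If GNU parallel is not available, per-region output files can also be produced with xargs alone by wrapping the command and its redirection in sh -c. The following is a minimal sketch, not a tested pipeline for the files in this thread: the directory region_vcfs_demo and the two BED lines are made up for illustration, and echo stands in for tabix so that the sketch runs without tabix or the VCF.

```shell
#!/bin/sh
set -eu

# Hypothetical demo input: two made-up BED lines (0-based start, exclusive end).
outdir=region_vcfs_demo
mkdir -p "$outdir"
printf '21\t999\t2000\n21\t4999\t6000\n' > "$outdir/demo.bed"

# awk turns each BED line into a region string chrom:start+1-end.
# xargs -I{} runs one sh -c per line; the region arrives as $1, and because the
# ">" is inside the sh -c script, each call opens its own output file.
# echo is a stand-in here; the real call would be:
#   tabix -h 21.ACANAFCR_sorted.vcf.gz "$1" > "$2/Chr$1.sorted.vcf"
awk '{print $1":"($2+1)"-"$3}' "$outdir/demo.bed" |
  xargs -I{} sh -c 'echo "region $1" > "$2/Chr$1.sorted.vcf"' _ {} "$outdir"

ls "$outdir"
```

Passing the region as a positional argument ($1) rather than pasting {} into the sh -c string keeps the quoting simple. GNU and BSD xargs also accept -P N to run up to N calls at once, comparable to parallel -j N, although -P is not part of POSIX.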
