Question: Using xargs to tabix each line of a bed to a vcf
jvijai (United States) wrote, 6 weeks ago:

I want to write each line (region) of a BED file to its own VCF using tabix.

Here is what I attempted, but it's not working:

awk '{print $1":"($2+1)"-"$3}' CHR21_RegionsforBeagle.bed | xargs -n1 tabix -fh {} 21.ACANAFCR_sorted.vcf.gz >Chr{}.sorted.vcf

I am not sure I am using the {} in xargs properly.

The error I get:

[E::hts_open_format] Failed to open file {}

Could not read {}

ATpoint (Germany) wrote, 6 weeks ago:

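The tabix syntax is wrong: the file comes before the region, i.e. tabix (options) file.vcf.gz {regions}. In addition, xargs only substitutes {} when it is given -I {}; with -n1 alone the {} is passed to tabix literally, which is why tabix tried to open a file named {}:

awk '{print $1":"($2+1)"-"$3}' CHR21_RegionsforBeagle.bed | xargs -I {} tabix -fh 21.ACANAFCR_sorted.vcf.gz {}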

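However, the redirection from your command (> Chr{}.sorted.vcf) will not work as intended: the shell applies it once to the whole xargs command, so all output ends up in a single file literally named Chr{}.sorted.vcf. What will work is the same with parallel: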

awk '{print $1":"($2+1)"-"$3}' CHR21_RegionsforBeagle.bed | parallel "tabix -fh 21.ACANAFCR_sorted.vcf.gz {} > Chr{}.sorted.vcf"

You can use parallel's -j option to limit the number of simultaneous jobs to something reasonable, such as 10, to avoid excessive I/O operations on the same file.
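If parallel is not available, a per-region redirection can also be done with xargs alone by wrapping tabix in sh -c. The following is only a sketch under assumptions: the function names bed_to_regions and split_vcf_by_region are made up for illustration, the output naming simply mirrors the parallel command above, and it assumes tabix is on the PATH and the VCF is bgzipped and indexed.

```shell
#!/bin/sh
# Hypothetical helper: convert BED lines (0-based start) into tabix
# region strings (1-based start), exactly as the awk one-liner above does.
bed_to_regions() {
    awk '{print $1":"($2+1)"-"$3}' "$1"
}

# Hypothetical helper: write each BED region to its own VCF.
# -I {} tells xargs to substitute {} and to run one command per input line.
# sh -c is used so that the ">" redirection is evaluated once per region
# instead of once for the whole pipeline.
split_vcf_by_region() {
    bed=$1
    vcf=$2
    bed_to_regions "$bed" |
        xargs -I {} sh -c 'tabix -fh "$1" "$2" > "Chr$2.sorted.vcf"' sh "$vcf" {}
}

# Example call (commented out so that sourcing this file has no side effects):
# split_vcf_by_region CHR21_RegionsforBeagle.bed 21.ACANAFCR_sorted.vcf.gz
```

Inside the sh -c string, $1 is the VCF and $2 is the region, so neither value is pasted into the command text itself. This runs the regions one after another; the parallel version remains the simpler choice when several jobs should run at once.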
