bedops wig2bed 'Row begins with a tab or space' error
4 weeks ago
a_bis ▴ 10

Hi everyone,

I've been trying to convert a .wig file to a .bed file using bedops' wig2bed function and I get the following error:

Row begins with a tab or space at line 1 in -.

I have tried running it with and without the --zero-indexed option, as well as alternatively running convert2bed instead, but the error message remains unchanged. I'm not sure what the error message refers to, as line 1, as far as I can see, doesn't begin with a tab or space:

head <file.wig>

Is there anything apparently wrong with the format of the .wig file that's leading to this error message, and can you suggest a way to fix it? Thank you!

wig2bed bedops convert2bed • 524 views

Can you please cut and paste the output of:

$ head in.wig | cat -te

Replace in.wig with the filename of your wig file.
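For reference, a minimal sketch of what this check reveals, using a made-up two-line sample rather than your file. cat -te prints each tab as ^I and marks each line end with $, so a leading tab or space that is invisible in plain head output becomes obvious:

```shell
# Hypothetical sample: the second line starts with a tab, which plain
# `head` output would not make obvious.
sample=$(printf '1\t0\t10\t0.5\n\t1\t10\t20\t0.7\n')

# -t shows tabs as ^I, -e marks each line end with $.
shown=$(printf '%s\n' "$sample" | cat -te)
printf '%s\n' "$shown"
# prints:
# 1^I0^I10^I0.5$
# ^I1^I10^I20^I0.7$
```

A line that starts with ^I or a space in this output is the kind of row the error message complains about.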


The output looks as follows:

1^I0^I3000060^I0.000000$
1^I3000060^I3000090^I0.224911$
1^I3000090^I3000120^I0.449820$
1^I3000120^I3000150^I0.674733$
1^I3000150^I3000180^I0.899643$
1^I3000180^I3000360^I1.124553$
1^I3000360^I3000390^I0.899643$
1^I3000390^I3000420^I0.674733$
1^I3000420^I3000450^I0.449820$
1^I3000450^I3000480^I0.224911$

Also, thank you for your second suggestion about --multisplit. Is there a way to tell if a wig file contains multiple sections? (This one was generated by averaging three bigwig files using WiggleTools -- I'm not sure what sections it would be split into.) Thanks!


If a wig file contains multiple sections, it will have multiple track and/or other header lines. You can get a rough count of the sections with the following command or similar:

$ grep -E "^track|^fixedStep|^variableStep" in.wig | wc -l

If you get a value greater than one, you may have multiple sections, and you could investigate the file with a text editor like emacs or vi, etc. to confirm.

To get back to your original question, the above does not look like a wig ("wiggle") file, but a bedGraph file.

It appears to be missing the track line and the declaration lines (fixedStep or variableStep) that specify whether a wig section uses fixed or variable steps. Compare the two format descriptions:

• https://genome.ucsc.edu/goldenPath/help/wiggle.html
• https://genome.ucsc.edu/goldenPath/help/bedgraph.html
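As a rough way to tell the two apart from the command line, here is a heuristic sketch. The function name guess_format is made up for illustration, and the rule is an assumption based on the format pages above: skip track, browser and comment lines, then call the file wig if the first remaining line is a fixedStep or variableStep declaration, or bedGraph if it has four whitespace-separated fields:

```shell
# Heuristic only: looks at the first non-header line of the input.
guess_format() {
  awk '
    /^(track|browser|#)/ { next }                               # skip optional header and comment lines
    /^(fixedStep|variableStep)/ { print "wig"; found = 1; exit }
    NF == 4 { print "bedGraph"; found = 1; exit }               # chrom, start, end, value
    { print "unknown"; found = 1; exit }
    END { if (!found) print "unknown" }                         # empty or header-only input
  '
}

# First row from the cat -te output in this thread:
printf '1\t0\t3000060\t0.000000\n' | guess_format
# prints: bedGraph
```

On a real file you would run something like head -n 100 in.wig | guess_format. It is not a validator; it only looks at one line.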

To convert bedGraph to sorted, five-column BED format, you can just add a placeholder in the fourth column:

$ awk -v FS="\t" -v OFS="\t" '{ print $1, $2, $3, ".", $4 }' in.bedGraph | sort-bed - > out.bed
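To make the effect concrete, here is the same awk step applied to two of the rows posted above, deliberately given out of order. The sort call is only a stand-in so that the sketch runs without BEDOPS installed; in practice you would pipe to sort-bed as in the command above:

```shell
# Two rows from the cat -te output in this thread, in reverse order.
bedgraph=$(printf '1\t3000060\t3000090\t0.224911\n1\t0\t3000060\t0.000000\n')

# Insert "." as a placeholder ID in column 4; the score moves to column 5.
# `sort` here is an assumption-level substitute for sort-bed.
bed=$(printf '%s\n' "$bedgraph" \
  | awk -v FS="\t" -v OFS="\t" '{ print $1, $2, $3, ".", $4 }' \
  | LC_ALL=C sort -k1,1 -k2,2n -k3,3n)

printf '%s\n' "$bed"
# prints (tab-separated):
# 1	0	3000060	.	0.000000
# 1	3000060	3000090	.	0.224911
```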

Tangentially, and this may or may not be related to your problem, there is a bug in the UCSC toolkit where a bedGraph file is written into a bigWig file as-is, rather than first being converted to wig format:

• http://genome.ucsc.edu/goldenPath/help/bigWig.html#optional

I don't know if you are perhaps starting with a bigWig file. If so, and if it was built from a bedGraph file, then bigWigToWig will not produce a wig file as output, but a bedGraph file.

UCSC choosing to ignore its own specifications is a bit frustrating. This might not be the issue you're running into; I'm only mentioning it here in case this might be the real cause.


Thank you very much for the advice! I'm not quite sure what did it in the end, but I tried the following on my WiggleTools-generated .wig file:

wig2bed --zero-indexed --multisplit section --do-not-sort < file.wig > file.bed 

and the command ran successfully, at least until it reached chromosome names that were not recognised, at which point I started getting

Error: Invalid WIG line < number >.

messages. I eventually changed all instances of "MT" to "chrM", "X" to "chrX", "Y" to "chrY" and all the "weird" chromosomes such as "GL456213.1" to "chrGL456213" using sed. This did the trick and wig2bed now seems to have worked on the entire file.
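A rough sketch of this kind of renaming, using awk rather than the sed commands mentioned above (which are not shown in the thread). It assumes the chromosome name sits in the first tab-separated column, as in the bedGraph-style rows posted earlier. The function name rename_chroms and the exact rules (MT becomes chrM, a trailing version suffix such as .1 is dropped, every other name gets a chr prefix) are illustrative assumptions, so check them against your own names first:

```shell
# Reads bedGraph-style rows on stdin, writes renamed rows to stdout.
rename_chroms() {
  awk -v FS="\t" -v OFS="\t" '
    /^(track|browser|#)/ { print; next }       # pass header and comment lines through unchanged
    {
      name = $1
      if (name == "MT") name = "M"             # mitochondrial: MT -> M (prefixed to chrM below)
      sub(/\.[0-9]+$/, "", name)               # GL456213.1 -> GL456213
      if (name !~ /^chr/) name = "chr" name    # add the prefix only when it is missing
      $1 = name
      print
    }
  '
}

printf 'MT\t0\t10\t1.0\nGL456213.1\t5\t9\t2.0\nX\t1\t2\t3.0\n' | rename_chroms
# prints (tab-separated):
# chrM	0	10	1.0
# chrGL456213	5	9	2.0
# chrX	1	2	3.0
```

Something like rename_chroms < file.wig > file.renamed.wig would then feed into wig2bed.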


Thanks for reporting back on the fix.

This tool is fairly old and was written to UCSC's specification for chromosome names, which historically are prefixed with chr.

I'll add an issue ticket to the GitHub site referencing this. I can't imagine this being too difficult to generalize for any chromosome naming scheme.


Thank you for all your help!


Fixed in the v2p4p40 branch: https://github.com/bedops/bedops/commit/d9776fdd215e8264b689081c4c7c98482f02b3e2

Should be pushed to production in a week or so.


Also, if your wig file contains multiple sections, you may want to add the --multisplit foo option, where foo is a basename of your choosing.

See: https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/wig2bed.html
