liftOver error: not reading the coordinate position correctly?
7.7 years ago
Neal ▴ 60

Hi all,

I am getting an error while running the liftOver tool locally on Mac OS X.

The command I run is:

~/Downloads/liftOver ./ICBP-summary-Nature.bed ~/Downloads/hg18ToHg19.over.chain.gz ./converted_icbp_coordinates.bed ./unlifted_icbp_coordinates.bed


Reading liftover chains
Mapping coordinates
Expecting integer field 2 line 2146483 of ./ICBP-summary-Nature.bed, got 6.5e+07


So I checked this line with sed, and the data look fine:

sed -n '2146843p' ICBP-summary-Nature.bed


and I get the following output for the line

chr16    65819026    65819027    rs3852699


As a consequence, liftOver stops at this line, so the remaining SNPs (~300,000) on the subsequent lines are not converted.

I thought I should post this query on the UCSC Genome Support Forum here, but the registration seems to have closed?

Thank you for reading my query. I would be grateful for any tips or suggestions on how to resolve this.

Update:

At komal's request, I am posting a few lines from before and after this line. The lines up to and including 2146843 are:

chr16    65787923    65787924    rs3730406
chr16    65790185    65790186    rs11700
chr16    65791635    65791636    rs8058861
chr16    65791780    65791781    rs8059662
chr16    65794844    65794845    rs13336793
chr16    65798783    65798784    rs12051247
chr16    65798928    65798929    rs12051249
chr16    65802113    65802114    rs16957240
chr16    65811920    65811921    rs7193713
chr16    65817068    65817069    rs7196793
chr16    65817164    65817165    rs7196989
chr16    65819026    65819027    rs3852699


And the following are some of the lines afterwards:

chr16    65819026    65819027    rs3852699
chr16    65819348    65819349    rs6499116
chr16    65822861    65822862    rs6499118
chr16    65829359    65829360    rs3852700
chr16    65829878    65829879    rs16957265
chr16    65835167    65835168    rs6499119
chr16    65842341    65842342    rs6499121
chr16    65847342    65847343    rs9940665
chr16    65847656    65847657    rs9931407
chr16    65850656    65850657    rs8064216
chr16    65855664    65855665    rs8053031

SNP hg19 liftOver hg18

Did you check if there is a space(s) instead of a tab between the first and the second field on this line?


Hi komal, and thanks for the comment. The original file was a CSV file that I converted to a tab-delimited file with Perl, so I am pretty sure there is a tab between the first and second fields on this line.


Can you show some lines that appear before and after this line?


Hi komal, I have just updated the post with some of the lines.


I got no error when passing these lines to liftOver (command-line version). Did you try the web version to see whether you get the same error? That might not be a good idea, though, if your BED file is very large.


Hmm, you are correct. I tried it at home and these lines were converted perfectly. The file is the usual size for human SNPs (~2.5 million SNPs), so the web version may not work. Maybe I could split the input file into chunks of ~1.2 million lines and then join the two output files (assuming it does not break again).

7.7 years ago
Neal ▴ 60

OK, so I finally found out what was going wrong.

I misread line 2146483 as 2146843, so I was inspecting the wrong line and could not find the error.
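
To avoid retyping a long line number by hand, it can help to pass it to a small helper that prints the line together with its neighbours, each prefixed with its line number. This is only a sketch; `show_line` is a hypothetical helper name, not part of liftOver.

```shell
# show_line FILE N [CTX]
# Print line N of FILE plus CTX lines on either side (default 2),
# each prefixed with its line number. Hypothetical helper, not a liftOver tool.
show_line() {
    awk -v n="$2" -v ctx="${3:-2}" '
        NR >= n - ctx && NR <= n + ctx { print NR ": " $0 }
        NR > n + ctx { exit }          # stop reading once past the window
    ' "$1"
}
```

For the error above, `show_line ./ICBP-summary-Nature.bed 2146483` would show the reported line with its line number next to it, which makes a transposed digit easier to spot.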

I split the file in two using the split command:

split -l 1230663 ./Blood_pressure/ICBP-summary-Nature.bed ./Blood_pressure/split_blood_pressure

That is when I realized my oversight: the second file gave the same error, at a different line number of course, and this time I fortunately read the number correctly.

Here is what the erroneous line in the file looked like:

chr16    6.5e+07    65000001    rs2289150
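
Lines like this can be located up front rather than one liftOver failure at a time. The sketch below lists every record whose start or end field is not a plain integer. It assumes whitespace-separated columns with no track or browser header lines (those would be reported too), and `find_bad_coords` is a hypothetical helper name.

```shell
# find_bad_coords FILE
# Print "LINE_NUMBER: LINE" for every record whose start (field 2) or
# end (field 3) is not a plain non-negative integer, e.g. 6.5e+07.
# Assumes whitespace-separated columns; hypothetical helper name.
find_bad_coords() {
    awk '$2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { print NR ": " $0 }' "$1"
}
```

Running `find_bad_coords ./ICBP-summary-Nature.bed` before liftOver should report all such lines in one pass.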

I changed it using sed as follows:

sed -i '' 's/6.5e+07/65000000/' split_blood_pressureab
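
If more than one coordinate has been written in scientific notation, a more general variant of the same fix is to let awk rewrite any such value in fields 2 and 3 as an integer. This is a sketch under assumptions: `fix_sci_coords` is a hypothetical helper name, rewritten lines are output tab-separated, and the conversion can only be trusted if the original value was not rounded when it was written (here the end coordinate 65000001 confirms that the start was 65000000).

```shell
# fix_sci_coords FILE
# Rewrite start/end fields (2 and 3) that are in scientific notation,
# e.g. 6.5e+07 -> 65000000, and print the result to stdout.
# Lines without such values are printed unchanged. Hypothetical helper name.
fix_sci_coords() {
    awk 'BEGIN { OFS = "\t" }
         {
             for (i = 2; i <= 3; i++)
                 if ($i ~ /^[0-9.]+[eE][+]?[0-9]+$/)
                     $i = sprintf("%d", $i)   # numeric value printed as an integer
             print
         }' "$1"
}
```

Usage would be along the lines of `fix_sci_coords input.bed > fixed.bed`, where both file names are placeholders. It is worth checking afterwards that each rewritten start still equals end minus 1 for single-base SNP records.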

Then I joined the two files with cat and proceeded as usual with liftOver.

My sincere thanks to komal for taking the time to help me out.

6.7 years ago
morovatunc ▴ 540

Say I completed the process successfully. Is there a way to verify that the liftover is correct?
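
One rough sanity check, offered as a sketch rather than a full validation: every input record should end up either in the converted file or in the unlifted file. In the unlifted file, liftOver writes a `#` comment line before each record, so those lines are skipped when counting. `check_liftover_counts` is a hypothetical helper name, and the check assumes one output record per input record (no header or blank lines in the input).

```shell
# check_liftover_counts INPUT CONVERTED UNLIFTED
# Succeeds (exit status 0) when the number of input records equals the
# converted records plus the unlifted records (ignoring '#' comment lines).
# Hypothetical helper; it checks bookkeeping only, not coordinate correctness.
check_liftover_counts() {
    in_n=$(awk 'END { print NR }' "$1")
    out_n=$(awk 'END { print NR }' "$2")
    un_n=$(awk '!/^#/ { n++ } END { print n + 0 }' "$3")
    echo "input=$in_n converted=$out_n unlifted=$un_n"
    [ "$in_n" -eq $((out_n + un_n)) ]
}
```

This only shows that no record was silently dropped. To check the coordinates themselves, spot-checking a handful of rsIDs against their hg19 positions in dbSNP is a reasonable complement.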