Question: liftOver error: not reading the coordinate position correctly?
4.9 years ago by
Neal40
Norway
Neal40 wrote:

Hi all,

I am getting an error while running the liftOver tool (http://genome.sph.umich.edu/wiki/LiftOver) locally on Mac OS X.

The code I execute is as follows:

~/Downloads/liftOver ./ICBP-summary-Nature.bed ~/Downloads/hg18ToHg19.over.chain.gz ./converted_icbp_coordinates.bed ./unlifted_icbp_coordinates.bed

The error reads as follows:

Reading liftover chains

Mapping coordinates

Expecting integer field 2 line 2146483 of ./ICBP-summary-Nature.bed, got 6.5e+07

 

So I checked this particular line with sed, and the data seems to be fine:

 sed -n '2146843p' ICBP-summary-Nature.bed 

and I get the following output:

chr16    65819026    65819027    rs3852699

Because of this error, the program stops at this line, so the remaining SNPs (~300,000) on the subsequent lines are not converted.

I thought I should post this query on the UCSC Genome Support Forum here  http://redmine.soe.ucsc.edu/forum/index.php, but the registration seems to have closed?

Thank you for going through my query. I would be grateful for any tips or suggestions on how to resolve this.

Update: 

At komal's request, I am posting a few lines from before and after this line. These are some of the lines up to and including 2146843:

chr16    65787923    65787924    rs3730406

chr16    65790185    65790186    rs11700

chr16    65791635    65791636    rs8058861

chr16    65791780    65791781    rs8059662

chr16    65794844    65794845    rs13336793

chr16    65798783    65798784    rs12051247

chr16    65798928    65798929    rs12051249

chr16    65802113    65802114    rs16957240

chr16    65811920    65811921    rs7193713

chr16    65817068    65817069    rs7196793

chr16    65817164    65817165    rs7196989

chr16    65819026    65819027    rs3852699

And the following are some of the lines afterwards:

chr16    65819026    65819027    rs3852699

chr16    65819348    65819349    rs6499116

chr16    65822861    65822862    rs6499118

chr16    65829359    65829360    rs3852700

chr16    65829878    65829879    rs16957265

chr16    65835167    65835168    rs6499119

chr16    65842341    65842342    rs6499121

chr16    65847342    65847343    rs9940665

chr16    65847656    65847657    rs9931407

chr16    65850656    65850657    rs8064216

chr16    65855664    65855665    rs8053031
Tags: hg19, liftover, snp, hg18
written 4.9 years ago by Neal • modified 3.9 years ago by morovatunc

Did you check if there is a space(s) instead of a tab between the first and the second field on this line?

modified 4.9 years ago • written 4.9 years ago by komal.rathi
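The space-versus-tab question can be checked mechanically. Here is a minimal sketch, assuming a four-column, tab-delimited BED file like the one in this thread. The sample file and the variable names are made up for illustration; point `bed` at your own file instead.

```shell
# Hypothetical two-line sample; the second line has a space where a tab belongs.
bed=$(mktemp)
printf 'chr16\t65817164\t65817165\trs7196989\n' >  "$bed"
printf 'chr16 65819026\t65819027\trs3852699\n'  >> "$bed"

# Report every line that does not have exactly four tab-separated fields.
bad_fields=$(awk -F'\t' 'NF != 4 { print NR ": " $0 }' "$bed")
echo "$bad_fields"
```

An empty result means every line has four tab-separated fields. It says nothing about whether the coordinates themselves are valid integers.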

Hi komal, and thanks for the comment. The original file was a CSV file that I converted to a tab-delimited file with Perl, so I am pretty sure there is a tab between the first and second fields on this line.

written 4.9 years ago by Neal

Can you show some lines that appear before and after this line?

written 4.9 years ago by komal.rathi

Hi komal, I have just updated the post with some of the lines.

written 4.9 years ago by Neal

I got no error when passing this BED file to liftOver (command-line version). Did you try the web version to see whether you get the same error? That might not be a good idea, though, if your BED file is very large.

modified 4.9 years ago • written 4.9 years ago by komal.rathi

Hmm, you are correct. I tried it at home and these lines were converted perfectly. The file is the usual size for human SNPs (~2.5 million SNPs), so the web version may not work. Maybe I could split the input file into chunks of ~1.2 million lines and then join the two output files (assuming it does not break again).

written 4.9 years ago by Neal
4.9 years ago by
Neal40
Norway
Neal40 wrote:

OK, so I finally found out what was going wrong.

I misread line 2146483 as 2146843. Hence I could not find the error.

I split the file into two using the Linux split command:

split -l 1230663 ./Blood_pressure/ICBP-summary-Nature.bed ./Blood_pressure/split_blood_pressure

That is when I realized my oversight: the second file gave the same error, at a different line number of course, which this time I fortunately read correctly!

Here is what the erroneous line in the file looked like:

chr16    6.5e+07    65000001    rs2289150

I changed it using sed as follows:

sed -i '' 's/6\.5e+07/65000000/' split_blood_pressureab

Then I joined the two files with cat and proceeded as usual with liftOver.
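For anyone hitting the same error, here is a rough sketch of how one might scan a whole file for this problem instead of reading line numbers by eye. It assumes a four-column, tab-delimited BED file. The sample file and the variable names are made up for illustration, and the second sample line reproduces the bad line above.

```shell
# Hypothetical two-line sample; substitute your own BED file for "$bed".
bed=$(mktemp)
printf 'chr16\t65817164\t65817165\trs7196989\n' >  "$bed"
printf 'chr16\t6.5e+07\t65000001\trs2289150\n'  >> "$bed"

# 1. List every line whose start or end is not a plain non-negative integer.
bad_lines=$(awk -F'\t' '$2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { print NR ": " $0 }' "$bed")
echo "$bad_lines"

# 2. Rewrite coordinates written in scientific notation as plain integers.
#    All other fields and lines pass through unchanged.
fixed=$(awk -F'\t' 'BEGIN { OFS = "\t" }
  { for (i = 2; i <= 3; i++)
      if ($i ~ /^[0-9]+(\.[0-9]+)?[eE][+]?[0-9]+$/) $i = sprintf("%d", $i)
    print }' "$bed")
echo "$fixed"
```

Step 2 only recovers the right value if the scientific notation did not round the coordinate. Here the end position 65000001 confirms that the start really is 65000000, but in general it is safer to go back to the original CSV and check.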

My sincere thanks to komal for taking the time to help me out.

written 4.9 years ago by Neal
3.9 years ago by
morovatunc400
Turkey
morovatunc400 wrote:

Say I ran the process successfully. Is there a way to verify that the liftover is correct?

written 3.9 years ago by morovatunc