Iterating over LiftOver in a big dataset
2.3 years ago
ManuelDB ▴ 80

Has anyone worked with LiftOver on a big dataset?

I have a big dataset of gene coordinates, and I am converting these coordinates from hg19 to hg38. I am using the Python package LiftOver (pyliftover).

The problem is that it does not always return the same output structure, as the documentation says:

Returns a list of possible conversions for a given chromosome position. The list may be empty (no conversion), have a single element (unique conversion), or several elements (position mapped to several chains). The list contains tuples (target_chromosome, target_position, target_strand, conversion_chain_score), where conversion_chain_score is the "alignment score" field specified at the chain used to perform conversion. If there are several possible conversions, they are sorted by decreasing conversion_chain_score.
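To make the three cases in that description concrete, here is a minimal sketch that labels a result as unmapped, unique, or multiple and tallies the labels. The helper name `classify_hits` and the example tuples (including the scores) are invented for illustration. As far as I know, pyliftover may also return `None` rather than an empty list when the chromosome is not in the chain file, so the check treats both the same way.

```python
from collections import Counter

def classify_hits(hits):
    """Label one convert_coordinate-style result (hypothetical helper)."""
    if not hits:              # covers both [] and None
        return "unmapped"
    if len(hits) == 1:
        return "unique"
    return "multiple"

# Invented results in the documented tuple shape:
# (target_chromosome, target_position, target_strand, conversion_chain_score)
example_results = [
    [("chr19", 58353499, "+", 1000)],                      # unique conversion
    [],                                                    # no conversion
    [("chr1", 100, "+", 900), ("chr1", 5000, "-", 300)],   # several chains
    None,                                                  # assumed: unknown chromosome
]

summary = Counter(classify_hits(hits) for hits in example_results)
print(summary)
```

Running the same tally over the real results would show how many rows are unmapped and how many map to more than one position.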

So, when I tried to iterate over my list of 19677 genes with this:

for index, row in loeuf.iterrows():
    tmp += lo.convert_coordinate(row['chromosome'], row['start_position'])

I lost a few gene coordinates:

len(tmp)
19672

I don't really mind dropping these few genes, but the problem is that I also lost the order. When I add the new coordinates to the original dataset and check whether the coordinates of these genes in the new reference genome are correct, the first genes are fine, but the last ones are completely different.

Example:

         chromosome_19  start_position_19  chromosome_38  start_position_38
0        chr19          58345178           chr19          58353499
1        chr10          50799409           chr10          50885675
2        chr12          9067664            chr12          9116229
3        chr12          8822472            chr12          8887001
4        chr1           33306766           chr1           33321098
...      ...            ...                ...            ...
19666    chr1           52726454           chr7           143391111
19667    chr7           143381080          chr17          4143020
19668    chr17          4004445            chr1           77683419
19669    chr1           77562416           chr19          14075062

Any idea how I can do this? I have been trying to work around it without luck.
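One likely cause of the shift: `tmp += ...` extends the list by however many tuples come back, so an empty result adds nothing and a multi-chain result adds several, and every later row ends up misaligned. Below is a minimal sketch of a loop that appends exactly one entry per input row instead. It assumes that keeping only the top-scoring hit is acceptable; `convert_rows` and `fake_convert` are hypothetical names, and `fake_convert` is a stand-in for `lo.convert_coordinate` with invented extra hits so the example runs without a chain file.

```python
def convert_rows(rows, convert):
    """Return one (chromosome, position) pair per input row, in input order.

    rows: iterable of (chromosome, position) pairs.
    convert: callable with the same shape as lo.convert_coordinate.
    Rows with no conversion get (None, None) so the order is preserved.
    """
    converted = []
    for chromosome, position in rows:
        hits = convert(chromosome, position)
        if hits:
            # Results are documented as sorted by decreasing score,
            # so the first tuple is the best-scoring conversion.
            converted.append((hits[0][0], hits[0][1]))
        else:
            # Empty list (or None): keep a placeholder instead of skipping.
            converted.append((None, None))
    return converted

def fake_convert(chromosome, position):
    # Stand-in for lo.convert_coordinate; the second chr12 hit is invented.
    table = {
        ("chr19", 58345178): [("chr19", 58353499, "+", 1000)],
        ("chr12", 9067664): [("chr12", 9116229, "+", 900),
                             ("chr12", 1, "+", 100)],
    }
    return table.get((chromosome, position), [])

rows = [("chr19", 58345178), ("chrUn", 5), ("chr12", 9067664)]
result = convert_rows(rows, fake_convert)
print(result)
```

With the real data, `rows` could be `zip(loeuf['chromosome'], loeuf['start_position'])` and `convert` could be `lo.convert_coordinate`. Because the result has the same length and order as the DataFrame, its two halves can be assigned as new columns, and the unmapped rows can be dropped afterwards with `dropna`.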

pyliftover LiftOver • 529 views

Can you please stop asking questions without commenting on and/or validating people's answers, ManuelDB?


What do you mean? I have validated all previous answers, and I have commented on some of them.

EDIT: I see what you mean. I have been upvoting answers but not validating them. Sorry.
