Comparing PopoolationTE2 output with reference .bed file in Python
21 months ago
Emilia • 0

I ran PopoolationTE2 on my sample and obtained an output file listing TE insertion sites. It looks like this (column 2 is the chromosome, column 3 the position, column 5 the TE family):

1   1   4254339 .   hAT|9   hAT R   -   0,954
1   1   34804000    .   Stowaway|41 Stowaway    R   -   1,000
1   1   12839440    .   Tourist|15  Tourist F   -   1,000
1   1   11521962    .   Tourist|10  Tourist R   -   1,000
1   1   28197852    .   Tourist|11  Tourist F   -   1,000
1   1   7367886 .   Stowaway|36 Stowaway    R   -   1,000
1   1   13130538    .   Stowaway|36 Stowaway    R   -   1,000
1   1   6177708 .   hAT|4   hAT F   -   1,000
1   1   3783728 .   hAT|20  hAT F   -   1,000
1   1   10332288    .   uc|12   uc  R   -   1,000
1   1   15780052    .   uc|5    uc  R   -   1,000
1   1   28309928    .   uc|5    uc  R   -   1,000
1   1   31010266    .   uc|33   uc  R   -   0,967
1   1   4758653 .   uc|10   uc  F   -   1,000
1   1   3815830 .   uc|31   uc  R   -   0,879
1   1   5037968 .   Mutator|4   Mutator F   -   1,000

I want to compare it with a BED file of TE sites annotated in the reference genome. It looks like this:

1   12005   12348   RefBeet_TSD_Len:3_Tourist|7
1   56229   56700   RefBeet_TSD_Len:8_hAT|9
1   66241   66528   RefBeet_TSD_Len:9_Mutator|21
1   81966   82251   RefBeet_TSD_Len:2_Stowaway|39
1   84155   84402   RefBeet_TSD_Len:2_uc|1
1   84714   84841   RefBeet_Unknow_un_uc|28
1   98136   98349   RefBeet_TSD_Len:2_Stowaway|3
1   102325  102582  RefBeet_TSD_Len:2_Stowaway|12
1   103132  103267  RefBeet_Unknow_un_uc|33
1   108250  108580  RefBeet_TSD_Len:3_Tourist|17
1   115434  115695  RefBeet_Unknow_Len:8_uc|9

I want to check whether the TE insertions found in my sample also occur in the reference. For example, for the first TE (hAT|9 at position 4254339 on chromosome 1), I want to know whether that position falls within an interval in the BED file, where column 2 is the start and column 3 is the end.

I tried to do this with pandas, but I'm pretty confused.

Thanks for the suggestions!
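
For reference, here is a minimal pandas sketch of the comparison I have in mind. It rests on several assumptions: both files are whitespace-separated without a header, the column names (`sample`, `chrom`, `pos`, and so on) are labels I made up, the TE family in the BED file is whatever follows the last underscore in column 4, and an insertion counts as "in the reference" when `start <= pos <= end`. The helper names `load_tables` and `flag_reference_hits` are hypothetical, and the second and third insertion rows in the toy data are invented so that the example produces a hit.

```python
import io
import pandas as pd

# Toy excerpts in the layout shown above (whitespace-separated, no header).
# Only the first insertion row is from my real data; the other two are made up.
TE_TEXT = """\
1 1 4254339 . hAT|9 hAT R - 0,954
1 1 56300 . hAT|9 hAT F - 1,000
1 1 12100 . Stowaway|41 Stowaway R - 1,000
"""
BED_TEXT = """\
1 12005 12348 RefBeet_TSD_Len:3_Tourist|7
1 56229 56700 RefBeet_TSD_Len:8_hAT|9
"""

# Column labels are my own, not official names.
TE_COLS = ["sample", "chrom", "pos", "strand", "family",
           "order", "support", "comment", "freq"]
BED_COLS = ["chrom", "start", "end", "name"]


def load_tables(te_source, bed_source):
    """Read both tables; accepts file paths or file-like objects."""
    te = pd.read_csv(te_source, sep=r"\s+", header=None, names=TE_COLS,
                     decimal=",", dtype={"chrom": str})
    bed = pd.read_csv(bed_source, sep=r"\s+", header=None, names=BED_COLS,
                      dtype={"chrom": str})
    # Assumption: the family is the text after the last underscore,
    # e.g. "RefBeet_TSD_Len:8_hAT|9" -> "hAT|9".
    bed["family"] = bed["name"].str.rsplit("_", n=1).str[-1]
    return te, bed


def flag_reference_hits(te, bed, match_family=True):
    """Add a boolean column saying whether each insertion lies in a BED interval."""
    keys = ["chrom", "family"] if match_family else ["chrom"]
    te = te.reset_index(drop=True)
    te_indexed = te.reset_index().rename(columns={"index": "te_id"})
    # Pair every insertion with every candidate interval sharing the keys ...
    pairs = te_indexed.merge(bed[keys + ["start", "end"]], on=keys, how="inner")
    # ... then keep only the pairs where the position lies inside the interval.
    inside = pairs[(pairs["pos"] >= pairs["start"]) & (pairs["pos"] <= pairs["end"])]
    out = te.copy()
    out["in_reference"] = out.index.isin(inside["te_id"])
    return out


te, bed = load_tables(io.StringIO(TE_TEXT), io.StringIO(BED_TEXT))
result = flag_reference_hits(te, bed)
print(result[["chrom", "pos", "family", "in_reference"]])
```

With real files, the two `io.StringIO(...)` arguments would be replaced by the file paths. Passing `match_family=False` ignores the family and tests position only. Since BED coordinates are conventionally 0-based and half-open, the boundary comparison may need adjusting, and the merge builds every insertion/interval pair per chromosome, so for genome-scale files a dedicated interval tool such as `bedtools intersect` may scale better.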

bed python pandas PopoolationTE2 • 332 views