SearchIO/HmmerIO assert error python
7.0 years ago
CrLs ▴ 10

Hi everyone,

I'm using Biopython's SearchIO module to parse my HMMER3 domtab files.

I run hmmscan and parse its domtblout files. The problem is that when I parse the results, I get an assertion error: assert len(cols) == 23

I traced it to these lines in hmmer3_domtab.py, and I can see how the assertion could fail:

def _parse_row(self):
    """Returns a dictionary of parsed row values."""
    assert self.line
    cols = [x for x in self.line.strip().split(' ') if x]
    # if len(cols) > 23, we have extra description columns
    # combine them all into one string in the 19th column
    if len(cols) > 23:
        cols[22] = ' '.join(cols[22:])
    elif len(cols) < 23:
        cols.append('')
        assert len(cols) == 23
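
To make the failure mode concrete, here is a standalone sketch of the column check quoted above. `split_domtab_row` and `EXPECTED_COLS` are hypothetical names, not part of Biopython; the function only mirrors the splitting and padding steps. The sketch assumes that the lone `#` line at the end of the file is handed to the parser as if it were a hit row, which is a guess and not something confirmed here.

```python
EXPECTED_COLS = 23  # number of columns the quoted parser expects


def split_domtab_row(line):
    """Mirror the column handling of the quoted _parse_row (hypothetical helper)."""
    cols = [x for x in line.strip().split(' ') if x]
    if len(cols) > EXPECTED_COLS:
        # extra description words are merged into the last column
        cols[22] = ' '.join(cols[22:])
    elif len(cols) < EXPECTED_COLS:
        # a short row is padded once, then must have exactly 23 columns
        cols.append('')
        assert len(cols) == EXPECTED_COLS, 'row has %d columns' % len(cols)
    return cols


# Second hit row from the file shown in this post
hit_row = ("profil1 - 834 hitname|-|1357766..1360262 - 831 1.5e-147 478.8 16.2 "
           "1 1 1.6e-147 1.6e-147 478.6 16.2 4 833 2 830 1 831 0.96 -")
print(len(split_domtab_row(hit_row)))

# Assumption: the trailing '#' line reaches the parser as a row
try:
    split_domtab_row('#')
except AssertionError as err:
    print('footer line rejected:', err)
```

If that assumption holds, the hit rows themselves are fine and only the line after the last hit trips the assertion, which would match an error that appears at the end of every file.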

What I don't understand is why I get this error on the last hit of every file, or on the only hit when a file contains a single hit, even though every hit has the same number of columns.

--- full sequence --- -------------- this domain ------------- hmm coord ali coord env coord
target name accession tlen query name accession qlen E-value score bias # of c-Evalue i-Evalue score bias from to from to from to acc description of target
------------------- ---------- ----- -------------------- ---------- ----- --------- ------ ----- --- --- --------- --------- ------ ----- ----- ----- ----- ----- ----- ----- ---- ---------------------
profil1 - 834 hitname|-|1356354..1357713 - 452 2.6e-08 18.2 0.0 1 1 3.7e-08 3.7e-08 17.7 0.0 439 530 198 289 159 295 0.84 -
profil1 - 834 hitname|-|1357766..1360262 - 831 1.5e-147 478.8 16.2 1 1 1.6e-147 1.6e-147 478.6 16.2 4 833 2 830 1 831 0.96 -
#

Program: hmmscan

Version: 3.1b2 (February 2015)

Pipeline mode: SCAN

[ok]

I get the error for the last hit, but not for the first one.

Here is part of the code I use:

from Bio import SearchIO

# handle: an open hmmscan domtblout file
try:
    for qresult in SearchIO.parse(handle, 'hmmscan3-domtab'):
        # sequence ID from the FASTA header, e.g. hitname|-|1356354..1357713
        query_info = qresult.id.split('|')
        query_ID = query_info[0]
        query_strand = query_info[1]
        query_pos = query_info[2].split('..')
        query_start = query_pos[0]
        query_end = query_pos[1]
        query_len = qresult.seq_len
        hits = qresult.hits
        align = qresult.fragments
        num_hits = len(hits)
        count = 0
        longueur_align_query = 0  # aligned length on the query
        longueur_align_hit = 0    # aligned length on the hit
except AssertionError:
    # the assertion propagates out of the for loop, so parsing stops here
    print('better luck next time')
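
The ID handling in the loop can also be pulled into a small helper. This is only a sketch: `parse_query_id` and `QueryLocus` are hypothetical names, and it assumes every ID has exactly three `|`-separated fields with a `start..end` range in the last one, as in the IDs shown in this post.

```python
from collections import namedtuple

# Hypothetical container for the pieces of an ID like hitname|-|1357766..1360262
QueryLocus = namedtuple('QueryLocus', 'name strand start end')


def parse_query_id(query_id):
    """Split 'name|strand|start..end' into its parts (raises ValueError otherwise)."""
    name, strand, span = query_id.split('|')
    start, end = span.split('..')
    return QueryLocus(name, strand, int(start), int(end))


locus = parse_query_id('hitname|-|1357766..1360262')
print(locus.name, locus.strand, locus.start, locus.end)
```

Unpacking into a fixed number of names makes a malformed ID fail loudly with a ValueError instead of silently producing shifted fields.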

I catch the exception so that the script keeps going, because otherwise it would just stop. The problem is that most of the results then never get parsed... Does anyone know how to fix this?
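
One workaround that might be worth trying is to drop the `#` lines before the file reaches the parser. The sketch below uses only the standard library; `strip_comment_lines` is a hypothetical helper, the sample rows are abbreviated with `...`, and it assumes the header and footer lines in the real file start with `#` (only the bare `#` line is visible above). Whether this avoids the assertion with a given Biopython version is untested.

```python
import io


def strip_comment_lines(handle):
    """Return a new text handle without the lines that start with '#'."""
    kept = [line for line in handle if not line.lstrip().startswith('#')]
    return io.StringIO(''.join(kept))


# Abbreviated stand-in for a domtblout file (assumed '#'-prefixed header/footer)
raw = io.StringIO(
    "# target name accession tlen ...\n"
    "profil1 - 834 hitname|-|1357766..1360262 - 831 ...\n"
    "#\n"
    "# Program: hmmscan\n"
    "# [ok]\n"
)
cleaned = strip_comment_lines(raw)
print(cleaned.getvalue())

# The cleaned handle could then be passed on in place of the original file,
# e.g. SearchIO.parse(cleaned, 'hmmscan3-domtab')
```

Filtering up front keeps the parsing loop unchanged and avoids wrapping it in a try/except that ends the loop at the first error.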

PS: Sorry for my broken English :'(
