9.0 years ago
themantalope
    Hi All,
I have a .dat file that appears to follow the Swiss-Prot sequence format, and I'm trying to read it with Biopython's SeqIO module. However, when I try to extract records from the file, I get the following error:
>>> reqs = list(SeqIO.parse("5UTRaspic.Hum.dat", "swiss"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 600, in parse
    for r in i:
  File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SeqIO/SwissIO.py", line 85, in SwissIterator
    for swiss_record in swiss_records:
  File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SwissProt/__init__.py", line 121, in parse
    record = _read(handle)
  File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SwissProt/__init__.py", line 165, in _read
    _read_id(record, line)
  File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SwissProt/__init__.py", line 278, in _read_id
    raise ValueError("ID line has unrecognised format:\n" + line)
ValueError: ID line has unrecognised format:
ID   5HSAA000001; SV 1; linear; mRNA; STD; HUM; 62 BP.
The .dat file I'm using can be found here (human 3'UTR database). From what I can tell, it is formatted properly. Is there any modification I can make to the file so that it adheres to the standard expected by Biopython?
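To help narrow down what the parser is objecting to, here is a minimal sketch that splits the rejected ID line into its fields using only the standard library. The regular expression and the `describe_id_line` helper are illustrative assumptions written for this post, not part of Biopython, and the field names are my own labels for the seven semicolon-separated values.

```python
import re

# Seven semicolon-separated fields, as in the ID line from the traceback.
# This pattern is an illustrative assumption, not Biopython's own parser.
EMBL_STYLE_ID = re.compile(
    r"^ID\s+(?P<accession>\S+); SV (?P<version>\d+); (?P<topology>\w+); "
    r"(?P<molecule>[^;]+); (?P<data_class>\w+); (?P<division>\w+); "
    r"(?P<length>\d+) BP\.$"
)


def describe_id_line(line):
    """Return the fields of an EMBL-style ID line, or None if it does not match."""
    match = EMBL_STYLE_ID.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["length"] = int(fields["length"])  # sequence length in base pairs
    return fields


# The ID line reported in the ValueError above.
id_line = "ID   5HSAA000001; SV 1; linear; mRNA; STD; HUM; 62 BP."
fields = describe_id_line(id_line)
print(fields)
```

This semicolon-separated layout, ending in a length in BP, is the one used by EMBL nucleotide flat files, whereas a Swiss-Prot ID line carries an entry name and a length in AA. If the rest of the file follows the same convention, it may be worth passing "embl" instead of "swiss" to SeqIO.parse before modifying the file; I have not verified that against this particular file.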