Parsing a GTF file with BCBio-gff: AttributeError
11 weeks ago
Flavia

I'm trying to parse a gtf file using this code:

from BCBio import GFF

gtf_rec = []
in_file = 'cuffcmp.combined.gtf'
out_file = 'extract.gtf'

with open(in_file) as f:
    for line in f:
        if 'class_code "x"' or 'class_code "u"' or 'class_code "i"' in line:
            gtf_rec.append(line)

with open(out_file, "w") as out_handle:
    GFF.write(gtf_rec, out_handle)

in_file.close()
out_handle.close()


When I print(gtf_rec), the required information appears to be filtered correctly, but when I try to write the records to a new file I get this AttributeError:

  File "/excise.py", line 18, in <module>
    GFF.write(gtf, out_handle)
  File "/GFFOutput.py", line 202, in write
    return writer.write(recs, out_handle, include_fasta)
  File "/GFFOutput.py", line 80, in write
    self._write_rec(rec, out_handle)
  File "/GFFOutput.py", line 108, in _write_rec
    if len(rec.seq) > 0:
AttributeError: 'str' object has no attribute 'seq'


I'm new to bioinformatics, and I have spent too much time trying to solve this. The general explanations of this error haven't helped me fix it.

I would like to know whether any of you can find the cause of the error or suggest another way to do this parsing.

There is extensive material on working with file formats other than GTF.

Thank you very much!

11 weeks ago

Your issue is that you parse your input manually into a plain list, gtf_rec, whose items are strings. However, GFF.write expects SeqRecord objects as input. Instances of that class have the seq attribute that the writer accesses, whereas a str does not, hence the AttributeError.

Ideally, you should replace your custom input parsing with one that uses the classes and functions already provided by Biopython.
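If the goal is only to copy the matching lines unchanged, a plain-text filter avoids GFF.write altogether. The sketch below is one way to do that, not the BCBio approach. It also addresses a second problem in the posted code: the condition 'class_code "x"' or 'class_code "u"' or ... in line is always true, because a non-empty string literal is truthy, so every line is kept. The function names filter_gtf_lines and extract are hypothetical, chosen for illustration.

```python
import re

# Matches class_code "x", "u" or "i" in the GTF attribute column.
CLASS_CODE_RE = re.compile(r'class_code "([xui])"')


def filter_gtf_lines(lines, pattern=CLASS_CODE_RE):
    """Yield only the lines whose attributes carry a wanted class_code."""
    for line in lines:
        if pattern.search(line):
            yield line


def extract(in_file, out_file):
    """Copy the matching lines from in_file to out_file as plain text."""
    # Both handles are closed automatically when the with block ends,
    # so no explicit close() calls are needed.
    with open(in_file) as src, open(out_file, "w") as dst:
        dst.writelines(filter_gtf_lines(src))
```

With the file names from the question, this would be called as extract('cuffcmp.combined.gtf', 'extract.gtf'). The output stays in GTF because the lines are written back exactly as they were read.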

Oh, for sure. I already tried to find a SeqRecord-based way to limit by specific attributes, but I couldn't find one.

But you explained the cause of the error, and your answer will certainly help me get somewhere. Thank you.

I think GFF.parse(..., limit_info=...) should do the trick to restrict the output to specific attributes. See the section "Limiting to features of interest" in the tutorial.
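A minimal sketch of the limit_info usage, following the pattern shown in that tutorial section. As far as I can tell, limit_info filters on keys such as gff_id, gff_source and gff_type rather than on column-9 attributes, so a class_code check would probably still have to be done on the parsed features' qualifiers. The helper names and the "exon" feature type are assumptions for illustration, and the code needs the third-party bcbio-gff package.

```python
def make_limit_info(feature_types, sources=None):
    """Build a limit_info dict for GFF.parse (hypothetical helper)."""
    limit_info = {"gff_type": list(feature_types)}
    if sources:
        limit_info["gff_source"] = list(sources)
    return limit_info


def parse_limited(in_file, feature_types=("exon",)):
    """Parse in_file, keeping only the given feature types."""
    # Imported here so the helper above works without bcbio-gff installed.
    from BCBio import GFF

    with open(in_file) as handle:
        # GFF.parse yields SeqRecord objects, which GFF.write accepts.
        return list(GFF.parse(handle, limit_info=make_limit_info(feature_types)))
```

Note that records written back with GFF.write come out as GFF3, as far as I know, so this route may not preserve the original GTF layout.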

