I'd like to write a Python script that, given a set of RNA sequences, finds the ones with the strongest predicted folding. A quick and dirty approach would suffice in my case: fold them all and pick the ones whose delta G is below a set threshold.
I was looking at two "classical" folding packages (mfold and RNAfold from the Vienna package) that I can run on the command line from Python. However, neither of them has an option to output just the delta G. They produce a series of files for each RNA, and it would be complicated to dig through each one just to extract the delta G value. RNAfold has an output that fits my needs better, but it is still not ideal.
Does anyone have any suggestions on how to proceed?
EDIT: I'll add some examples, as some people suggested. I'm a bit of a Python novice, so my apologies if I am missing any obvious ways to solve my issue.
The best I have managed so far is with RNAfold. The output looks like this, repeated for each sequence:
>seq1
AUGGCUGUUCGCCAUUUAAAG
(((((.....)))))...... ( -5.20)
It's not too bad, and I could extract the delta G value (-5.20 here) by telling Python to take the number between the final pair of parentheses. However, I'm not sure that's the most elegant way, and I was wondering if anyone knows of packages that output a tab-delimited file, or of ways to make RNAfold (or mfold) do that.
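For reference, here is a minimal sketch of the extraction I have in mind. It assumes the RNAfold text output has already been captured as a string, and that the energy is always the last parenthesized number on a record. The function names (parse_rnafold, strongly_folded, to_tsv) are just ones I made up for illustration, not part of any package.

```python
import re

# Energy printed in parentheses at the end of a structure line, e.g. "( -5.20)".
# The structure's own brackets never match, since "(" must be followed by a number.
ENERGY_RE = re.compile(r"\(\s*(-?\d+(?:\.\d+)?)\s*\)\s*$")


def parse_rnafold(text):
    """Return {sequence_name: delta_G} parsed from RNAfold-style text output."""
    energies = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # Header line: keep the first word after ">" as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else None
        match = ENERGY_RE.search(line)
        if match and name is not None:
            energies[name] = float(match.group(1))
    return energies


def strongly_folded(energies, threshold):
    """Names with delta G strictly below threshold, most stable first."""
    hits = [name for name, dg in energies.items() if dg < threshold]
    return sorted(hits, key=lambda name: energies[name])


def to_tsv(energies):
    """Tab-delimited text, one 'name<TAB>delta_G' row per sequence."""
    return "\n".join(f"{name}\t{dg:.2f}" for name, dg in energies.items())
```

With this, strongly_folded(parse_rnafold(output), -3.0) would give the names of the sequences below the threshold, and to_tsv gives the tab-delimited table I was hoping to get directly from the tool.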
Another "problem" is that both mfold and RNAfold also produce an image file of the folding for each sequence (which I don't need), so I would then have to delete them from Python, and again I'm not sure this is ideal.
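The cleanup I am picturing would be something like the sketch below. The helper name is hypothetical, and the default pattern "*_ss.ps" is only my assumption about how the image files are named; it would need adjusting to whatever the tool actually writes.

```python
from pathlib import Path


def remove_structure_plots(directory, pattern="*_ss.ps"):
    """Delete files matching pattern in directory; return the removed file names.

    The default pattern is an assumption about the image file naming.
    """
    removed = []
    for path in Path(directory).glob(pattern):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Running the folding inside a temporary directory and discarding the whole directory afterwards would be another way to do it, but either way it feels like a workaround.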