I am working on a project using command-line BLAT. I need to take the output of a BLAT run, in any of the supported formats, and convert it into a format that can be fed back into another BLAT run. Eventually, my goal is to iterate my BLAT runs. For reference, BLAT can output psl, pslx, maf, sim4, axt, blast-tab, and blast-text formats, but it accepts only fasta, nib, and 2bit as input. I found a Biopython module called BlatIO (BlatIO on github.com) that supports parsing .psl and .pslx files, and I attempted to convert my .psl output into fasta format using my own code:
    import sys
    sys.path.insert(1, 'C:\\Python27\Lib\site-packages\Bio\BlatIO.py')
    from Bio.AlignIO import BlatIO
    from Bio import SearchIO
    from Bio.SearchIO._model import QueryResult, Hit, HSP, HSPFragment

    alignments = SearchIO.parse(input_file, 'blat-psl', pslx=True)
    line1= QueryResult.id
    line2= HSPFragment.query
    print ('>', line1)
    print (line2)
The output is not an ID and a sequence as I would expect, though. Instead I get this:
    ('>', <property object at 0x029BC9F0>)
    <property object at 0x029BC3C0>
I am open to all suggestions about how to get ANY of the BLAT output formats into ANY of the BLAT input formats, either by fixing the code I have started above or by some other method.
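To make the target concrete, here is a minimal sketch of the kind of conversion being asked for. It assumes the BLAT run was made with `-out=pslx`, so that each alignment row carries the aligned block sequences in its last two columns, and it uses plain Python string handling rather than Biopython's SearchIO. The function names `pslx_to_fasta` and `write_fasta` and the `name_start_end` header layout are made up for illustration. Only the aligned blocks are joined, so any unaligned gaps between blocks are dropped, and strand is not handled.

```python
def pslx_to_fasta(lines, use_target=False):
    """Yield (header, sequence) pairs from tab-separated PSLX rows.

    Hypothetical helper: assumes the standard 21 PSL columns followed by
    the two PSLX sequence columns (query blocks, then target blocks).
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional psLayout header and any row that is not PSLX data.
        if len(fields) < 23 or not fields[0].isdigit():
            continue
        if use_target:
            name, start, end, blocks = fields[13], fields[15], fields[16], fields[22]
        else:
            name, start, end, blocks = fields[9], fields[11], fields[12], fields[21]
        # Block sequences are comma-separated with a trailing comma.
        seq = "".join(block for block in blocks.split(",") if block)
        yield ">%s_%s_%s" % (name, start, end), seq


def write_fasta(records, handle, width=60):
    """Write (header, sequence) pairs to an open text handle as FASTA."""
    for header, seq in records:
        handle.write(header + "\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")
```

With placeholder file names, usage would look something like opening `hits.pslx` for reading and `hits.fa` for writing, then calling `write_fasta(pslx_to_fasta(infile), outfile)`; the resulting fasta file could then be passed to the next BLAT run.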
(PS - I have already done this project in BLAST, so please don't tell me to just use BLAST. I know that BLAST has different and in some ways better output formatting options, but I really need to use BLAT, not BLAST. PPS - I am aware of tools like those at usaglaxay.com that convert files; however, I really need code or a package to do this, preferably in Python or Perl, and not a web browser tool!)