Fasta Output Python Parsing
12.0 years ago
Richard ▴ 590

Hi all, I was about to write this myself, but I thought I would check with the experts first.

Does anyone know of a Python parser for FASTA output? That is, the FASTA35.exe aligner offers a number of output formats (selected via -m). Is there existing code that reads any of these formats, so that I don't have to write my own?

12.0 years ago
Neilfws 49k

The Biopython Bio.AlignIO module will read FASTA output generated using the -m 10 option (the format name is "fasta-m10").
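A minimal sketch of that approach, assuming Biopython is installed and the aligner was run with -m 10. The function name iter_m10_alignments and the file name in the usage note are hypothetical. Each alignment returned for this format should hold two records, the query and the match.

```python
def iter_m10_alignments(path):
    """Yield (query_id, match_id, alignment_length) for each pairwise alignment."""
    # Imported here so the module still loads when Biopython is missing.
    from Bio import AlignIO

    with open(path) as handle:
        # "fasta-m10" is Bio.AlignIO's format name for FASTA -m 10 output.
        for alignment in AlignIO.parse(handle, "fasta-m10"):
            query, match = alignment[0], alignment[1]
            yield query.id, match.id, alignment.get_alignment_length()
```

Usage would look something like: for q, m, n in iter_m10_alignments("output.m10"): print(q, m, n)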


Great. Do you know of any online examples of how to pull out the alignment scores?
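One way to get at the scores without any library is to read the "; key: value" lines that -m 10 writes under each ">>" hit header. The sketch below assumes that layout. The function name iter_m10_hits, the sample text, and the field names in it (fa_opt, sw_score) are illustrative, so check them against your own output.

```python
import io

# Illustrative sample only; real -m 10 output has many more fields.
SAMPLE = """>>>query1, 50 aa vs lib
; pg_name: fasta35
>>hitA first hit
; fa_opt: 88
; sw_score: 88
>query1 ..
; sq_len: 50
ACDE
>hitA ..
; sq_len: 60
ACDF
>>hitB second hit
; fa_opt: 40
; sw_score: 41
>query1 ..
; sq_len: 50
AC
>hitB ..
; sq_len: 70
AC
>>><<<
"""


def iter_m10_hits(lines):
    """Yield (hit_header, fields) for each '>>' hit block.

    fields maps each hit-level '; key: value' line to a string value.
    """
    hit, fields, collecting = None, {}, False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">>"):
            # A new hit (">>") or query (">>>") block closes the previous hit.
            if hit is not None:
                yield hit, fields
            if line.startswith(">>>"):
                hit, fields, collecting = None, {}, False
            else:
                hit, fields, collecting = line[2:].strip(), {}, True
        elif line.startswith(">"):
            # Per-sequence sections follow; stop collecting hit-level fields.
            collecting = False
        elif collecting and line.startswith(";"):
            key, _, value = line[1:].partition(":")
            fields[key.strip()] = value.strip()
    if hit is not None:
        yield hit, fields


hits = list(iter_m10_hits(io.StringIO(SAMPLE)))
for header, fields in hits:
    print(header, fields.get("sw_score"))
```

Values are kept as strings, so convert with int() or float() as needed. For a real file, pass an open file handle instead of the StringIO object.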

12.0 years ago
jingtao09 ▴ 110
For a FASTA parser, I wrote some Python code based on an idea from Pierre Lindenbaum's post:
http://www.biostars.org/post/show/19426/counting-ns-within-fasta/#19439

def fastaio(fh):
    """
    Yield (header, sequence) tuples from a FASTA file.

    fh can be any text-mode file handle, e.g. open(filename), sys.stdin,
    or gzip.open(filename, "rt").
    """
    buff = []
    header = []
    while True:
        c = fh.read(1)
        if not c or c == ">":
            # End of file or start of a new record: emit the previous one.
            if header:
                yield ("".join(header), "".join(buff))
            if not c:
                break
            header[:] = []
            buff[:] = []
            # Read the header line up to the newline (or end of file).
            while True:
                hc = fh.read(1)
                if not hc or hc == "\n":
                    break
                header.append(hc)
        else:
            buff.append(c)

with open("myfile.fasta") as fh:
    for name, seq in fastaio(fh):
        print(">" + name)
        # The sequence still contains its line breaks; strip all whitespace.
        print("".join(seq.split()))
