I am trying to run a model in PAML that requires me to combine records from multiple FASTA files. I have 6 FASTA files, each with the same number of records. What I want to do is interleave the records into a single file, so that my result file has:
>record1_fileOne
AAA
>record1_fileTwo
AAA
>record1_fileThree
AAA
>record1_fileFour
AAA
>record1_fileFive
AAA
>record1_fileSix
AAA
>record2_fileOne
GGG
I wrote the code below, which just concatenates the FASTA records instead of interleaving them. I think there is probably some trick with Python's itertools that I'm just not seeing. Can anyone point me in the right direction?
I found this script that interleaves 2 FASTA files, but I need to extend it to N FASTA files:
def read_fasta(fh):
    """ generator for reading a fasta record: taken from http://stackoverflow.com/a/7655072/1735942 """
    name, seq = None, []
    for line in fh:
        line = line.rstrip()
        if line.startswith(">"):
            # a new header closes the previous record, if there was one
            if name: yield (name, ''.join(seq))
            name, seq = line, []
        else:
            seq.append(line)
    # emit the last record once the file is exhausted
    if name: yield (name, ''.join(seq))
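fastafiles = args[0:]
filehandles = list(itertools.imap(open, fastafiles))  # list of filehandles for fasta files
# each file is read to the end before the next one is opened for reading,
# which is why the output is concatenated rather than interleaved
for fh in filehandles:
    for id, seq in read_fasta(fh):
        print id
        print seq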
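For reference, here is a minimal sketch of one way the loop above could be extended to N files, assuming Python 3 and that every file holds the same number of records. The idea is to build one `read_fasta` generator per file and step through all of them in lockstep with `zip`. The name `interleave_fasta` and the in-memory sample data are hypothetical and only there for illustration; real code would pass handles from `open(path)` instead.

```python
import io


def read_fasta(fh):
    """Yield (header, sequence) tuples from an open FASTA handle."""
    name, seq = None, []
    for line in fh:
        line = line.rstrip()
        if line.startswith(">"):
            if name:
                yield (name, "".join(seq))
            name, seq = line, []
        else:
            seq.append(line)
    if name:
        yield (name, "".join(seq))


def interleave_fasta(handles):
    """Yield record 1 of every file, then record 2 of every file, and so on.

    Hypothetical helper, not part of any library.
    """
    readers = [read_fasta(fh) for fh in handles]
    # zip pulls one record from each generator per step, in file order.
    for records in zip(*readers):
        for name, seq in records:
            yield name, seq


# Hypothetical in-memory stand-ins for two real FASTA files.
file_one = io.StringIO(">record1_fileOne\nAAA\n>record2_fileOne\nGGG\n")
file_two = io.StringIO(">record1_fileTwo\nAAA\n>record2_fileTwo\nGGG\n")

interleaved = list(interleave_fasta([file_one, file_two]))
for name, seq in interleaved:
    print(name)
    print(seq)
```

Note that `zip` stops as soon as the shortest file runs out, so records beyond that point would be dropped silently if the files differed in length; `itertools.zip_longest` is an option if that case needs handling. On Python 2, `itertools.izip` is the lazy equivalent of `zip`.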
There's no need to write any code to group fasta records, as that's one well-implemented function of Biopython.
Well, if you need Biopython anyway, then go for it, but if you just want to parse these files without introducing an additional dependency, then that's a great solution!
Finally, somebody who understands Python concepts and is not just offering 'same-in-every-language' solutions. +1 on this.