I downloaded and extracted Pfam-A.full.ncbi.gz from the Pfam 31.0 release: ftp://ftp.ebi.ac.uk/pub/databases/Pfam/releases/Pfam31.0/
I split the huge file into smaller files, one per family, like these: https://www.dropbox.com/sh/t0b173oa8odvsne/AABZLhiaq-jZtE5PrULvBm9na?dl=0
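For reference, a split like this can be done by streaming the file line by line, so that nothing large is held in memory. This is only a minimal sketch, not necessarily the exact script used here. It relies on the Stockholm format, where each alignment carries a `#=GF AC` line and ends with `//`. The function name `split_stockholm` is made up for illustration.

```python
def split_stockholm(lines):
    """Yield (accession, record_lines) for each alignment in a Stockholm stream.

    `lines` can be any iterable of text lines, e.g. an open file handle,
    so the whole file is never loaded at once.
    """
    record, accession = [], None
    for line in lines:
        record.append(line)
        if line.startswith("#=GF AC"):
            # e.g. "#=GF AC   PF00042.21" -> "PF00042" (version suffix dropped)
            accession = line.split()[2].split(".")[0]
        elif line.startswith("//"):
            # "//" terminates one alignment record
            yield accession, record
            record, accession = [], None
```

Each yielded record can then be written to its own file, for example with `open(accession + ".sto", "w").writelines(record)`.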
Now I am trying to extract all sequences from each alignment that have an observed (solved) structure in the PDB. Luckily, Pfam provides the pdbmap file, which links PDB IDs to Pfam IDs. With it I can find all Pfam families that have observed structures in the PDB, but I cannot extract the corresponding sequences. Any help on how to accomplish this would be much appreciated.
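To make the goal concrete, here is a minimal sketch of the filtering step in plain Python, without Biopython. It rests on several assumptions that should be checked against the real files. The pdbmap columns are assumed to be semicolon-separated in the order PDB ID, chain, Pfam name, Pfam accession, protein accession, residue range. The alignment rows are assumed to be linked to accessions through `#=GS <name> AC <accession>` lines, falling back to the part of the row name before the `/`. The function names are made up. If the identifiers in the ncbi alignment are a different kind of accession than those in pdbmap, nothing will match and an extra ID mapping would be needed. The residue range is ignored here.

```python
from collections import defaultdict


def read_pdbmap(lines):
    """Map Pfam accession -> set of protein accessions that have a PDB entry."""
    mapping = defaultdict(set)
    for line in lines:
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # ASSUMED column order:
        # pdb_id; chain; pfam_name; pfam_acc; protein_acc; residue_range
        pfam_acc, protein_acc = fields[3], fields[4]
        mapping[pfam_acc].add(protein_acc)
    return mapping


def sequences_with_structure(stockholm_lines, accessions):
    """Return {row_name: ungapped sequence} for rows whose accession is wanted."""
    acc_of = {}               # row name -> accession taken from "#=GS ... AC" lines
    seqs = defaultdict(str)   # row name -> aligned sequence (blocks concatenated)
    for line in stockholm_lines:
        if line.startswith("#=GS"):
            parts = line.split()
            if len(parts) >= 4 and parts[2] == "AC":
                acc_of[parts[1]] = parts[3].split(".")[0]
        elif line.strip() and not line.startswith(("#", "//")):
            parts = line.split()
            if len(parts) == 2:
                seqs[parts[0]] += parts[1]
    result = {}
    for name, aligned in seqs.items():
        # Fallback (assumption): the accession is the row name before "/"
        acc = acc_of.get(name, name.split("/")[0].split(".")[0])
        if acc in accessions:
            # Strip the gap characters "-" and "." to recover the raw sequence
            result[name] = aligned.replace("-", "").replace(".", "")
    return result
```

Usage would be along the lines of `wanted = read_pdbmap(open("pdbmap"))` followed by `sequences_with_structure(open("PF00042.sto"), wanted["PF00042"])`, where the per-family file name is hypothetical. Both functions read line by line, so the file size should not be a problem.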
PS: I currently use Python 2.7 with BioPython, but it can't handle the files.