How Do I Get The FASTA Sequences Of Proteins From A List Of Protein PDB IDs?
I have a list of protein PDB IDs, along with the start and end positions of the regions I am interested in. Is there an API, from PyMOL or elsewhere, that I could use to get a file listing the FASTA sequences of these proteins (if possible, restricted to the regions I am interested in)?
Thanks!
pdb • api • fasta
written 12.8 years ago by heath • updated 4.0 years ago by Ram
                        The following command seems to work:
$ echo -e "3I5F\n2p4k\n2p4m" | \
  while read I; do curl -s "http://www.rcsb.org/pdb/rest/customReport?pdbids=${I}&customReportColumns=structureId,chainId,entityId,sequence,db_id,db_name&service=wsdisplay&format=text" | \
  xsltproc stylesheet.xsl - ; done | \
  fold -w 80
 
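Here stylesheet.xsl is an XSLT stylesheet that turns each record of the report into a FASTA entry (a header line followed by the sequence).
Output: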
>3I5F|A|1|O44934|O44934
MTMDFSDPDMEFLCLTRQKLMEATSIPFDGKKNCWVPDPDFGFVGAEIQSTKGDEVTVKTDKTQETRVVKKDDIGQRNPP
KFEMNMDMANLTFLNEASILHNLRSRYESGFIYTYSGLFCIAINPYRRLPIYTQGLVDKYRGKRRAEMPPHLFSIADNAY
QYMLQDRENQSMLITGESGAGKTENTKKVIQYFALVAASLAGKKDKKEEEKKKDEKKGTLEDQIVQCNPVLEAYGNAKTT
RNNNSSRFGKFIRIHFGTQGKIAGADIETYLLEKSRVTYQQSAERNYHIFYQLLSPAFPENIEKILAVPDPGLYGFINQG
TLTVDGIDDEEEMGLTDTAFDVLGFTDEEKLSMYKCTGCILHLGEMKWKQRGEQAEADGTAEAEKVAFLLGVNAGDLLKC
LLKPKIKVGTEYVTQGRNKDQVTNSIAALAKSLYDRMFNWLVRRVNQTLDTKAKRQFFIGVLDIAGFEIFDFNSFEQLCI
NYTNERLQQFFNHHMFVLEQEEYKKEGIVWEFIDFGLDLQACIELIEKPMGILSILEEECMFPKASDTSFKNKLYDNHLG
KNPMFGKPKPPKAGCAEAHFCLHHYAGSVSYSIAGWLDKNKDPINENVVELLQNSKEPIVKMLFTPPRILTPGGKKKKGK
SAAFQTISSVHKESLNKLMKNLYSTHPHFVRCIIPNELKTPGLIDAALVLHQLRCNGVLEGIRICRKGFPNRIIYSEFKQ
RYSILAPNAVPSGFADGKVVTDKALSALQLDPNEYRLGNTKVFFKAGVLGMLEDMRDERLSKIISMFQAHIRGYLMRKAY
KKLQDQRIGLTLIQRNVRKWLVLRNWEWWRLFNKVKPLL
>3I5F|B|2|P08052|P08052
AEEAPRRVKLSQRQMQELKEAFTMIDQDRDGFIGMEDLKDMFSSLGRVPPDDELNAMLKECPGQLNFTAFLTLFGEKVSG
TDPEDALRNAFSMFDEDGQGFIPEDYLKDLLENMGDNFSKEEIKNVWKDAPLKNKQFNYNKMVDIKGKAEDED
>3I5F|C|3|P05945|P05945
SQLTKDEIEEVREVFDLFDFWDGRDGDVDAAKVGDLLRCLGMNPTEAQVHQHGGTKKMGEKAYKLEEILPIYEEMSSKDT
GTAADEFMEAFKTFDREGQGLISSAEIRNVLKMLGERITEDQCNDIFTFCDIREDIDGNIKYEDLMKKVMAGPFPDKSD
>2P4K|A|1|P04179|P04179
KHSLPDLPYDYGALEPHINAQIMQLHHSKHHAANVNNLNVTEEKYQEALAKGDVTAQIALQPALKFNGGGHINHSIFWTN
LSPNGGGEPKGELLEAIKRDFGSFDKFKEKLTAASVGVQGSGWGWLGFNKERGHLQIAACPNQDPLQGTTGLIPLLGIDV
WEHAYYLQYKNVRPDYLKAIWNVINWENVTERYMACKK
>2P4K|B|1|P04179|P04179
KHSLPDLPYDYGALEPHINAQIMQLHHSKHHAANVNNLNVTEEKYQEALAKGDVTAQIALQPALKFNGGGHINHSIFWTN
LSPNGGGEPKGELLEAIKRDFGSFDKFKEKLTAASVGVQGSGWGWLGFNKERGHLQIAACPNQDPLQGTTGLIPLLGIDV
WEHAYYLQYKNVRPDYLKAIWNVINWENVTERYMACKK
>2P4K|C|1|P04179|P04179
KHSLPDLPYDYGALEPHINAQIMQLHHSKHHAANVNNLNVTEEKYQEALAKGDVTAQIALQPALKFNGGGHINHSIFWTN
LSPNGGGEPKGELLEAIKRDFGSFDKFKEKLTAASVGVQGSGWGWLGFNKERGHLQIAACPNQDPLQGTTGLIPLLGIDV
WEHAYYLQYKNVRPDYLKAIWNVINWENVTERYMACKK
>2P4K|D|1|P04179|P04179
KHSLPDLPYDYGALEPHINAQIMQLHHSKHHAANVNNLNVTEEKYQEALAKGDVTAQIALQPALKFNGGGHINHSIFWTN
LSPNGGGEPKGELLEAIKRDFGSFDKFKEKLTAASVGVQGSGWGWLGFNKERGHLQIAACPNQDPLQGTTGLIPLLGIDV
WEHAYYLQYKNVRPDYLKAIWNVINWENVTERYMACKK
>2P4M|A|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
>2P4M|B|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
>2P4M|C|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
>2P4M|D|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
>2P4M|E|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
>2P4M|F|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
>2P4M|G|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
>2P4M|H|1|P83690|P83690
MSVIATQMTYKVYMSGTVNGHYFEVEGDGKGKPYEGEQTVKLTVTKGGPLPFAWDILSPQCQYGSIPFTKYPEDIPDYVK
QSFPEGFTWERIMNFEDGAVCTVSNDSSIQGNCFTYHVKFSGLNFPPNGPVMQKKTQGWEPSSERLFARGGMLIGNNFMA
LKLEGGGHYLCEFKTTYKAKKPVKMPGYHYVDRKLDVTNHNKDYTSVEQCEISIARKPVVA
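
The command above returns full-length chains, while the question also asks for a specific region of each protein. A minimal sketch of that last step, assuming the FASTA output has been saved to a file and that the regions are kept in a whitespace-separated table with one `PDB_ID START END` line per entry: the function name `fasta_regions` and both file names are hypothetical, and positions are taken as 1-based, inclusive indices into the sequence as printed, which may differ from the residue numbering used in the PDB file itself.

```shell
# fasta_regions REGIONS FASTA
# REGIONS: one "PDB_ID START END" line per entry (hypothetical format).
# FASTA:   records whose header starts with ">PDB_ID|", as in the output above.
# Prints each matching record trimmed to START..END (1-based, inclusive).
fasta_regions() {
  awk '
    # Print the record collected so far, if its ID has a region defined.
    function emit() {
      if (header != "" && (id in first)) {
        print header "|" first[id] "-" last[id]
        print substr(seq, first[id], last[id] - first[id] + 1)
      }
    }
    # First file: remember the region for each (upper-cased) PDB ID.
    NR == FNR { first[toupper($1)] = $2; last[toupper($1)] = $3; next }
    # Second file, header line: flush the previous record, start a new one.
    /^>/ {
      emit()
      header = $0
      seq = ""
      split(substr($0, 2), parts, "|")
      id = toupper(parts[1])
      next
    }
    # Second file, sequence line: join wrapped lines back together.
    { seq = seq $0 }
    END { emit() }
  ' "$1" "$2"
}
```

For example, `fasta_regions regions.txt sequences.fasta | fold -w 80` would print only the requested stretch of every chain whose PDB ID appears in `regions.txt`. Entries without a region are skipped, and the range is appended to the header so the trimmed records stay traceable.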
Try pdb-tools; it includes a module for this, pdb_seq.py.
Thanks a lot! ^-^