Question: How to read SwissProt file with regexps in Python
2.3 years ago, natasha.sernova wrote:

Dear all,

I can already read the file with startswith and with Biopython (both scripts are below).

I don't need the whole file, only the ID, AC, OC, KW and SQ lines and the sequence itself.

But I was told I have to do it with regexps. I've spent a few days on it and still cannot get it to work.

Please help me!

Many thanks!

Natasha

# startswith

import sys

print("This is the name of the script:", sys.argv[0])
print("Number of arguments:", len(sys.argv))
print("The arguments are:", str(sys.argv))

with open(sys.argv[1], 'r') as fin:
    for line in fin:
        if line.startswith("AC"):
            print(line)
        elif line.startswith("DE"):
            print(line)
        elif line.startswith("OC"):
            print(line)
        elif line.startswith("KW"):
            print(line)
        elif line.startswith("SQ"):
            # After split(), AA[2] is the sequence length and AA[3] is "AA;"
            AA = line.split()
            print("Seq_Length = " + AA[2] + AA[3])
        elif line.startswith("//"):
            # "//" ends the record; no backslashes are needed in a plain string
            break

# Biopython

import re
import urllib.request

from Bio import ExPASy
from Bio import SeqIO

handle = ExPASy.get_sprot_raw("P35579")
seq_record = SeqIO.read(handle, "swiss")
handle.close()
print(seq_record.id)
print(seq_record.name)
print(seq_record.description)
print(repr(seq_record.seq))
print("Length %i" % len(seq_record))
print(seq_record.annotations["keywords"])

# Fetch the FASTA version and strip the trailing newline of each line with a regexp
fhand = urllib.request.urlopen('http://www.uniprot.org/uniprot/P35579.fasta')
for raw in fhand:
    line = raw.decode('utf-8')  # urlopen yields bytes in Python 3
    print(re.sub(r'\n$', '', line))
#    print(line)
#    print(re.sub(r'[\.]', '!', line))
fhand.close()

I've found this reference, but these are exactly the solutions I have been trying for a long time without success.

http://stackoverflow.com/questions/6186938/python-how-to-use-regexp-on-file-line-by-line-in-python
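
For what it's worth, here is a minimal sketch of how the startswith script might look with regexps instead. It assumes the usual flat-file layout (a two-letter line code, three spaces, then the data; sequence lines indented after the SQ line; "//" closing the record). The function name parse_swissprot, the three pattern names and the toy record SAMPLE are all made up for illustration; the record is not real UniProt data.

```python
import re

# Two-letter line code, three spaces, then the rest of the line.
FIELD_RE = re.compile(r"^(ID|AC|OC|KW|SQ)\s{3}(.*)$")
# Sequence lines are indented and contain only residues and spaces.
SEQ_RE = re.compile(r"^\s+([A-Za-z ]+)$")
# "//" terminates a record.
END_RE = re.compile(r"^//")


def parse_swissprot(lines):
    """Return (fields, sequence) for the first record in an iterable of lines."""
    fields = {}      # line code -> list of line contents, in file order
    seq_parts = []
    for raw in lines:
        line = raw.rstrip()
        if END_RE.match(line):
            break
        m = FIELD_RE.match(line)
        if m:
            fields.setdefault(m.group(1), []).append(m.group(2))
            continue
        m = SEQ_RE.match(line)
        if m and "SQ" in fields:
            # Drop the spaces between the blocks of ten residues.
            seq_parts.append(re.sub(r"\s+", "", m.group(1)))
    return fields, "".join(seq_parts)


# Toy record with invented values, only to exercise the patterns.
SAMPLE = """\
ID   TEST_HUMAN              Reviewed;          20 AA.
AC   P00000;
DE   RecName: Full=Toy protein;
OC   Eukaryota; Metazoa;
OC   Mammalia.
KW   Toy-keyword.
SQ   SEQUENCE   20 AA;  2222 MW;  0000000000000000 CRC64;
     MKTAYIAKQR QISFVKSHFS
//
"""

fields, sequence = parse_swissprot(SAMPLE.splitlines())
for code in ("ID", "AC", "OC", "KW", "SQ"):
    for value in fields.get(code, []):
        print(code, value)

# The length can be pulled out of the SQ line with another regexp.
length = re.search(r"(\d+)\s+AA;", fields["SQ"][0]).group(1)
print("Seq_Length =", length)
print(sequence)
```

To run it on a real file, an open file object can be passed in place of SAMPLE.splitlines(), for example parse_swissprot(open(sys.argv[1])). The sketch stops at the first "//", like the startswith script above, so a multi-record file would need a loop around it.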
