Parsing Non-Coding Region Around Protein Of Interest From Embl File
11.3 years ago
Pappu ★ 2.1k

I want to parse the non-coding DNA sequences around a protein (P08707) from an EMBL file: http://www.ebi.ac.uk/ena/data/view/U32222&display=txt&expanded=true

I could grep '^ CDS', work out the non-coding regions between the CDS features, and compare them to the location of the target protein in Python. I am wondering if there is a smarter way of doing it. Thanks.
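For reference, the grep-and-compare idea described above can be sketched in plain Python roughly as follows. This is only an illustration under several assumptions: that CDS features sit on lines starting with `FT   CDS` (the usual EMBL flat-file layout), that each location fits on one line and does not reference another entry, and that the target CDS span is already known. The function names `cds_intervals` and `flanking_noncoding` and the toy feature lines are made up for the example.

```python
import re

def cds_intervals(embl_text):
    """Return sorted (start, end) spans, 1-based inclusive, one per CDS line.

    Assumes single-line locations such as 100..200, complement(300..400)
    or join(450..500,520..600); the outer bounds of a join are used.
    """
    spans = []
    for line in embl_text.splitlines():
        m = re.match(r"FT\s+CDS\s+(\S+)", line)
        if m:
            coords = [int(n) for n in re.findall(r"\d+", m.group(1))]
            if coords:
                spans.append((min(coords), max(coords)))
    return sorted(spans)

def flanking_noncoding(spans, target, seq_len):
    """Return (upstream, downstream) non-coding spans next to `target`.

    A side is None when a neighbouring CDS abuts or overlaps the target.
    """
    others = [sp for sp in spans if sp != target]
    # Furthest-reaching CDS that starts before the target starts.
    left_end = max((e for s, e in others if s < target[0]), default=0)
    # Nearest CDS that extends past the end of the target.
    right_start = min((s for s, e in others if e > target[1]),
                      default=seq_len + 1)
    up = (left_end + 1, target[0] - 1)
    down = (target[1] + 1, right_start - 1)
    return (up if up[0] <= up[1] else None,
            down if down[0] <= down[1] else None)

# Toy feature lines, invented for illustration (not taken from U32222).
toy = """\
FT   CDS             100..200
FT   CDS             complement(300..400)
FT   CDS             join(450..500,520..600)
"""
spans = cds_intervals(toy)
print(flanking_noncoding(spans, (300, 400), seq_len=1000))
# prints ((201, 299), (401, 449))
```

A feature-aware parser would handle multi-line and remote locations more robustly than the regular expression used here.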

python • 2.3k views

If you are just pulling one sequence, why not copy and paste it from the link? Just look for 22433..23011 in the genomic sequence. Or you could pull the FASTA and use a subsequence program: http://code.google.com/p/biopieces/wiki/extract_seq
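If the genomic sequence is already loaded as a string, cutting out a range like the one quoted above is a one-line slice. The sketch below assumes 1-based inclusive coordinates (as in the feature table) and adds an optional `flank` argument for grabbing surrounding bases; the helper name `subseq` and the `flank` parameter are hypothetical, not part of any tool mentioned here.

```python
def subseq(seq, start, end, flank=0):
    """Return seq[start..end] using 1-based inclusive coordinates.

    The window is widened by `flank` bases on each side and clipped
    to the ends of the sequence.
    """
    lo = max(start - 1 - flank, 0)
    hi = min(end + flank, len(seq))
    return seq[lo:hi]

# Toy sequence, invented for illustration.
toy_seq = "ACGTACGTAC"
print(subseq(toy_seq, 3, 5))           # prints GTA
print(subseq(toy_seq, 3, 5, flank=2))  # prints ACGTACG
```

With the real sequence in `seq`, `subseq(seq, 22433, 23011)` would return the 579 bases of that range, and a non-zero `flank` would include neighbouring sequence, which may or may not be non-coding depending on the adjacent features.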
