Question

Double Restriction, Fragment Generation

13.1 years ago

capemaster • 0

Hi all, I'm trying to use the Restriction module of BioPython because I need to create a report of the fragments generated by an in silico double digestion. Briefly, if I digest a .fasta sequence with, let's say, EcoRI and PstI, I would like to know how many Eco-Pst, Pst-Eco, Eco-Eco and Pst-Pst fragments are generated. Does anybody know how to do this with BioPython?

EDIT

>Sequence
CTAGCGTAATAATAGGTACACTGAATAGTAGTACAGTACGTACAGCTTTTCCTGGGGATC
CTATCGCAATCGCGAATGCGACTTCACGTGAATAGATCTCATTCTGAGCTCCCTTATACG
TTATAGTTCGACTGTGCTTGATACAAAACGTTTTACTGACTATAACGTGGGGGCACGGGA
ATTCAACAGAACTCTCCAAGCTGTCGATTTCTGTATGTTTGAGATTAGATCAGACCTCAC
AAGACTCCCTAAACCATCCAGCCCACTTTATATCCCCTCTTCTCCGCCGGAGGTGAATTC
AATCCGGCACCAAGGGACTGACAATTTAGCGCAGATACGAGGCAGAACACCGGAAAGACC
AGCGGCACTCGCGGGGATCTGGCCCGGTGGGCCCCGGTCCGTGAGCCCGAAGACCCCCTC
CCCGAAGATTGGAGGTGCCAGGCAACTGAGGGAGGTGGCTGTCGACGCGCGCCCGGTGCC
CGGCCGAGATGTGGGGCCTCCCGGACGGGTCGACCAGCAGCCGGCCGGTGCCCCCTCCGT

In this sequence there are 2 EcoRI sites and 4 TaqI sites. I want to know how many Eco-Taq, Eco-Eco, Taq-Eco and Taq-Taq fragments are produced by the digestion.

The answer from ALchEmiXt is reasonable, but I can't understand the subtraction step: there could be several contiguous sites of the same enzyme... how can I deal with that?

Thank you for your help.

biopython • 3.2k views

updated 9.3 years ago by Biostar 20 • written 13.1 years ago by capemaster • 0

This all depends on what your data currently looks like. Edit your question and post a small snippet of the data that you are trying to process.

13.1 years ago by Istvan Albert 102k

Basically, you generate a list sorted by cut position. To get the fragments, just subtract item1 from item2, item2 from item3, and so forth. Since you have different enzymes, you need to track which enzyme generated each cut; hence the use of dictionaries (in Python, I believe) or hashes/arrays, whichever you prefer, to keep the cut position associated with the enzyme.
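A minimal Python sketch of that idea, assuming the cut positions per enzyme are already known and the sequence is linear. The function names `classify_fragments` and `count_types`, the `"start"`/`"end"` labels for the sequence termini, and the convention that a cut at position p falls between bases p-1 and p (0-based) are all made up for this illustration; they are not part of BioPython.

```python
from collections import Counter


def classify_fragments(cuts_by_enzyme, seq_len):
    """Return (left_enzyme, right_enzyme, length) for each fragment of a linear digest.

    cuts_by_enzyme maps an enzyme name to a list of cut positions.
    A cut at position p is assumed to fall between bases p-1 and p (0-based).
    """
    # Flatten to (position, enzyme) pairs and sort them by position.
    cuts = sorted(
        (pos, enzyme)
        for enzyme, positions in cuts_by_enzyme.items()
        for pos in positions
    )
    # Treat the two ends of the linear sequence as pseudo-cuts.
    bounds = [(0, "start")] + cuts + [(seq_len, "end")]

    fragments = []
    # Walk over neighbouring cuts; each pair delimits one fragment.
    for (left_pos, left_enz), (right_pos, right_enz) in zip(bounds, bounds[1:]):
        length = right_pos - left_pos  # the "subtraction step"
        if length > 0:  # skip empty fragments (e.g. a cut exactly at a terminus)
            fragments.append((left_enz, right_enz, length))
    return fragments


def count_types(fragments):
    """Count fragments by the pair of enzymes that produced their two ends."""
    return Counter((left, right) for left, right, _ in fragments)


# Toy input: positions are invented, not taken from the posted sequence.
toy_cuts = {"EcoRI": [10, 40], "TaqI": [20, 25, 70]}
for (left, right), n in sorted(count_types(classify_fragments(toy_cuts, 100)).items()):
    print(f"{left}-{right}: {n}")
```

Because only neighbouring cuts are paired, a run of contiguous sites of the same enzyme simply yields Taq-Taq (or Eco-Eco) fragments, one per adjacent pair; no special handling should be needed.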

ADD REPLY • link 13.1 years ago by ALchEmiXt ★ 1.9k

score 0 · Answer 1 · 2012-06-15

You can use BioPython; I used BioPerl in the past, which is good practice and avoids reinventing the wheel.

Alternatively, you could search with a regular expression in Perl (or Python) for the EcoRI site and store the enzyme name and position in a hash. Do the same for PstI and store those hits in the same hash. Sort the hash by position, then read out the fragments and determine the length of each by subtracting adjacent positions.
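The regular-expression route could look roughly like this in plain Python (standard library only, not the BioPython API). The `ENZYMES` table and the function name `find_cuts` are illustrative assumptions. The recognition sequences and top-strand cut offsets used are G^AATTC for EcoRI, CTGCA^G for PstI and T^CGA for TaqI. All three sites are palindromic, so scanning one strand is enough.

```python
import re

# Recognition sequence and top-strand cut offset (bases from the site start).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "PstI": ("CTGCAG", 5),   # CTGCA^G
    "TaqI": ("TCGA", 1),     # T^CGA
}


def find_cuts(seq, enzyme_names):
    """Return a position-sorted list of (cut_position, enzyme_name).

    cut_position p means the top strand is cut between seq[p-1] and seq[p].
    """
    seq = seq.upper()
    cuts = []
    for name in enzyme_names:
        motif, offset = ENZYMES[name]
        # A lookahead match has zero width, so overlapping sites are all found.
        for match in re.finditer(f"(?={motif})", seq):
            cuts.append((match.start() + offset, name))
    return sorted(cuts)


# Toy sequence (invented for the example) with two EcoRI sites and one PstI site.
toy = "AAGAATTCGGCTGCAGTTGAATTCAA"
cuts = find_cuts(toy, ["EcoRI", "PstI"])

# Fragment lengths come from subtracting neighbouring positions,
# with the two sequence ends added as outer boundaries.
edges = [0] + [pos for pos, _ in cuts] + [len(toy)]
lengths = [right - left for left, right in zip(edges, edges[1:])]
print(cuts)
print(lengths)
```

A list of (position, enzyme) tuples is used instead of a dictionary keyed on position, so that two enzymes cutting at the same coordinate do not overwrite each other.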

Have phun!