Question: How to use fastools digest function to cut the positive strand of DNA exclusively?
0
gravatar for bright602
4.4 years ago by
bright60240
bright60240 wrote:

I am currently dealing with some fasta files using fastools to digest the sequence by MboI restriction enzyme. The command I used is as follow fastools -digest -rest MboI -in A.fasta -out REA.fasta This enzyme cuts GATC sites, but the output also contain the complementary strand information which starts with CTAG. Therefore, I wanna ask whether it is possible to let the program only cut GATC instead of cutting CTAG? Or is there any other program to cut the DNA sequence in fasta file based on restriction site? Thanks!

sequence forum gene • 1.4k views
ADD COMMENTlink modified 4.4 years ago by piet1.7k • written 4.4 years ago by bright60240
1
gravatar for geek_y
4.4 years ago by
geek_y11k
Barcelona
geek_y11k wrote:
import re,sys
from Bio import SeqIO

def print_ReSites(id,seqence):
    pattern=r"GATC"
    seq_len=len(seqence)
    sites = [str(m.start()) for m in re.finditer(pattern,seqence)]
    sites.append(str(seq_len))
    for start,end in zip(sites,sites[1:]):
        print id+"\t"+start+"\t"+end

for seq in SeqIO.parse(sys.argv[1],"fasta"):
    print_ReSites( str( seq.id),str(seq.seq))
Usage: script.py in.fasta | bedtools getfasta -fi in.fasta -bed - -fo re_out.fasta

Example:

in.fasta

>1 
AGAGGAGGATCGAGGAGGTGATCGAGGATTTTGAGAGGAGGATCGAGGAGGTGATCGAGGATTTTG
>2 
GAGGGGGCTGGCGGCGGGATCGGAGGGGatttaggaGATCgaggattg

re_out.fasta

>1:7-19
GATCGAGGAGGT
>1:19-40
GATCGAGGATTTTGAGAGGAG
>1:40-52
GATCGAGGAGGT
>1:52-66
GATCGAGGATTTTG
>2:17-36
GATCGGAGGGGatttagga
>2:36-48
GATCgaggattg
ADD COMMENTlink modified 4.3 years ago by RamRS27k • written 4.4 years ago by geek_y11k

Hi Goutham,

Thanks for your script. I am trying the script while have the syntax error as follow, could you help me how to fix it?

python script.py in.fasta | bedtools getfasta -fi in.fasta -bed - -fo re_out.fasta index file in.fasta.fai not found, generating... File "script.py", line 13 print_ReSites( strseq.id),str(seq.seq)) ^ SyntaxError: invalid syntax

ADD REPLYlink written 4.4 years ago by bright60240

Thank you so much!! It works!

ADD REPLYlink written 4.4 years ago by bright60240

Hi Goutham,

I am also using your script with my own sequences but I have the same error. It seems is because of the extra " ) " at the end of the code. Then I deleted it, and know I have another problem. Could you help me to fix it please? The error:

print_ReSites( strseq.id),str(seq.seq)
NameError: name 'strseq' is not defined
ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by Nicolas Lichilin20
1

There is a problem with biostars formatting. This should help.

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by geek_y11k
1

I've added spaces to circumvent that. It is a weird bug. Here's what I tried to pin down the root cause:

  1. Removed the s before the ( (from print_Resites) to check if s( is the problem
  2. Moved the 4-spaces formatted code to back-tick formatted code
  3. Substituted the _ in the method name to __
  4. Used backslash to escape the parentheses (in which case it prints with the backslashes in)
  5. Tried the backticks-surrounding-the-problem-element method that seems to solve similar problems encountered in URLs with parentheses inside markdown.

None of those worked. Maybe Istvan Albert can shed some light - it seems to be a problem with the wmd markdown library.

ADD REPLYlink modified 4.3 years ago • written 4.3 years ago by RamRS27k

It works!!! Thanks!!

ADD REPLYlink written 4.3 years ago by Nicolas Lichilin20
0
gravatar for piet
4.4 years ago by
piet1.7k
planet earth
piet1.7k wrote:

MboI is a bacterial restriction endonuclease which cleaves double stranded DNA at the palindromic sequence site GATC.

fastools is a computer programm which may be used to identify restriction enzyme cleavage sites in double standed DNA. For your convenience, you feed only the sequence of one strand into this program, and fastools will recognize all cleavage sites on that strand. CTAG is a valid cleavage site for MboI.

ADD COMMENTlink modified 4.4 years ago • written 4.4 years ago by piet1.7k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1284 users visited in the last hour