Question: Insillico dual restriction enzyme reference genome digestion.
1
gravatar for William
2.3 years ago by
William4.3k
Europe
William4.3k wrote:

I am looking for a tool that can do a insillico digestion of a reference genome given 2 restriction enzymes (or the pattern at which both cut).

The output that I would like to have is a BED file with the start and end position on the reference genome of all the produced RE DNA fragments. (or only those that are larger than a specific lenght).

The tool should take into account that both the forward and reverse DNA strand could be cut.

Is there any such tool that I can download?

insillico ddradseq • 793 views
ADD COMMENTlink modified 4 months ago by chahat_u70 • written 2.3 years ago by William4.3k

You can try restrict from EMBOSS. You would need to make the BED format file from the output of restrict.

ADD REPLYlink modified 2.3 years ago • written 2.3 years ago by genomax57k

duplicate of Genomic Restriction Finder

ADD REPLYlink written 2.3 years ago by Pierre Lindenbaum113k
1
gravatar for William
2.3 years ago by
William4.3k
Europe
William4.3k wrote:

I found the R package SimRad which

provides a number functions to simulate restriction enzyme digestion, library construction and fragments size selection.

https://cran.r-project.org/web/packages/SimRAD/SimRAD.pdf

A nice feature is that a filter step can also be done on restriction site combination to only select fragments that start with enzyme A restriction site and end with enzyme B restriction site.

adapt.select(sequences, type = "AB+BA", cut_site_5prime1, cut_site_3prime1,
cut_site_5prime2, cut_site_3prime2)

The output is the DNA fragments produced by the digestion and size selection.

The output does not contain were the fragments are located on the genome.

To create the BED file I need to BLAST the fragments versus the reference genome I guess.

ADD COMMENTlink written 2.3 years ago by William4.3k
2

In this case, Blast will be complete overkill. All your fragments are EXACT substrings of your ref sequence. Thus you can find their location with any scripting language which offers a string.find() method.

ADD REPLYlink written 2.3 years ago by piet1.5k
0
gravatar for dariober
2.3 years ago by
dariober9.5k
Glasgow - UK
dariober9.5k wrote:

This is a hack, but it might work. I wrote a script to find regular expressions in fasta files here fastaRegexFinder.

To find all the restriction fragments from two enzymes, run fastaRegexFinder.py for each combination of enzyme 1 and enzyme 2 in forward and reverse (= 16 times). E.g. say restriction sites are TTCC and ACTG, do:

res="TTCC GGAA ACTG CAGT"
for r1 in $res
do
    for r2 in $res
    do
    echo "$r1 and $r2"
    fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r $r1.*?(?=$r2)
    done
done

This compiles and runs the following commands:

fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r TTCC.*?(?=TTCC)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r TTCC.*?(?=GGAA)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r TTCC.*?(?=ACTG)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r TTCC.*?(?=CAGT)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r GGAA.*?(?=TTCC)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r GGAA.*?(?=GGAA)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r GGAA.*?(?=ACTG)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r GGAA.*?(?=CAGT)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r ACTG.*?(?=TTCC)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r ACTG.*?(?=GGAA)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r ACTG.*?(?=ACTG)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r ACTG.*?(?=CAGT)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r CAGT.*?(?=TTCC)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r CAGT.*?(?=GGAA)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r CAGT.*?(?=ACTG)
fastaRegexFinder.py -q --noreverse -f test_data/synth.fa -r CAGT.*?(?=CAGT)

The output (to stdout by default) will be in bed format.

ADD COMMENTlink written 2.3 years ago by dariober9.5k
0
gravatar for chahat_u
4 months ago by
chahat_u70
United States
chahat_u70 wrote:

Another good option is a utility called digest_genome(.py) that comes with HicPro. It can digest the reference genome by the provided restriction enzymes(s) and generate a BED file with the list of restriction fragments after digestion. You can specify multiple restriction enzymes too. here is the link

ADD COMMENTlink written 4 months ago by chahat_u70
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1151 users visited in the last hour