Question: Randomize CDS While Maintaining Amino Acid Sequence
5.4 years ago by
sheinsch10
United States
sheinsch10 wrote:

I am trying to remove promoter recognition sites and transcription factor binding sites from several coding sequences. Is there a tool that will randomize the nucleotide sequence while maintaining the amino acid sequence?

EDIT:

I will be expressing the proteins under a variety of promoters. What I am trying to do now is remove any sites within the CDS that could potentially bind transcription factors or RNA polymerase. 

modified 5.4 years ago by Sam2.8k • written 5.4 years ago by sheinsch10

Why do you want to randomize the nucleotide sequence of the protein-coding region when the promoter recognition sites and TF binding sites you want to remove are usually located upstream of it?

written 5.4 years ago by Sam2.8k
5.4 years ago by
Sam2.8k
New York
Sam2.8k wrote:

I played around with Perl, and this should give you a different randomized sequence on each run:

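#!/usr/bin/perl
use strict;
use warnings;

# Expect exactly two arguments: the codon table file and the CDS.
if (@ARGV != 2) {
    die "\nUsage: aminoRand.pl <Codon Table File> <nucleotide sequence>\n";
}
my ($codon_file, $sequence) = @ARGV;

open(my $codon_fh, '<', $codon_file) or die "Cannot open $codon_file: $!";

my %codon     = ();    # amino acid => array ref of its synonymous codons
my %translate = ();    # codon => amino acid
while (my $line = <$codon_fh>) {
    chomp $line;
    next if $line =~ /^\s*$/;    # skip blank lines
    # split ' ' ignores leading and trailing whitespace
    my ($key, @codes) = split ' ', $line;
    @codes = map { uc } @codes;
    $codon{$key} = \@codes;
    $translate{$_} = $key for @codes;
}
close($codon_fh);

# Upper-case the input so lowercase sequences still match the table.
chomp $sequence;
$sequence = uc $sequence;
my $length = length $sequence;
warn "Sequence length $length is not a multiple of 3\n" if $length % 3;

for (my $i = 0; $i < $length; $i += 3) {
    my $current = substr $sequence, $i, 3;
    if (exists $translate{$current}) {
        # Pick one of the synonymous codons uniformly at random.
        my @possible = @{ $codon{ $translate{$current} } };
        print($possible[rand @possible]);
    }
    else {
        # Warn on STDERR so the message does not end up inside the sequence,
        # and pass the unrecognized codon through unchanged.
        warn "Can't find: $current\n";
        print($current);
    }
}
print("\n");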

You will need to provide a codon file in the following format (one amino acid per line, followed by its codons):

I ATT  ATC  ATA      
L CTT  CTC  CTA  CTG  TTA  TTG

and then the sequence. The script then randomly generates a nucleotide sequence that differs from the input but encodes the same amino acid sequence.
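Since only two lines of the codon file are shown above, here is a rough Python sketch (not part of the original script) that builds the full table for the standard genetic code in the same "amino acid, then codons" layout, and repeats the same random synonymous substitution so the result can be checked. All function names are made up for illustration, and the motif list passed to `randomize_avoiding` is a placeholder for whatever binding sites you care about. It assumes the standard code (NCBI translation table 1) and only looks for motifs on the forward strand.

```python
import random
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_table():
    """Map each amino acid (or '*' for stop) to the list of its codons."""
    table = {}
    for codon, aa in CODON_TO_AA.items():
        table.setdefault(aa, []).append(codon)
    return table


def codon_file_text():
    """Render the table as lines like 'I ATT ATC ATA' for the codon file."""
    rows = sorted(synonymous_table().items())
    return "\n".join(f"{aa} {' '.join(codons)}" for aa, codons in rows) + "\n"


def translate(seq):
    """Translate a CDS whose length is a multiple of 3."""
    seq = seq.upper()
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def randomize_cds(seq, rng=random):
    """Replace every codon with a randomly chosen synonymous codon."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("CDS length must be a multiple of 3")
    table = synonymous_table()
    return "".join(
        rng.choice(table[CODON_TO_AA[seq[i:i + 3]]])
        for i in range(0, len(seq), 3)
    )


def randomize_avoiding(seq, motifs, tries=1000, rng=random):
    """Resample until no motif occurs on the forward strand, else None."""
    wanted = [m.upper() for m in motifs]
    for _ in range(tries):
        candidate = randomize_cds(seq, rng)
        if not any(m in candidate for m in wanted):
            return candidate
    return None


if __name__ == "__main__":
    # Write the codon file, then check that a randomized CDS still
    # translates to the same protein as the input.
    with open("codons.txt", "w") as handle:
        handle.write(codon_file_text())
    original = "ATGATTCTGTAA"
    shuffled = randomize_cds(original)
    print(shuffled, translate(shuffled) == translate(original))
```

Random substitution alone does not guarantee that a particular site disappears, which is why `randomize_avoiding` simply retries until none of the listed motifs is left. For long motif lists or short sequences it may give up and return `None`.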
