Question: Translate DNA sequence in to 6 reading frames (forward& reverse)
0
gravatar for tcf.hcdg
3.5 years ago by
tcf.hcdg60
European Union
tcf.hcdg60 wrote:

I am new in perl and haven't enough information for this language. Appology if there are mistakes in the code. I have gone through biostar question[1] and other pages on internet to translate DNA sequences in to 6 reading frames. what I tried is

usr/bin/perl 
use strict;
use warnings;

local $/ = "\r\n";  
my @frames = (1, 2, 3, 4, 5, 6); 

foreach my $frame (@frames) {
    frame ($frame);

    sub frame 
        {
        my $f = shift;
        open  "3mo_m10_clean_1_paired_assembly_linear_seq.fasta", $ARGV[0];

        while ("3mo_m10_clean_1_paired_assembly_linear_seq.fasta"){
        chomp;
            my $m = length ($_);

            if (/^>(\w+)/){
                print ">$1"."_"."frame"."$f"."\n";
                }elsif($f > 0){
                    my $frameseq = substr($_, $f-1, $m-$f+1);
                    print "$frameseq\n";
                    }elsif ($f<0){
                        my $comprevseq = reverse $_;
                        $comprevseq = ~ tr/[A,T,C,G,a,t,c,g,]/ [T,A,G,C,t,a,g,c]/; 
                        my $frameseq = substr ($comprevseq, abs($f)-1, $m-abs($f)+1);

                        print "$frameseq\n";
                        }
                                    }
        }
close "3mo_m10_clean_1_paired_assembly_linear_seq.fasta";

}

I got the following error and don't know hot to fix it.

Use of uninitialized value $_ in scalar chomp at 3mo_10_clean_1_paire_assembly_linear_seq_6frame.pl line 18.
 Use of uninitialized value $_ in pattern match (m//) at 3mo_10_clean_1_paire_assembly_linear_seq_6frame.pl line 21.
 Use of uninitialized value $m in subtraction (-) at 3mo_10_clean_1_paire_assembly_linear_seq_6frame.pl line 24.
 Use of uninitialized value $_ in substr at 3mo_10_clean_1_paire_assembly_linear_seq_6frame.pl line 24.
dna translation perl • 1.6k views
ADD COMMENTlink modified 3.5 years ago by SES8.2k • written 3.5 years ago by tcf.hcdg60
2
gravatar for SES
3.5 years ago by
SES8.2k
Vancouver, BC
SES8.2k wrote:

There are a number of typos/errors in the code but I'll just mention the major problems. The main issue is that you are not using open correctly. Generally speaking, to read a file you open a filehandle and read from it, then you close the filehandle (not the file). Most importantly, you need to specify the mode and use a lexical filehandle (I'll explain why). It is also good to test that these tasks succeeded. Here's how you open a file:

use strict;
use warnings;
use autodie;

open my $fh, '<', $file;

The autodie pragma (not required) allows you avoid typing "or die .." everywhere and the other pragmas are best practices (they are implicit in new Perl versions, strictures anyway) because we all make mistakes and Perl doesn't care. Then you read a line as Emily suggested (while (<$fh>) { ... }) and later close the filehandle:

close $fh;

If you forget to close the filehandle or try to declare the same filehandle later, no problem. Perl will halt and tell you that's not allowed. If you do this for reading a file:

open FH, $file;

and later do this print FH "some $data" anywhere in the code, your data is gone! Okay, technically Perl won't let you do that if you opened a file for reading (though I believe it once was possible), but even if you specify the mode you can still use that bare filehandle from anywhere because it has global scope. So, there is no reason to ever use a bare filehandle.

The other issue is that you are declaring a subroutine in a loop. This doesn't need to be seen before it is used so the common practice is to put that at the end of the file.

Unless this is an assignment, I would use a toolkit for translating the sequences so you can keep learning and practicing coding with smaller problems. Here is an example using Perl and EMBOSS:

use strict;
use warnings;
use Bio::Factory::EMBOSS;

my $usage   = "$0 infile outfile";
my $infile  = shift or die $usage;
my $outfile = shift or die $usage;

my $factory = Bio::Factory::EMBOSS->new;
my $sixpack = $factory->program('sixpack'); 

$sixpack->run({-sequence => $infile, -outseq => $outfile });

You can extend that code with what I mentioned above by opening the output and looking at the results, or whatever you need to do. If you want to implement your own solution, I would recommend looking at the BioPerl code since it has been thoroughly tested.

ADD COMMENTlink written 3.5 years ago by SES8.2k
1
gravatar for Emily_Ensembl
3.5 years ago by
Emily_Ensembl19k
EMBL-EBI
Emily_Ensembl19k wrote:

There's something up with your while loop. It can't find the scalar $_ because the while isn't working right. I would usually open my file by giving it a file handle eg:

open (INPUT,  "3mo_m10_clean_1_paired_assembly_linear_seq.fasta"), $ARGV[0];

Then define my while as:

while (<INPUT>){ #do stuff }

That will move it through line by line. I've never tried putting the filename directly into the loop – presumably you can but you'd still need to put inside < and > to get it to go through line by line.

ADD COMMENTlink modified 3.5 years ago • written 3.5 years ago by Emily_Ensembl19k

I tried as you mentioned using file handle but it is giving following error:

Name "main::INPUT" used only once: possible typo at 3mo_10_clean_1_paire_assembly_linear_seq_6frame.pl line 15.
Unknown open() mode '3mo_m10_clean_1_paired_assembly_linear_seq.fasta' at 3mo_10_clean_1_paire_assembly_linear_seq_6frame.pl line 15.
ADD REPLYlink written 3.5 years ago by tcf.hcdg60

I made an error in my code. Fixed it in the original post

ADD REPLYlink written 3.5 years ago by Emily_Ensembl19k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1963 users visited in the last hour