Question: Explanation of a Perl script
14 months ago, uuuiii647 wrote:

Hi! :)

I'm new here and I'm also just getting started with bioinformatics. I found this Perl script and I don't really understand all of its steps, particularly near the end. Could someone explain it to me, maybe telling me why programmers used these ways to write the script and not others?

I know this may not be a really good question, but I'd really like to understand this script and I don't know how to go about it...

Thank you very much, and I hope you can understand my English!

    $file=shift;

    print "Window length\n";
    $kmer=<STDIN>;
    chomp ($kmer);

    print "Minimum quality score cut-off\n";
    $cut_off=<STDIN>;
    chomp ($cut_off);

    open (MYFILE, ">R.txt");

    if (open(FASTQ,$file))
    {
        while($header1=<FASTQ>)
        {
            $dna=<FASTQ>;
            $header2=<FASTQ>;
            $qual=<FASTQ>;
            @dna=split ('', $dna);
            @qual=split ('', $qual);
            @num_value=();
            @scores=();

            foreach $qscore (@qual)
            {
                $num_value=ord($qscore)-33;
                push(@num_value, $num_value);
            }

            foreach $value (@num_value)
            {
                if ($value<$cut_off){
                    pop(@qual);
                }else{
                    last;
                }
            }
            $sub1=substr($dna,-$#qual);
            $sub11=substr($qual,-$#qual);

            @qscopy=reverse @num_value;
            foreach $value (@qscopy)
            {
                if ($value<$cut_off){
                    pop(@qual);
                }else{
                    last;
                }
            }
            $sub2=substr($sub1,0,$#qual);
            $sub22=substr($sub11,0,$#qual);

            for ($i=0;$i<=$#qual-($kmer-1);$i++)
            {
                @scores=@num_value[$i..$i+$kmer-1];

                $sum=0;
                foreach $score (@scores)
                {
                    $sum+=$score;
                }
                if (($sum/$kmer)<$cut_off){
                    last;
                }
            }
            $sub3=substr($sub2,0,($i-1));
            $sub33=substr($sub22,0,($i-1));

            print MYFILE "$header1$sub3\n$header2$sub33\n\n";
        }
    }else{
        print "Error!\n";
    }

    close MYFILE;
fastq • 528 views
modified 14 months ago by genomax (65k) • written 14 months ago by uuuiii647 (0)

why programmers used these ways to write the script and not others?

For Perl, the reasons may vary: a stylistic decision, personal preference, performance, ease of understanding, or just because you can. One of Perl's mottos is "There's more than one way to do it."
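
As a small illustration of that motto, here is a sketch of three interchangeable ways to add up a list of quality scores. The first mirrors the `foreach` loop in the script from the question; the other two are alternatives, not what the original authors wrote. The scores are made up for the example.

```perl
use strict;
use warnings;
use List::Util qw(sum);   # core module shipped with Perl

my @scores = (30, 32, 28, 35);   # made-up quality scores for illustration

# Way 1: an explicit foreach loop, as in the script in the question
my $sum_loop = 0;
foreach my $score (@scores) {
    $sum_loop += $score;
}

# Way 2: the sum function from List::Util
my $sum_util = sum(@scores);

# Way 3: the same loop written with a statement modifier
my $sum_mod = 0;
$sum_mod += $_ for @scores;

print "$sum_loop $sum_util $sum_mod\n";   # prints "125 125 125"
```

All three give the same result, so which one to use is mostly a matter of taste and readability.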

modified 14 months ago • written 14 months ago by h.mon (24k)

me why programmers used these ways to write the script and not others?

I wonder the same. Why didn't they use Python? < /just kidding >

modified 14 months ago • written 14 months ago by WouterDeCoster (38k)

14 months ago, Macspider (2.8k, Vienna - BOKU) wrote:

Hi! Since you're new to bioinformatics, I suppose this will be your first encounter with:

https://en.wikipedia.org/wiki/RTFM

Jokes aside, you can look up every function and command in the official Perl documentation (http://perldoc.perl.org/), or by googling the command together with the word "perl".

You will most likely find another post here on Biostars or on Stack Overflow (https://stackoverflow.com/) where people have already asked the same thing, and most of the time got roasted for it :)

What you're asking is very general and not really related to bioinformatics, but rather to Perl itself. These are things you can work out by searching for the answers on your own.

If, instead, you have a question about a specific command that is giving you trouble with biological data, this is the place!

You'll find nice and juicy tutorials at:

modified 14 months ago • written 14 months ago by Macspider (2.8k)

And the other thing the OP will need to know is the structure and contents of a FastQ file (Description here).
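
For orientation, a FASTQ record is four lines: a header starting with `@`, the bases, a separator line starting with `+`, and a quality string with one character per base. The following is a minimal sketch, not the original script: it reads one made-up record from an in-memory string (an assumption made to keep the example self-contained) and decodes the quality characters with the same `ord(...) - 33` conversion the script uses, which assumes Phred+33 encoding.

```perl
use strict;
use warnings;

# A made-up FASTQ record (four lines), used here instead of a real file
my $record = "\@read1\nACGT\n+\nII5!\n";

open(my $fh, '<', \$record) or die "Cannot open in-memory record: $!";
chomp(my $header1 = <$fh>);   # line 1: read identifier, starts with '@'
chomp(my $dna     = <$fh>);   # line 2: the bases
chomp(my $header2 = <$fh>);   # line 3: separator, starts with '+'
chomp(my $qual    = <$fh>);   # line 4: one quality character per base
close($fh);

# Decode each quality character to a number, assuming Phred+33
# (the same ord - 33 conversion as in the script in the question)
my @num_value = map { ord($_) - 33 } split(//, $qual);

print "$dna\n";          # prints "ACGT"
print "@num_value\n";    # prints "40 40 20 0"
```

Note that this sketch calls `chomp` on each line, whereas the script in the question does not, so there the trailing newline ends up in `@qual` and is decoded as if it were a quality character.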

written 14 months ago by Dan Gaston (7.1k)