Question

Explanation of a Perl script


6.2 years ago

uuuiii647 • 0

Hi! :)

I'm new here and I'm also taking my first steps in bioinformatics. I found this Perl script and I don't really understand all of its steps, particularly the ones at the end. Could someone explain it to me, maybe telling me why programmers used these ways to write the script and not others?

I know this may not be a very good question, but I'd really like to understand this script and I don't know how to go about it...

Thank you very much, and I hope you can understand my English!

    $file=shift;

    print "Window length\n";
    $kmer=<STDIN>;
    chomp ($kmer);

    print "Minimum quality score cut-off\n";
    $cut_off=<STDIN>;
    chomp ($cut_off);

    open (MYFILE, ">R.txt");

    if (open(FASTQ,$file))
    {
        while($header1=<FASTQ>)
        {
            $dna=<FASTQ>;
            $header2=<FASTQ>;
            $qual=<FASTQ>;
            @dna=split ('', $dna);
            @qual=split ('', $qual);
            @num_value=();
            @scores=();

            foreach $qscore (@qual)
            {
                $num_value=ord($qscore)-33;
                push(@num_value, $num_value);
            }

            foreach $value (@num_value)
            {
                if ($value<$cut_off){
                    pop(@qual);
                }else{
                    last;
                }
            }
            $sub1=substr($dna,-$#qual);
            $sub11=substr($qual,-$#qual);

            @qscopy=reverse @num_value;
            foreach $value (@qscopy)
            {
                if ($value<$cut_off){
                    pop(@qual);
                }else{
                    last;
                }
            }
            $sub2=substr($sub1,0,$#qual);
            $sub22=substr($sub11,0,$#qual);

            for ($i=0;$i<=$#qual-($kmer-1);$i++)
            {
                @scores=@num_value[$i..$i+$kmer-1];

                $sum=0;
                foreach $score (@scores)
                {
                    $sum+=$score;
                }
                if (($sum/$kmer)<$cut_off){
                    last;
                }
            }
            $sub3=substr($sub2,0,($i-1));
            $sub33=substr($sub22,0,($i-1));

            print MYFILE "$header1$sub3\n$header2$sub33\n\n";
        }
    }else{
        print "Error!\n";
    }

    close MYFILE;

fastq • 1.6k views

updated 6.2 years ago by GenoMax 141k • written 6.2 years ago by uuuiii647 • 0


why programmers used these ways to write the script and not others?

For Perl, the reasons may vary: stylistic decisions, personal preference, performance, ease of understanding, or just because you can. One of Perl's mottos is "There's more than one way to do it".
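To make the motto concrete, here is a small sketch (my own, not taken from the script's authors) of three interchangeable ways to turn a FASTQ quality string into numeric scores. The first mirrors the foreach/push loop in the question; the map and unpack variants are alternatives. The sub names and the sample string 'II5!' are invented for illustration, and Phred+33 encoding is assumed, as in the script's ord($qscore)-33.

```perl
use strict;
use warnings;

# Way 1: explicit loop with push, the style used in the question's script.
sub phred_loop {
    my ($qual) = @_;
    my @scores;
    foreach my $char (split //, $qual) {
        push @scores, ord($char) - 33;    # assumes Phred+33 encoding
    }
    return @scores;
}

# Way 2: map builds the list in a single expression.
sub phred_map {
    my ($qual) = @_;
    return map { ord($_) - 33 } split //, $qual;
}

# Way 3: unpack reads the character codes directly, without split.
sub phred_unpack {
    my ($qual) = @_;
    return map { $_ - 33 } unpack 'C*', $qual;
}

# 'I' is ASCII 73, '5' is 53 and '!' is 33.
print join(' ', phred_loop('II5!')),   "\n";   # prints "40 40 20 0"
print join(' ', phred_map('II5!')),    "\n";   # same result
print join(' ', phred_unpack('II5!')), "\n";   # same result
```

All three return the same list. Which one you pick is mostly a matter of taste: the loop is the easiest to read for a beginner, while map is what many experienced Perl programmers would reach for.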

6.2 years ago by h.mon 35k


why programmers used these ways to write the script and not others?

I wonder the same. Why didn't they use Python? < /just kidding >

6.2 years ago by WouterDeCoster 47k

score 5 · Answer 1 · 2018-02-06

Hi! Since you're new to bioinformatics, I suppose this will be your first encounter with:

https://en.wikipedia.org/wiki/RTFM

Jokes aside, you can look up every function and command in the official Perl documentation (http://perldoc.perl.org/), or google the command together with the word "perl".

You will most likely find another post here on Biostars or on Stack Overflow (https://stackoverflow.com/) where people have already asked the same thing, and most of the time got roasted for it :)

What you're asking is very general and not really related to bioinformatics, but rather to Perl itself. These are things you can solve by searching for the answers on your own.

If, instead, you have a question about one command that is giving you trouble with biological data: this is the place!

print: http://perldoc.perl.org/functions/print.html
while: https://www.tutorialspoint.com/perl/perl_while_loop.htm
foreach: https://www.tutorialspoint.com/perl/perl_foreach_loop.htm
substr: http://perldoc.perl.org/functions/substr.html
chomp: https://perldoc.perl.org/functions/chomp.html
split: https://perldoc.perl.org/functions/split.html
pop: http://perldoc.perl.org/functions/pop.html
close: http://perldoc.perl.org/functions/close.html
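To see most of the functions listed above working together, here is a minimal sketch that trims low-quality bases from the 3' end of a single read. It is not a corrected version of the script in the question, only an illustration: the record in $fastq, the sub name trim_3prime and the cut-off of 20 are all invented, and Phred+33 encoding is assumed. Unlike the script in the question, it calls chomp on each line before splitting, so the trailing newline is never treated as a quality character.

```perl
use strict;
use warnings;

# Hypothetical one-record FASTQ text, invented for illustration.
my $fastq = "\@read1\nACGTACGT\n+\nIIIIII##\n";

# Trim low-quality bases from the 3' end of one read.
# $cut_off is the minimum score to keep (Phred+33 encoding assumed).
sub trim_3prime {
    my ($dna, $qual, $cut_off) = @_;
    my @qual = split //, $qual;           # split: string -> list of characters
    # pop: drop trailing characters while their score is below the cut-off
    while (@qual && ord($qual[-1]) - 33 < $cut_off) {
        pop @qual;
    }
    my $keep = scalar @qual;              # number of characters that survived
    # substr: keep the first $keep characters of each string
    return (substr($dna, 0, $keep), substr($qual, 0, $keep));
}

# Read the record through a filehandle opened on the in-memory string.
open(my $in, '<', \$fastq) or die "Cannot open in-memory FASTQ: $!";
while (my $header1 = <$in>) {             # while: one four-line record per pass
    my $dna     = <$in>;
    my $header2 = <$in>;
    my $qual    = <$in>;
    chomp($header1, $dna, $header2, $qual);    # chomp: strip trailing newlines
    my ($trimmed_dna, $trimmed_qual) = trim_3prime($dna, $qual, 20);
    print "$header1\n$trimmed_dna\n$header2\n$trimmed_qual\n";
}
close($in);
```

In the sample record, '#' decodes to a score of 2 and 'I' to 40, so with a cut-off of 20 the last two bases are removed and the read is printed as ACGTAC with the quality string IIIIII.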

You'll find nice and juicy tutorials at: