Question: Finding GC and AT Percentages in a Sequence
1
Divya70 wrote:

I have written this program to calculate the GC and AT content of a given sequence, but I am unable to make it case-insensitive.

``````
print"Enter the fasta file name \n";
$a=<>;
chomp($a);

unless(open(FH,$a)){
print"Error";
}

@a=<FH>;
close FH;
$a=join('',@a);
$a=~s /\s//g;
$n=length($a);
print "$n";
$a1=0;
$t=0;
$g=0;
$c=0;
$a1=($a=~tr/A//ig);
$t=($a=~tr/T//ig);
$g=($a=~tr/G//ig);
$c=($a=~tr/C//ig);
#$e=$n-($a+$t+$g+$c);
print "The counts are\n";
print"A:$a1 \n";
print"T:$t \n";
print"G:$g \n";
print"C:$c \n";
#print"E:$e \n";
$op = (($a1+$t)/$n)*100;
$op1 = (($g+$c)/$n)*100;
print " GC%= $op\n";
print " AT% = $op1";
``````
perl gc sequence • 3.5k views
modified 9.9 years ago • written 9.9 years ago by Divya70

this looks as if it works

Even then, I get the error "Bareword found where operator expected" at the ~tr/T... lines.

3
biobot 0.0.77.a.10996.1k wrote:

There are syntax errors in the `tr` lines that prevent this from working at all: `tr` does not accept the `i` or `g` modifiers. Perhaps you mean, for example:

``````
$a =~ s/A//ig
``````
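
If you would rather keep `tr`, one possible sketch is to list both cases in the search list, since `tr` has no case-insensitive modifier. With an empty replacement list and no `/d`, `tr` only counts and leaves the string untouched. The sub name `count_bases` and the test string are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: count each base, ignoring case.
# tr/// takes no /i or /g modifier, so both cases are listed explicitly.
# With an empty replacement list and no /d, tr only counts matches.
sub count_bases {
    my ($seq) = @_;
    my %count = (
        A => ($seq =~ tr/Aa//),
        T => ($seq =~ tr/Tt//),
        G => ($seq =~ tr/Gg//),
        C => ($seq =~ tr/Cc//),
    );
    return %count;
}

my %count = count_bases('ACGTacgtNNgg');
print "$_: $count{$_}\n" for qw(A T G C);
```

Unlike the `s///` form, this does not delete the counted bases from the sequence, so the string can still be used afterwards.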

Your FASTA parser is broken because it reads the header line as if it were sequence.
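
A minimal sketch of skipping the header, assuming a single-record file where header lines start with `>`. The sub name `sequence_from_fasta_lines` is hypothetical, and the lines are hard-coded here instead of being read from a file handle:

```perl
use strict;
use warnings;

# Hypothetical helper: build the sequence from a list of FASTA lines.
# Header lines (starting with '>') are skipped and whitespace is removed.
# Assumption: one record per file; several records would be merged together.
sub sequence_from_fasta_lines {
    my @lines = @_;
    my $seq = '';
    for my $line (@lines) {
        next if $line =~ /^>/;   # header line, not sequence
        $line =~ s/\s+//g;       # strip newlines and any spaces
        $seq .= $line;
    }
    return $seq;
}

my @lines = (">seq1 example\n", "ACGT\n", "acgt\n");
print sequence_from_fasta_lines(@lines), "\n";
```

In the original program the same sub could be called on the array read from the file handle in place of the `join`.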

Also, if you are learning, you should put `use strict` at the top, until you understand when not to `use strict`.
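
As a rough illustration of what that looks like (the variable names and the sequence are invented for the example), every variable is declared with `my` before use:

```perl
use strict;
use warnings;

# Under strict, variables must be declared with 'my' before use.
# Assumption: $sequence is a stand-in for the data read from the file.
my $sequence = "ACGTTGCA";
my $length   = length $sequence;

print "Length: $length\n";

# A misspelled name such as $lenght would now stop compilation
# instead of silently evaluating to an empty value.
```

Note that `$a` and `$b` are exempt from the `strict` variable check because `sort` uses them, so a typo involving them would not be caught.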

2
Jorge Amigo12k wrote:

Isn't your `$op` variable storing the AT content, and `$op1` the GC content? I think you are printing them the wrong way round.

As general advice, I would choose better variable names. Using `$a` for the data variable is slightly confusing and forces you to use `$a1` for the A count (given `$c`, `$g` and `$t`, the A count should be `$a` so that the code reads naturally), and using numbers in variable names is not advisable. I would rename it to a simpler `$data`, and if case sensitivity is a problem I would lower-case or upper-case `$data` once (using `lc($data)` or `uc($data)` as desired) so that the rest of the code can ignore case, without having to specify the `i` option for pattern matching.
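
Putting those suggestions together, a sketch could look like the following. The sub name `gc_at_percent` and the sample sequence are hypothetical, and it assumes the percentages are taken over the full sequence length, so `N` or other symbols lower both values:

```perl
use strict;
use warnings;

# Hypothetical helper: return (GC%, AT%) for a sequence, ignoring case.
# Assumption: percentages are relative to the full length;
# an empty sequence yields (0, 0) to avoid dividing by zero.
sub gc_at_percent {
    my ($data) = @_;
    $data = uc $data;                      # normalise case once
    my $length = length $data;
    return (0, 0) if $length == 0;
    my $gc_count = ($data =~ tr/GC//);     # empty replacement: count only
    my $at_count = ($data =~ tr/AT//);
    return (100 * $gc_count / $length, 100 * $at_count / $length);
}

my ($gc, $at) = gc_at_percent('ACGTacgtGG');
printf "GC%% = %.1f\n", $gc;
printf "AT%% = %.1f\n", $at;
```

Naming the results `$gc` and `$at` also makes it harder to print them under the wrong label.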