Find A Substring And Count The Number Of Occurrences
4
0
10.7 years ago
Elena ▴ 240

The program should find given substrings in a string and count their number of occurrences. Moreover, occurrences should be counted only in consecutive, non-overlapping groups of three letters (codons).

For example: String: AGAUUUAGA

output:

AGA-2 UUU-1

print"Enter the mRNA Sequence\n";
$count = 0;$count1 = 0;
$seq = <>; chomp($seq);
$p = '';$ln = length($seq);$j = $ln/3; for($i=0,$k=0;$i<$ln,$k<$j;$k++)
{
$fra[$k]=substr($seq,$i,3);
$i=$i+3;

if({$fra[$k]} eq AGA)
{
$count++; print"The number of AGA is$count";
}
elseif({$fra[$k]} eq UUU)
{
$count1++; print" The number of UUU is$count1";
}

}

perl homework • 5.9k views
1

downvote - it's tagged as homework and not a question.

1

@Casbon Please note that the tags can be edited by someone other than the OP, as is the case here. The homework tag was added by another user with admin rights. I agree that Payaliya should make a greater effort when posting questions.

0

Which error are you getting? What is the expected output?

0

I have reformatted the code to make it easier to read on this website. Please use the "Code block" button when posting code. I have also improved the spelling of your question and the title; please check whether it is OK.

0

What is the question?

0

Payaliya, please stop editing the title of the question when you think it should be [closed]. Questions are not 'closed' just because there is a single acceptable answer; they are closed for being off-topic, duplicated, or not a real question!

5
10.7 years ago

Hi.

Here is your version of the code, corrected so that it works:

print "Enter the mRNA Sequence\n";
$count = 0;
$count1 = 0;
$seq = <>;
chomp($seq);
$p = '';
$ln = length($seq);
$j = $ln/3;
for ($i = 0, $k = 0; $i < $ln, $k < $j; $k++)
{
    # Take the next group of three letters.
    $fra[$k] = substr($seq, $i, 3);
    $i = $i + 3;
    # Debug output: the braces in the second print are printed literally.
    print "$k -> @fra\n";
    print "$k -> {$fra[$k]}\n";
    if ($fra[$k] eq 'AGA')
    {
        $count++;
        print "The number of AGA is $count\n";
    }
    elsif ($fra[$k] eq 'UUU')
    {
        $count1++;
        print "The number of UUU is $count1\n";
    }
}


With some printing here and there to show you what is going on.

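The main problem was that you compared {$fra[$k]} with UUU. The braces are not part of the variable: interpolated in a string, {$fra[$k]} gives {UUU} (including braces), as the second debug print shows. Write $fra[$k] eq 'UUU' instead, with the codon in quotes, and note that the keyword is elsif, not elseif.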
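To see the effect of the braces in isolation, here is a minimal sketch. The array contents and the variable names $with_braces and $plain are made up for illustration; only the interpolation behaviour matters.

```perl
use strict;
use warnings;

# Illustrative data only: a single codon stored the way the loop stores it.
my @fra = ('UUU');
my $k   = 0;

# Inside a double-quoted string the braces are ordinary characters.
my $with_braces = "{$fra[$k]}";
my $plain       = "$fra[$k]";

print "with braces: $with_braces\n";   # prints: with braces: {UUU}
print "plain:       $plain\n";         # prints: plain:       UUU

# Only the plain value compares equal to the codon.
print "match\n" if $plain eq 'UUU';
print "no match with braces\n" if $with_braces ne 'UUU';
```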

Here is a shorter version, possibly easier to read:

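use strict;
use warnings;

print "Enter the mRNA Sequence\n";

my $seq = <>;
chomp($seq);
my %freq;

# Match consecutive, non-overlapping groups of three characters.
# (split with a capturing pattern would also return empty strings.)
my @codons = $seq =~ /(\w{3})/g;
foreach my $thisCodon (@codons) {
    $freq{$thisCodon}++;
}

# Fall back to 0 when a codon does not occur at all.
printf "AGA found %d times\n", $freq{'AGA'} || 0;
printf "UUU found %d times\n", $freq{'UUU'} || 0;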

1
10.7 years ago

Hi there,

BioPerl has the Bio::Tools::SeqWords module, which already implements a solution to this problem for an arbitrary word size. It can handle overlapping words too. Just check the code to see how it works.

0
10.7 years ago
Phis ★ 1.1k

The program should find the substrings in a given string and count their number of occurrences.

I don't know whether that's relevant to you at all, but if this is for speed/efficiency and/or must handle lots of data and/or you need repeated lookups, I'd recommend using a suffix array approach to substring searching. There is obviously a cost to building the suffix array in the first place, but once you have it, locating substrings is very fast.
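As a rough illustration of the idea, here is a minimal sketch in plain Perl. The function names build_suffix_array and count_occurrences are hypothetical, and the construction is the naive sort-all-suffixes approach, which is only reasonable for short sequences; real tools use much faster construction algorithms. Note that it counts matches at every position, not only in the codon frame.

```perl
use strict;
use warnings;

# Naive suffix array: start positions of all suffixes, sorted lexicographically.
# (Hypothetical helper; fine for short strings, too slow for genomes.)
sub build_suffix_array {
    my ($text) = @_;
    return [ sort { substr($text, $a) cmp substr($text, $b) }
             0 .. length($text) - 1 ];
}

# Count occurrences of $pattern with two binary searches over the suffix array.
sub count_occurrences {
    my ($text, $sa, $pattern) = @_;
    my $m = length $pattern;

    # First suffix whose length-$m prefix is >= $pattern.
    my ($lo, $hi) = (0, scalar @$sa);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ((substr($text, $sa->[$mid], $m) cmp $pattern) < 0) { $lo = $mid + 1 }
        else                                                   { $hi = $mid }
    }
    my $first = $lo;

    # First suffix whose length-$m prefix is > $pattern.
    $hi = scalar @$sa;
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ((substr($text, $sa->[$mid], $m) cmp $pattern) <= 0) { $lo = $mid + 1 }
        else                                                    { $hi = $mid }
    }
    return $lo - $first;
}

my $text = "AGAUUUAGA";
my $sa   = build_suffix_array($text);   # built once, reused for every lookup
print "AGA: ", count_occurrences($text, $sa, "AGA"), "\n";   # AGA: 2
print "UUU: ", count_occurrences($text, $sa, "UUU"), "\n";   # UUU: 1
```

Each lookup is a binary search over the sorted suffixes, so the one-time cost of building the array pays off when many different substrings are queried against the same sequence.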

0
8.0 years ago
Woa ★ 2.9k

This includes overlapping codons:

use strict;
use warnings;
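my $string = "AGAUUUAGA";
# The lookahead matches at every position without consuming characters,
# so overlapping trinucleotides are captured too.
my @trinucs = ($string =~ /(?=(.{3}))/g);
my %tri_count = ();
$tri_count{$_}++ for @trinucs;
print $_, ":", $tri_count{$_}, "\n" for sort keys(%tri_count);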


Output is:

AGA:2
AUU:1
GAU:1
UAG:1
UUA:1
UUU:1
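The overlapping count above and the in-frame count the original question asks for differ only in how far the window moves each time. A small sketch, assuming a hypothetical helper count_words (not from any of the answers) that takes the word length and the step size:

```perl
use strict;
use warnings;

# Hypothetical helper: count words of length $k, moving $step characters
# between windows. $step == $k gives in-frame codons, $step == 1 gives
# all overlapping words.
sub count_words {
    my ($seq, $k, $step) = @_;
    my %count;
    for (my $i = 0; $i + $k <= length $seq; $i += $step) {
        $count{ substr($seq, $i, $k) }++;
    }
    return \%count;
}

my $in_frame    = count_words("AGAUUUAGA", 3, 3);
my $overlapping = count_words("AGAUUUAGA", 3, 1);

# In-frame counts, in the format used in the question.
print "$_-$in_frame->{$_}\n" for sort keys %$in_frame;
# prints:
# AGA-2
# UUU-1

# Overlapping counts, in the format used in the answer above.
print "$_:$overlapping->{$_}\n" for sort keys %$overlapping;
```

With a step of 1, the second loop should reproduce the six-line output listed above.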