Counting Gaps In A Column Of Alignment
1
1
Entering edit mode
10.8 years ago
sksweety24 ▴ 10

Hi ,

I am a new bioperl user. I made a script to count the nucleotide composition in each column of the alignment. I am able to get the count for the nucleotides, but slice doesn't allow the count of gaps as it seems to exclude them. I would greatly appreciate any suggestions to solve this.

#!/bin/perl -w
use strict;
use warnings;

use List::Util 'max';
use Bio::SimpleAlign; 
use Bio::Align::AlignI;
use Bio::AlignIO;
use Bio::SeqIO;

my $in= Bio::AlignIO->new( -file => "seq.fst", -format => "fasta");

my $align = $in->next_aln();

print "column\tA's\tT'st\C's\tG's\n";

for (my $i = 1; $i <= $align->length; $i++) {
    my %count;
    my $seqs = $align->slice($i,$i);
    my $gap_char = $seqs->gap_char();       
    my $count_A=0;
    my $count_C=0;
    my $count_T=0;
    my $count_G=0;
    my $count_N=0;
    my $count_gap=0;
    foreach my $seq ($seqs->each_seq) {
        my $col=$seq->seq;
        if ($col eq 'A') {
            $count_A++;
        } elsif ($col eq 'C'){
            $count_C++;
        } elsif ($col eq 'T'){
            $count_T++;
        } elsif ($col eq 'G'){
            $count_G++;
        } elsif ( $col eq 'N'){
            $count_N++;
        } elsif ($col =~ m/^\Q$gap_char\E$/){
            $count_gap++;
        }
        $count{$seq->seq} += 1;
    }
    print"$i\t$count_A\t$count_T\t$count_C\t$count_G\n";
}

I get a warning like this: MSG: Slice [5-5] of sequence [3/1-6] contains no residues. Sequence excluded from the new alignment". and doesn't seem to count the gaps in the column.

alignment • 3.5k views
ADD COMMENT
1
Entering edit mode

Do you mean that the gap counting itself is not working or the result is not being printed out? You don't appear to be printing out the gap count anywhere in your code, so if it's not being printed out I would imagine that's the problem.

ADD REPLY
0
Entering edit mode
10.8 years ago
SimonCB765 ▴ 150

My Perl/BioPerl is a bit rusty, but I'll take a stab at answering this. The warning seems pretty straightforward. If you check the docs for slice it says that "Sequences with no residues in the slice are excluded from the new alignment and a warning is printed", which I imagine explains the warnings that you are getting. Assuming that slice works in the way that I'm understanding it to, then every sequence with a gap in it would be excluded from your alignment slice, and hence you would get 0 gaps.

ADD COMMENT

Login before adding your answer.

Traffic: 1793 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6