how to glob a dir into bioperl's clustal.pm?
6.8 years ago
qwerty • 0

Hi, I would like to align several thousand pairs of protein sequences separately using bioperl's clustal.pm. Each pair is in its own file in one directory, and I would like to write code that feeds each file into clustal.pm and writes a separate alignment for each file. The code below works when using Bio::SeqIO but isn't working for alignments. Any help greatly appreciated.

#!/usr/bin/perl -w

use Bio::Tools::Run::Alignment::Clustalw;
use warnings;

my $dir = './genes';
foreach my $fp (glob("$dir/*.fa")) {
    open my $fh, "<", $fp or die;
}
@params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'outfile' => '$fp.pep');
$factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
$inputfilename = '$fp.fasta.fa';
$aln = $factory->align($inputfilename);
bioperl perl clustal glob • 1.7k views
6.8 years ago
SES 8.5k

It is obvious what you are trying to do, but I don't know what you mean by "works with Bio::SeqIO", because it would take a complete rewrite for this code to do anything with Bio::SeqIO. Anyway, there are two main issues. First, you are using a lexical variable ($fp) outside the scope in which it is visible: it only exists inside the foreach block. Second, you are trying to interpolate a variable into a single-quoted string; variable interpolation requires double quotes. As an aside, the "-w" is not necessary when you have use warnings;, and both use strict; and use warnings; should be at the top of your script to help catch errors. Here is how I would approach the problem (untested):
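
A minimal sketch along those lines, not tested against a real ClustalW install. It assumes the input files end in .fa, that each alignment should be written next to its input as "<input>.pep", and that the clustalw executable is where BioPerl can find it. The sub names (fasta_files, outfile_for, align_file, run) are made up for illustration. The two fixes are that everything using $fp stays inside the loop, and that the output name is built with double quotes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Return the full path of every regular *.fa file under $dir, sorted.
# Returns an empty list if $dir is not a directory.
sub fasta_files {
    my ($dir) = @_;
    return () unless -d $dir;
    my @files;
    find(sub { push @files, $File::Find::name if -f $_ && /\.fa\z/ }, $dir);
    my @sorted = sort @files;
    return @sorted;
}

# Double quotes, so $fp is interpolated: './genes/a.fa' -> './genes/a.fa.pep'.
# (Naming scheme is an assumption based on the question.)
sub outfile_for {
    my ($fp) = @_;
    return "$fp.pep";
}

# Align one file with ClustalW and return the alignment object.
sub align_file {
    my ($fp) = @_;
    # Loaded on first use so the helpers above work without BioPerl installed.
    require Bio::Tools::Run::Alignment::Clustalw;
    my @params = (ktuple => 2, matrix => 'BLOSUM', outfile => outfile_for($fp));
    my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);
    return $factory->align($fp);
}

# $fp is only visible inside this loop, so all per-file work happens here.
sub run {
    my ($dir) = @_;
    for my $fp (fasta_files($dir)) {
        my $aln = align_file($fp);
        print "$fp -> ", outfile_for($fp), "\n";
    }
    return;
}

run('./genes') unless caller;
```

Because File::Find descends into subdirectories, this also picks up .fa files nested below ./genes, which a plain glob("$dir/*.fa") would not.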

The glob() function is portable and would also work fine. I favor File::Find because it allows for file tests, it returns the full path (as written), and it is amazingly fast. If you want to do anything with the $aln object, you'll need to import another class like Bio::AlignIO.
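
For comparison, here is a rough, untested sketch of the glob() variant that also does something with $aln through Bio::AlignIO. The sub names, the ".aln" output suffix and the choice of the 'clustalw' output format are assumptions for illustration. It also assumes the directory name contains no whitespace, since glob() splits its pattern on spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Non-recursive alternative to File::Find: only *.fa files directly in $dir.
sub fasta_files_glob {
    my ($dir) = @_;
    my @files = glob("$dir/*.fa");
    return @files;
}

# Write a Bio::SimpleAlign object to $outfile in the given format.
sub write_alignment {
    my ($aln, $outfile, $format) = @_;
    require Bio::AlignIO;    # loaded only when an alignment is written
    my $out = Bio::AlignIO->new(-file => ">$outfile", -format => $format);
    $out->write_aln($aln);
    return $outfile;
}

sub run {
    my ($dir) = @_;
    my @files = fasta_files_glob($dir);
    return unless @files;

    require Bio::Tools::Run::Alignment::Clustalw;
    my $factory = Bio::Tools::Run::Alignment::Clustalw->new(
        ktuple => 2,
        matrix => 'BLOSUM',
    );

    for my $fp (@files) {
        my $aln = $factory->align($fp);
        # percentage_identity is a Bio::SimpleAlign method.
        printf "%s: %.1f%% identity\n", $fp, $aln->percentage_identity;
        write_alignment($aln, "$fp.aln", 'clustalw');
    }
    return;
}

run('./genes') unless caller;
```

One factory object is reused for all files here because no per-file outfile parameter is set; the output is written explicitly through Bio::AlignIO instead.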