how to glob a dir into bioperl's
6.8 years ago
qwerty

Hi, I would like to separately align several thousand pairs of protein sequences using bioperl's - I place each file in a directory and would like to write code to feed each file into and output a separate alignment for each file. The code below works when using Bio::SeqIO but isn't working for alignments. Any help greatly appreciated.

#!/usr/bin/perl -w

use Bio::Tools::Run::Alignment::Clustalw;
use warnings;

my $dir = './genes';
foreach my $fp (glob("$dir/*.fa")) {
open my $fh, "<", $fp or die;

@params = ('ktuple' => 2, 'matrix' => 'BLOSUM', 'outfile' => '$fp.pep');
$factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

$inputfilename = '$fp.fasta.fa';
$aln = $factory->align($inputfilename); 
6.8 years ago
SES

It is clear what you are trying to do, but I don't know what you mean by "works with Bio::SeqIO", because it would take a complete rewrite for this code to do anything with Bio::SeqIO. Anyway, there are two main issues. First, you are trying to use a lexical variable outside the scope in which it is visible. Second, you are trying to interpolate a variable into a single-quoted string; variable interpolation requires double quotes. As an aside, the "-w" switch is not necessary when you already have use warnings;, and both use strict; and use warnings; should be at the top of your script to help catch errors. Here is how I would approach the problem (untested):
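A minimal sketch along those lines, written from the description in this answer rather than copied from it, and untested against a real ClustalW install. It assumes the clustalw executable is installed where BioPerl can find it and that each .fa file holds at least two sequences. The helper name fasta_files is hypothetical; the directory and parameters are the ones from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Bio::Tools::Run::Alignment::Clustalw;

# Hypothetical helper: collect the full path of every *.fa file under $dir.
sub fasta_files {
    my ($dir) = @_;
    my @files;
    find(
        sub {
            # $_ is the bare file name; $File::Find::name is the full path.
            return unless -f $_ && /\.fa\z/;
            push @files, $File::Find::name;
        },
        $dir
    );
    @files = sort @files;
    return @files;
}

my $dir = './genes';    # directory name taken from the question

for my $fasta (fasta_files($dir)) {
    # Double quotes, so $fasta is interpolated into the output file name.
    my @params  = (ktuple => 2, matrix => 'BLOSUM', outfile => "$fasta.pep");
    my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params);

    # Everything that uses $fasta stays inside the loop, where it is in scope.
    # align() takes a file name and returns an alignment object.
    my $aln = $factory->align($fasta);
    print "Aligned $fasta\n";
}
```

Every variable is declared with my inside the block that uses it, so use strict; will complain at compile time if one leaks out of scope.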

The glob() function is portable and would also work fine. I favor File::Find because it allows for file tests, it returns the full path (as written), and it is amazingly fast. If you want to do anything with the $aln object, such as writing it out to a file, you'll need to load another class like Bio::AlignIO.
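For comparison, a rough sketch of the glob() variant that also writes each alignment out through Bio::AlignIO. The .aln suffix and the choice of the clustalw output format are assumptions made for illustration, and as above this presumes a working ClustalW install.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Tools::Run::Alignment::Clustalw;
use Bio::AlignIO;

my $dir = './genes';    # directory name taken from the question

# One factory can be reused for every input file.
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(
    ktuple => 2,
    matrix => 'BLOSUM',
);

for my $fasta (glob "$dir/*.fa") {
    my $aln = $factory->align($fasta);

    # Assumed naming scheme: write the alignment next to its input file.
    my $out = Bio::AlignIO->new(
        -file   => ">$fasta.aln",
        -format => 'clustalw',
    );
    $out->write_aln($aln);
}
```

Note that glob() only matches files directly inside $dir, whereas File::Find also descends into subdirectories.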


