EMBOSS cons: getting a consensus sequence for many files, not just one
5.4 years ago
roblogan6 ▴ 30

I installed and configured EMBOSS and can run cons interactively to get the consensus of a single, previously aligned multi-FASTA file:

% cons

Create a consensus sequence from a multiple alignment

Input (aligned) sequence set: dna.msf

output sequence [dna.fasta]: aligned.cons

This is perfect for one file at a time, but I have hundreds to process. I have started writing a Perl script with a foreach loop to process every file, but I guess these commands need to be run outside of the script. Any clue how I can run a command-line program that produces a single consensus sequence in FASTA format from a previously aligned multi-FASTA file, for many files in succession? I don't have to use EMBOSS; I could use another program. Here is my code so far:

#!/usr/bin/perl 
use warnings; 
use strict; 

my $dir = ("/Users/roblogan/Documents/Clustered_Barcodes_Aligned");

my @ArrayofFiles = glob "$dir/*"; #put all files in the directory into an array

#print join("\n", @ArrayofFiles), "\n";  #diagnostic print

foreach my $file (@ArrayofFiles){
        print 'cons', "\n";
        print "/Users/roblogan/Documents/Clustered_Barcodes_Aligned/Clustered_Barcode_Number_*.*.Sequences.txt.out", "\n";
        print "*.*.Consensus.txt", "\n"; 
}
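
One possible direction, as a minimal sketch rather than a tested solution: cons can be run without prompts by passing the input and output on the command line with the `-sequence` and `-outseq` qualifiers, plus `-auto` to suppress prompting, so a plain shell loop may be enough. The function name `run_cons_batch`, the `CONS` override variable, and the separate output directory are hypothetical names introduced for illustration. The `.Sequences.txt.out` and `.Consensus.txt` suffixes are taken from the print statements above and are assumed to match the real file names.

```shell
#!/bin/sh
# Sketch: run EMBOSS cons once per aligned file in a directory.
# Assumptions: inputs end in ".Sequences.txt.out"; outputs are written
# as "<prefix>.Consensus.txt" into a separate output directory.

run_cons_batch() {
    in_dir=$1                 # directory holding the aligned multi-FASTA files
    out_dir=$2                # directory that will receive the consensus files
    cons_cmd=${CONS:-cons}    # hypothetical override; defaults to "cons" on PATH

    mkdir -p "$out_dir" || return 1

    for aln in "$in_dir"/*.Sequences.txt.out; do
        # If nothing matched, the pattern stays literal; skip it.
        [ -e "$aln" ] || continue

        # Strip the directory and the input suffix to get the file prefix.
        base=$(basename "$aln" .Sequences.txt.out)

        # -auto turns off interactive prompts so the loop does not stall.
        "$cons_cmd" -sequence "$aln" \
                    -outseq "$out_dir/$base.Consensus.txt" \
                    -auto ||
            echo "cons failed for $aln" >&2
    done
}
```

It could then be called as `run_cons_batch /Users/roblogan/Documents/Clustered_Barcodes_Aligned ./consensus`. The same idea should carry over to the Perl loop by replacing the print statements with a `system(...)` call that passes the same arguments.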
perl emboss consensus bioinformatics fasta • 1.7k views