Emboss Cons for getting consensus sequence for many files, not just one
Entering edit mode
5.4 years ago
roblogan6 ▴ 30

I installed and configured emboss and can run the simple command line arguments for getting the consensus of one previously aligned multifasta file:

% cons

Create a consensus sequence from a multiple alignment

Input (aligned) sequence set: dna.msf

output sequence [dna.fasta]: aligned.cons

This is perfect for dealing with one file at a time, but I have hundreds to process. I have started to write a perl script with a foreach loop to try and process this for every file, but I guess I need to be outside of the script to run these commands. Any clue on how I can run a command line friendly program for getting a single consensus sequence in fasta format from a previously aligned multifasta file, for many files in succession? I don't have to use emboss- I could use another program. Here is my code so far:

use warnings; 
use strict; 

my $dir = ("/Users/roblogan/Documents/Clustered_Barcodes_Aligned");

my @ArrayofFiles = glob "$dir/*"; #put all files in the directory into an array

#print join("\n", @ArrayofFiles), "\n";  #diagnostic print

foreach my $file (@ArrayofFiles){
        print 'cons', "\n";
        print "/Users/roblogan/Documents/Clustered_Barcodes_Aligned/Clustered_Barcode_Number_*.*.Sequences.txt.out", "\n";
        print "*.*.Consensus.txt", "\n"; 
perl emboss consensus bioinformatics fasta • 1.7k views

Login before adding your answer.

Traffic: 2391 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6