I am working with amino acid sequences.
I have a FASTA file containing reference sequences and a text file giving the categorization (domain, subdomain, nickname, etc.) of each sequence. First, I need to build a tree from these sequences to check that they group according to the categories they belong to. For example, sequences belonging to domain A should group together, sequences belonging to subdomain A1 should group together, and so on.
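To make the grouping check concrete, here is a minimal sketch of what I mean by "group together": every category should form its own clade. It assumes a rooted tree written as nested tuples, and all sequence IDs and domain labels are made up for illustration. A real tree would come from a tree-building program and would need a proper parser.

```python
# Toy rooted tree as nested tuples; leaves are hypothetical sequence IDs.
tree = ((("seq1", "seq2"), "seq3"), ("seq4", "seq5"))

# Hypothetical category table: sequence ID -> domain.
domain = {"seq1": "A", "seq2": "A", "seq3": "A", "seq4": "B", "seq5": "B"}


def leaves(node):
    """Return the set of leaf names found under a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= leaves(child)
    return found


def is_monophyletic(node, members):
    """True if some clade in the tree contains exactly the given members."""
    if leaves(node) == members:
        return True
    if isinstance(node, str):
        return False
    return any(is_monophyletic(child, members) for child in node)


def check_categories(tree, category_of):
    """Map each category to True/False: do its sequences form one clade?"""
    groups = {}
    for seq_id, category in category_of.items():
        groups.setdefault(category, set()).add(seq_id)
    return {cat: is_monophyletic(tree, members) for cat, members in groups.items()}


result = check_categories(tree, domain)
print(result)
```

The same check could be repeated with the subdomain column of the text file in place of the domain column.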
I also have a FASTA file containing OTU representative sequences derived from metagenomic reads, and an OTU count table giving how many of each representative are present in each of my 20 samples. I need to place these OTU sequences on the previously constructed tree and take the per-sample counts into account, to see which categories each sample is enriched for.
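For the counting part, this is roughly the computation I have in mind, as a minimal sketch. It assumes each OTU has already been assigned a category from its position on the reference tree; the OTU names, sample names, categories, and counts below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical result of placing OTUs on the reference tree: OTU -> category.
otu_category = {"OTU1": "A1", "OTU2": "A1", "OTU3": "B2"}

# Hypothetical OTU count table: OTU -> {sample: count}.
counts = {
    "OTU1": {"S1": 10, "S2": 0},
    "OTU2": {"S1": 5, "S2": 5},
    "OTU3": {"S1": 5, "S2": 15},
}


def category_fractions(otu_category, counts):
    """Sum OTU counts per category in each sample and convert to fractions."""
    totals = defaultdict(lambda: defaultdict(int))
    for otu, per_sample in counts.items():
        # OTUs without a placement are kept, but labeled as unassigned.
        category = otu_category.get(otu, "unassigned")
        for sample, n in per_sample.items():
            totals[sample][category] += n

    fractions = {}
    for sample, per_category in totals.items():
        sample_total = sum(per_category.values())
        fractions[sample] = {
            cat: (n / sample_total if sample_total else 0.0)
            for cat, n in per_category.items()
        }
    return fractions


fractions = category_fractions(otu_category, counts)
for sample, per_category in sorted(fractions.items()):
    print(sample, per_category)
```

Comparing these per-sample fractions across the 20 samples would show which categories each sample is enriched for. Whether the difference is statistically meaningful is a separate question.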
What software should I use?