I am developing a workflow for high-throughput identification of lateral gene transfer candidates. I have attached kingdom-level taxonomic identifiers to each sequence in a set of homologous sequences, which were then aligned and used to build a tree.
I now have a large number of Newick-formatted trees and would like to search them for interesting clades. For example: return the trees with clades that contain sequences from two or more kingdom-level taxa AND where the number of sequences from one kingdom is more than double the number from the other.
Does anyone know of an existing tool or method to search tree topologies with logical queries?
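
To make the query above concrete, here is a rough sketch in plain Python of the kind of search I have in mind. It is only an illustration under several assumptions: leaf labels are assumed to look like `Kingdom|sequence_id` (a made-up convention, not necessarily how my labels are formatted), the function names are hypothetical, each tree is treated as rooted exactly as written in the Newick string, and when a clade holds more than two kingdoms the two most frequent ones are compared. The small parser handles only simple Newick (no quoted labels or bracketed comments).

```python
from collections import Counter


def parse_newick(text):
    """Parse a simple Newick string into nested lists of leaf labels.

    Branch lengths and internal node labels are discarded. Quoted labels
    and [comments] are not supported in this sketch.
    """
    text = text.strip().rstrip(";")
    pos = 0

    def read_label():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        # Drop any ':branch_length' suffix.
        return text[start:pos].split(":")[0].strip()

    def read_node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [read_node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(read_node())
            pos += 1  # consume ")"
            read_label()  # skip internal label / support value / length
            return children
        return read_label()

    return read_node()


def collect_clades(node, out):
    """Append the leaf list of every internal node to `out` (post-order)."""
    if isinstance(node, str):
        return [node]
    leaves = [leaf for child in node for leaf in collect_clades(child, out)]
    out.append(leaves)
    return leaves


def is_candidate(leaves, ratio=2):
    """True if the clade mixes kingdoms and the most common kingdom has
    more than `ratio` times as many sequences as the second most common."""
    # Assumption: the kingdom is the part of the label before "|".
    counts = Counter(label.split("|")[0] for label in leaves).most_common()
    if len(counts) < 2:
        return False
    return counts[0][1] > ratio * counts[1][1]


def find_candidate_clades(newick, ratio=2):
    """Return the leaf lists of all clades in one tree that match the query."""
    clades = []
    collect_clades(parse_newick(newick), clades)
    return [leaves for leaves in clades if is_candidate(leaves, ratio)]


example = ("((Bacteria|b1,Bacteria|b2,Bacteria|b3,Archaea|a1),"
           "(Eukaryota|e1,Eukaryota|e2));")
hits = find_candidate_clades(example)
```

In this toy tree only the first clade (three Bacteria, one Archaea) matches; the root does not, because three Bacteria is not more than double two Eukaryota. A tree would be kept whenever `find_candidate_clades` returns a non-empty list. Since the result depends on where the tree is rooted, unrooted trees would need to be rooted (or searched by bipartition) first.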