I have a large file of orthologous gene trees in Newick format, one tree per line. I only want to keep the single-copy gene trees (those with exactly one gene copy per species). I tried selecting by species ID with awk, but I always ended up modifying and damaging the trees.
The simplest tree has this form: "(", ID of species 1, ".", name of the gene, ":", distance value, ",", ID of species 2, ".", name of the gene, ":", distance value, ")", ";".
Any ideas for a script that selects only the single-copy gene trees without altering them?
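To make the goal concrete, here is a minimal Python sketch of the kind of filter I have in mind. It only reads each line and prints it back unchanged if it passes, so the trees are never edited. It relies on several assumptions: every leaf label has the form speciesID.geneName and is followed by ":" and a branch length, labels are unquoted, the species ID contains no dot, and "single-copy" means no species ID occurs more than once in the tree (it does not check that every species is present). The function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: a leaf label directly follows "(" or "," and runs up to the
# next ":". Internal node labels and support values follow ")" instead,
# so this pattern skips them.
LEAF = re.compile(r"[(,]\s*([^():,;\s]+)\s*:")


def species_counts(newick):
    """Count how many leaves each species ID has in one Newick string."""
    # Assumption: the species ID is everything before the first dot.
    return Counter(label.split(".", 1)[0] for label in LEAF.findall(newick))


def is_single_copy(newick):
    """True if the tree has at least one leaf and no species occurs twice."""
    counts = species_counts(newick)
    return bool(counts) and max(counts.values()) == 1


def filter_single_copy(lines):
    """Yield the input lines unchanged, keeping only single-copy trees."""
    for line in lines:
        if line.strip() and is_single_copy(line):
            yield line
```

Passing an open file handle to filter_single_copy and writing the yielded lines to a new file would give the filtered set. Because each line is emitted exactly as it was read, the Newick strings stay intact.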