I have a data file with taxon labels in one column and a numeric value for each taxon in a second column (tab-separated). This file is an external data set that provides one type of information for each taxon in my phylogenetic analysis. I have also built a phylogenetic tree from the sequences of the same taxa (with IQ-Tree). When I map the external data onto the tree (via iTOL), the values seem to correlate, or at least show a trend, with proximity to certain specific taxa.
I want to test the significance of this trend/correlation between the external data for each taxon (continuous numeric values) and the pairwise genetic distances to some predefined taxa (events). Is there a test that I can run directly on my tree file and external data file (in IQ-Tree or another software package)?
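To make the question more concrete, here is a rough sketch of the kind of test I have in mind, assuming the distances to one predefined taxon have already been extracted into a dictionary: a Spearman rank correlation between distance and external value, with a permutation p-value. The function names (`ranks`, `pearson`, `spearman`, `permutation_pvalue`), the taxon labels and the toy numbers are placeholders of my own, not output from IQ-Tree or iTOL. I also realise that shuffling values this way treats the taxa as independent, which is probably too optimistic for taxa related through a tree.

```python
import random

def ranks(xs):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    result = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; fails with ZeroDivisionError if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation of x and y."""
    rng = random.Random(seed)
    observed = spearman(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= abs(observed):
            hits += 1
    # The +1 counts the observed arrangement itself, so p is never exactly 0.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (made up): distance of each taxon to the predefined taxon, and its value.
distance_to_focal = {"A": 0.01, "B": 0.02, "C": 0.05, "D": 0.10, "E": 0.20, "F": 0.40}
external_value = {"A": 9.1, "B": 8.7, "C": 7.9, "D": 6.0, "E": 4.2, "F": 3.5}

taxa = sorted(set(distance_to_focal) & set(external_value))
rho, p_value = permutation_pvalue(
    [distance_to_focal[t] for t in taxa],
    [external_value[t] for t in taxa],
)
print(f"Spearman rho = {rho:.3f}, permutation p = {p_value:.4f}")
```

With real data, the two dictionaries would be filled from the tree and from the tab-separated file instead of being typed in by hand.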
I am thinking of comparing the mean of my external data between groups of taxa, selected according to their genetic distance to a predefined taxon #i. The groups would represent increasing intervals of genetic distance to one event (the predefined taxon #i), for example group1 [close to taxon #i] and group2 [far from taxon #i]. Does this strategy seem sound and feasible? I'm not really sure how to extract the relevant data from my tree and my external data file, and if a test/program already exists, it would save me valuable time ;).
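For what it's worth, the grouping strategy could be sketched roughly as below. It is only an illustration under several assumptions of mine: the tree is a plain Newick string with unquoted labels and branch lengths, "genetic distance" means the sum of branch lengths along the tree between two tips, the data file has no header row, and the two groups are split at an arbitrary threshold. All names (`parse_newick`, `distances_to`, `read_values`, `group_permutation_test`), the toy tree and the toy values are hypothetical.

```python
import random

def parse_newick(text):
    """Parse a simple Newick string into a list of nodes (index 0 is the root).

    Assumes unquoted labels; internal labels (e.g. support values) are ignored.
    """
    text = "".join(text.split()).rstrip(";")
    nodes = []
    pos = 0

    def read_node(parent):
        nonlocal pos
        idx = len(nodes)
        nodes.append({"parent": parent, "length": 0.0, "name": ""})
        is_leaf = True
        if pos < len(text) and text[pos] == "(":
            is_leaf = False
            pos += 1                      # skip "("
            while True:
                read_node(idx)            # read one child subtree
                pos += 1                  # skip the "," or ")" after it
                if text[pos - 1] == ")":
                    break
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        label = text[start:pos]
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            nodes[idx]["length"] = float(text[start:pos])
        if is_leaf:
            nodes[idx]["name"] = label
        return idx

    read_node(None)
    return nodes

def distances_to(nodes, focal):
    """Tip-to-tip branch-length distance from the focal tip to every tip."""
    leaves = {n["name"]: i for i, n in enumerate(nodes) if n["name"]}
    up = {}                               # ancestor index -> distance from focal
    i, d = leaves[focal], 0.0
    while i is not None:
        up[i] = d
        d += nodes[i]["length"]
        i = nodes[i]["parent"]
    result = {}
    for name, i in leaves.items():
        d = 0.0
        while i not in up:                # climb until a shared ancestor is reached
            d += nodes[i]["length"]
            i = nodes[i]["parent"]
        result[name] = d + up[i]
    return result

def read_values(lines):
    """Read 'label<TAB>value' lines (no header row assumed) into a dict."""
    values = {}
    for line in lines:
        if line.strip():
            label, value = line.rstrip("\n").split("\t")[:2]
            values[label] = float(value)
    return values

def group_permutation_test(near, far, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    observed = mean(near) - mean(far)
    pooled = list(near) + list(far)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:len(near)]) - mean(pooled[len(near):])
        if abs(diff) >= abs(observed) - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy inputs (made up); real ones would come from the tree file and the data file.
newick = "((A:0.1,B:0.2)90:0.3,(C:0.4,(D:0.1,E:0.2)75:0.1)80:0.5,F:1.0);"
table = ["A\t9.0", "B\t8.0", "C\t3.0", "D\t4.0", "E\t5.0", "F\t2.0"]

dist = distances_to(parse_newick(newick), "A")   # "A" plays the role of taxon #i
values = read_values(table)
threshold = 0.5                                   # arbitrary cut-off, an assumption
near = [v for t, v in values.items() if t in dist and dist[t] <= threshold]
far = [v for t, v in values.items() if t in dist and dist[t] > threshold]
observed, p_value = group_permutation_test(near, far)
print(f"mean(near) - mean(far) = {observed:.2f}, permutation p = {p_value:.4f}")
```

Note that the predefined taxon itself ends up in the "near" group here (its distance is 0), and that the recursive parser could hit Python's recursion limit on a very deep tree. More than two distance intervals would need a different comparison than a simple difference of two means.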