I want to run phylogenetic analyses on single-locus data sets as well as on a combined multi-gene data set.
For the single-locus data set, I identified the two most suitable outgroups by BLASTN searches against a public database. However, the sequences of both outgroup taxa are shorter than my ingroup sequences at both the 5' and 3' ends, although the overlapping region aligns perfectly between outgroups and ingroups. I don't want to trim my ingroup sequences at either end of the alignment, because those regions (present in my ingroups but absent from the outgroups) contain valuable informative characters. In this case, can I still use the two outgroups and keep alignment gaps (or, more strictly, missing data) at both ends of the outgroup sequences?
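To make concrete what I mean by keeping missing data rather than gaps at the ends, here is a minimal sketch. It assumes each aligned sequence is a plain Python string that uses "-" for gaps and "?" for missing data; the function name mask_terminal_gaps is hypothetical and does not come from any phylogenetics package.

```python
def mask_terminal_gaps(seq, gap="-", missing="?"):
    """Recode leading and trailing gap runs as missing data.

    Internal gaps are left untouched, since they may represent real indels.
    """
    core = seq.strip(gap)
    if not core:
        # The whole row is gaps: treat all of it as missing.
        return missing * len(seq)
    left = len(seq) - len(seq.lstrip(gap))    # length of the leading gap run
    right = len(seq) - len(seq.rstrip(gap))   # length of the trailing gap run
    return missing * left + core + missing * right


# A short outgroup row aligned against longer ingroup sequences.
outgroup = "--AC-GT---"
print(mask_terminal_gaps(outgroup))
```

Only the unsequenced ends are recoded here. Whether gaps and missing data are actually treated differently depends on the analysis program and its settings, which is part of what I am unsure about.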
My other concern is that when I use one outgroup taxon, no support value is shown for the ingroup clade, but when I use two outgroup taxa, the ingroup clade receives 100% support. Why is this so, and do I have to use at least two outgroup taxa?
For the combined data set, my question is also about the outgroups. Because different outgroup taxa were used for each single-locus data set, how should I choose the outgroups for the multi-locus data set? Can I concatenate the outgroup sequences from the individual single-locus data sets? I am worried that doing so would create an artificial taxon/sequence.
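To make the concatenation question concrete, here is a minimal sketch of what I have in mind. It assumes each locus alignment is a dict mapping taxon name to an aligned sequence string; the function name concatenate_loci and the taxon labels are hypothetical. A taxon that lacks a locus is filled with "?" for that locus.

```python
def concatenate_loci(loci, missing="?"):
    """Concatenate per-locus alignments into one supermatrix.

    loci: list of dicts, each mapping taxon name -> aligned sequence.
    Taxa absent from a locus are padded with the missing-data symbol.
    """
    lengths = []
    for aln in loci:
        lens = {len(s) for s in aln.values()}
        if len(lens) != 1:
            raise ValueError("sequences within a locus must have equal length")
        lengths.append(lens.pop())

    taxa = sorted({taxon for aln in loci for taxon in aln})
    return {
        taxon: "".join(aln.get(taxon, missing * n) for aln, n in zip(loci, lengths))
        for taxon in taxa
    }


# Two toy loci with different outgroup taxa (hypothetical names and data).
locus1 = {"ingroup1": "ACGT", "outA": "AC-T"}
locus2 = {"ingroup1": "GGA", "outB": "GCA"}

for taxon, seq in concatenate_loci([locus1, locus2]).items():
    print(taxon, seq)
```

In this form outA and outB stay separate taxa, each with missing data for the locus it lacks. The artificial taxon I am asking about would arise if both outgroup rows were given the same label (for example "outgroup") before concatenation, so that their sequences end up joined in one row.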
Thanks in advance for your help!