I need to know if what I'm doing makes sense. I am working on a fungus for my undergraduate thesis, which involves a literature review as well as bioinformatics analysis. I am not that familiar with bioinformatics, so I really need help here.
For the bioinformatics analysis, I'm interested in building a phylogenetic tree of some pathogenicity genes from this one fungal species to see whether they are similar or related to genes in other organisms. Here is what I have done so far, or at least the approach I can think of right now:
I selected 5 myosin-related pathogenicity genes (I got the sequences from UniProt). Then I ran BLAST with them against databases for other organisms (e.g. human, mouse, other fungi, bacteria). For each database searched, I picked the sequence with the lowest E-value, i.e. the first one that appears on the list. Then I aligned the sequences of the 5 pathogenicity genes together with the collected sequences from the other organisms using Clustal Omega, followed by construction of the phylogenetic tree using ClustalW2. I would really appreciate it if anyone could tell me whether I'm on the right track.
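To make the hit-selection step concrete, here is a minimal sketch of what "pick the hit with the lowest E-value" amounts to. It assumes the BLAST results were saved in the BLAST+ tabular format (`-outfmt 6`) with the default 12 columns. The function name `best_hit_per_query` and the example rows are made up for illustration and are not real results.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hit_per_query(handle):
    """Return {query_id: row}, keeping the hit with the lowest E-value."""
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(COLUMNS, fields))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or row["evalue"] < current["evalue"]:
            best[row["qseqid"]] = row
    return best

# Hypothetical rows, for illustration only (not real BLAST output).
example = (
    "myo1\tsubjA\t45.2\t300\t150\t5\t1\t300\t10\t309\t2e-40\t160\n"
    "myo1\tsubjB\t30.1\t120\t80\t3\t5\t124\t20\t139\t0.8\t28.5\n"
    "myo2\tsubjC\t28.0\t90\t60\t2\t1\t90\t1\t90\t0.9\t27.0\n"
)

hits = best_hit_per_query(io.StringIO(example))
for query, row in hits.items():
    print(query, row["sseqid"], row["evalue"])
```

Note that this keeps the top hit even when its E-value is high (as for the made-up `myo2` row), which is exactly the situation I ask about next.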
Also, it seems that some of the 'first-ranked' sequences from the databases have a high E-value (i.e. close to 1 instead of 0). Are they reliable if I use them to construct the phylogenetic tree?
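For context, the E-value is the number of hits with a score at least this good that would be expected by chance in a database of that size, so a value close to 1 means roughly one such hit is expected from chance alone. One way to handle this, sketched below, is to apply a cutoff before alignment. The cutoff of `1e-5` is only an assumed, commonly used starting point rather than a fixed rule, and the function name and the hit values are hypothetical.

```python
def split_by_evalue(hits, max_evalue=1e-5):
    """Split (name, evalue) pairs into kept and rejected lists.

    max_evalue is an assumed cutoff; adjust it for the database size
    and how strict the analysis needs to be.
    """
    kept = [(name, e) for name, e in hits if e <= max_evalue]
    rejected = [(name, e) for name, e in hits if e > max_evalue]
    return kept, rejected

# Hypothetical top hits per database, for illustration only.
top_hits = [
    ("human", 0.92),
    ("mouse", 0.75),
    ("other_fungus", 3e-120),
    ("bacterium", 0.4),
]

kept, rejected = split_by_evalue(top_hits)
print("kept for alignment:", kept)
print("set aside:", rejected)
```

With these made-up numbers, only the fungal hit would go into the alignment, and the hits with E-values near 1 would be set aside.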
On the other hand, I did try to 'play' with BLAST. It appears that the 5 myosin-related pathogenicity genes have no significant similarity to anything in human. Does anyone know what might have caused this? Is it explainable, or is it because they really have no connection to each other? I also have another 5 sets of pathogenicity genes, but they are from various domains/types (all mixed up). Does it make sense to use them to construct a phylogenetic tree, considering they are not from the same domains/types?
I would really appreciate it if anyone could comment on these questions and help me out here.
Thank you in advance!