I am using PRANK to align multiple FASTA files, each containing an orthogroup identified by OrthoFinder, for Bayesian analysis. When I run MrBayes, certain alignments will not load or run. Inspecting those alignments showed that PRANK had truncated the sequence names from the original FASTA, so two of them became identical, which causes an error in MrBayes. Is this a result of the way I ran PRANK, and is there a way to get it to write the entire gene ID to the resulting NEXUS files?
I am running PRANK with its default settings, as follows:
prank -d=<input_fasta> -f=nexus -o=<output_nexus>
The input FASTA has these taxa:
>Itaiw_v1_scaffold_34_t22466-RA
MLMCVLIANSSGNVLLERFHGVPGEERLHWRSFLVKLGTDNLKGARDDEPFIASHKSVYV
VYGIIGDIWIFTVGKDEYDELTLVEVLYSITSSIKEVCKKAPNERLFLDNYGKVCLCLDE
ICAQGMLEHTDKGRIRRLIRLRPLVDT*
>Itaiw_v1_scaffold_34_t22466-RB
MLMCVLIANSSGNVLLERFHGVPGEERLHWRSFLVKLGTDNLKGARDDEPFIASHKSVYV
VYGIIGDIWIFTVGKDEYDELTWNVRAHR*
The output looks like this:
'Itaiw_v1_scaffold_34' -----------------------------MLMCVLIANSSGNVLLERFHGVPGEERLHWR (...)
'Itaiw_v1_scaffold_34' -----------------------------MLMCVLIANSSGNVLLERFHGVPGEERLHWR (...)
Has anyone else run into this problem with PRANK? Is there a way to modify my PRANK command or my input files so that I get full names? I am running this analysis on thousands of orthogroup files, so fixing them one by one is not an option. I realize this question is fairly specific, but I couldn't find any reference to this behaviour in the PRANK docs, and a cursory Google search didn't turn up much either, so here I am!
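One input-side workaround that does not depend on what causes the truncation would be to replace each header with a short unique placeholder before alignment and keep a lookup table to restore the full gene IDs afterwards. Below is a minimal Python sketch of that idea, assuming plain FASTA input. The function names and the `seq0001`-style placeholders are hypothetical and not part of PRANK, OrthoFinder, or MrBayes, and I have not verified how MrBayes handles the restored names.

```python
import re


def shorten_headers(fasta_text, prefix="seq"):
    """Replace every FASTA header with a short unique placeholder.

    Returns (new_fasta_text, mapping), where mapping maps each
    placeholder (e.g. 'seq0001') back to the original header text.
    The placeholder scheme is an illustrative assumption.
    """
    mapping = {}
    out_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            placeholder = f"{prefix}{len(mapping) + 1:04d}"
            mapping[placeholder] = line[1:].strip()
            out_lines.append(">" + placeholder)
        else:
            out_lines.append(line)
    return "\n".join(out_lines) + "\n", mapping


def restore_names(alignment_text, mapping, quote=False):
    """Swap placeholders in an alignment file back to the original IDs.

    With quote=True each restored name is wrapped in single quotes,
    matching the quoted names seen in the NEXUS output above.
    """
    if not mapping:
        return alignment_text
    names = "|".join(re.escape(key) for key in mapping)
    # Optional surrounding quotes are consumed so names are not double-quoted.
    pattern = re.compile(r"'?\b(" + names + r")\b'?")

    def replace(match):
        original = mapping[match.group(1)]
        return f"'{original}'" if quote else original

    return pattern.sub(replace, alignment_text)
```

In use, the shortened FASTA would be written to disk, aligned with the same `prank` command as above, and the resulting NEXUS file passed through `restore_names(..., quote=True)`. Looping this over a directory would cover all orthogroup files without editing them by hand.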