Hi everyone, I am performing a phylogenetic reconstruction for my non-model organism. I am working with the BUSCO output (single-copy sequence files) from 7 different species, and I have more than 1250 small multi-FASTA files generated as output by the Guidance2 software.
E.g. EOG092D4S9W.faa.aln
>PAS_EOG092D4S9W
MAHRIISQVVLTGARVFGRAFAEAYKQASASQ
>PSE_EOG092D4S9W
MAHRIVTQVLITGARVFGR
>PSE_EOG092D4S9W
MAHRIVTQVLVTGARVFGRAFAEAYKQASASQKFAQQN
I am trying to run my bootstrapping, but the sequences generated are not accepted by the RAxML program. I have to make them all the same length, but I cannot find a program to do that recursively.
This is how the multi-FASTA should look. However, every single file has sequences of different lengths, so I cannot choose a standard number for trimming the sequences. I think the solution could be to calculate the length of the shortest sequence and then cut the others using that length as a reference.
>PAS_EOG092D4S9W
MAHRIISQVVLTGARVFGRAFAEAYKQASAS
>PSE_EOG092D4S9W
MAHRIVTQVLITGARVFGRAFAEAYKQASAF
>PSE_EOG092D4S9W
MAHRIVTQVLVTGARVFGRAFAEAYKQASA
Can anybody help me untangle this issue, please?
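For reference, the trimming idea described in the question (find the shortest sequence in a file, then cut every other sequence to that length) could be sketched in Python roughly as below. This is only an illustration of that idea, not a recommended fix: the function names `read_fasta`, `truncate_to_shortest` and `to_fasta` are hypothetical, and cutting sequences to equal length does not place homologous residues in the same columns.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def truncate_to_shortest(records):
    """Cut every sequence to the length of the shortest one (hypothetical helper)."""
    if not records:
        return []
    shortest = min(len(seq) for _, seq in records)
    return [(header, seq[:shortest]) for header, seq in records]


def to_fasta(records):
    """Write (header, sequence) pairs back out as FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


# The example file from the question, used here as inline test data.
example = """>PAS_EOG092D4S9W
MAHRIISQVVLTGARVFGRAFAEAYKQASASQ
>PSE_EOG092D4S9W
MAHRIVTQVLITGARVFGR
>PSE_EOG092D4S9W
MAHRIVTQVLVTGARVFGRAFAEAYKQASASQKFAQQN
"""

trimmed = truncate_to_shortest(read_fasta(example))
print(to_fasta(trimmed), end="")
```

To apply this to many files, the same functions could be called in a loop over the file paths. Note that everything after the shortest sequence's end is simply discarded, which can throw away most of the data in a file.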
You should align the sequences, not trim them.
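Before re-running anything, it may help to find out which files actually contain sequences of unequal length, since an aligned file has the same number of columns (residues plus gap characters) in every record. The following is a minimal Python sketch under some assumptions: the files are plain FASTA, they match the `*.aln` pattern, and the helper names `sequence_lengths` and `files_needing_alignment` are made up for this example. Files it flags would then need to go through a multiple sequence aligner (MAFFT and MUSCLE are common choices) rather than being trimmed.

```python
from pathlib import Path


def sequence_lengths(fasta_text):
    """Return the length of each sequence in a FASTA string, in file order."""
    lengths = []
    current = None  # running length of the record being read
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            # Gap characters count too: alignment columns include gaps.
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def files_needing_alignment(directory, pattern="*.aln"):
    """List file names under `directory` whose sequences differ in length."""
    flagged = []
    for path in sorted(Path(directory).rglob(pattern)):
        if len(set(sequence_lengths(path.read_text()))) > 1:
            flagged.append(path.name)
    return flagged
```

For example, `files_needing_alignment("guidance_output")` (a hypothetical directory name) would return the names of the files that still need to be aligned.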
Please use the formatting bar (especially the `code` option) to present your post better. You can use backticks for inline code (`` `text` `` becomes `text`), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.
P.S. length is spelled leng-T-H, not leng-H-T. I've corrected those in the post too.