Question: multifasta length uniformity
0
6 months ago by
Unalibun0
Unalibun0 wrote:

Hi everyone, I am reconstructing the phylogeny of my non-model organism. I am working with BUSCO output (single-copy sequence files) from 7 different species, and I have more than 1250 small multi-FASTA files generated as output by the Guidance2 software.

E.g., EOG092D4S9W.faa.aln

>PAS_EOG092D4S9W
MAHRIISQVVLTGARVFGRAFAEAYKQASASQ

>PSE_EOG092D4S9W
MAHRIVTQVLITGARVFGR

>PSE_EOG092D4S9W
MAHRIVTQVLVTGARVFGRAFAEAYKQASASQKFAQQN

I am trying to run my bootstrapping, but the generated sequences are not accepted by RAxML. I have to make them all the same length, but I cannot find a program that does this across all of the files.

This is how the multi-FASTA should look. However, every file has sequences of different lengths, so I cannot choose a single standard length for trimming. I think the solution could be to calculate the length of the shortest sequence in each file and then cut the others to that length.

>PAS_EOG092D4S9W
MAHRIISQVVLTGARVFGRAFAEAYKQASAS

>PSE_EOG092D4S9W
MAHRIVTQVLITGARVFGRAFAEAYKQASAF

>PSE_EOG092D4S9W
MAHRIVTQVLVTGARVFGRAFAEAYKQASA

Can anybody help me untangle this issue, please?

alignment • 159 views
ADD COMMENTlink modified 6 months ago by Mensur Dlakic6.6k • written 6 months ago by Unalibun0
1

You should align the sequences, not trim them.

ADD REPLYlink modified 6 months ago • written 6 months ago by Asaf8.4k
1

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

P.S. length is spelled leng-T-H, not leng-H-T. I've corrected those in the post too.

ADD REPLYlink written 6 months ago by RamRS30k
4
6 months ago by
Mensur Dlakic6.6k
USA
Mensur Dlakic6.6k wrote:

A proper way to do this is by aligning sequences first, and trimming them afterwards so they are all the same length. Some positions in the alignment will be gaps rather than letters, which is normal. I have responded recently to a similar question here and here.
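
As a rough illustration of the "align first, trim afterwards" idea, here is a minimal Python sketch. It assumes the files have already been aligned by a dedicated aligner, so gaps appear as `-`. The function names (`read_fasta`, `is_uniform`, `trim_gappy_columns`) and the 0.5 gap threshold are hypothetical choices made for this example; they do not come from Guidance2, RAxML, or any other tool mentioned in this thread. The sketch checks that every sequence in an alignment has the same length, then drops columns that are mostly gaps, which keeps all sequences the same length.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def is_uniform(records):
    """Return True if all sequences have the same length (i.e. look aligned)."""
    return len({len(seq) for _, seq in records}) <= 1


def trim_gappy_columns(records, max_gap_fraction=0.5):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    The threshold is an illustrative assumption, not a recommended value.
    """
    if not is_uniform(records):
        raise ValueError("sequences differ in length; align them first")
    if not records:
        return []
    n_seqs = len(records)
    n_cols = len(records[0][1])
    keep = [
        i for i in range(n_cols)
        if sum(seq[i] == "-" for _, seq in records) / n_seqs <= max_gap_fraction
    ]
    return [(header, "".join(seq[i] for i in keep)) for header, seq in records]
```

To process many files, one could loop over them (for example with `pathlib.Path.glob`), call `read_fasta` on each file's text, and write out the trimmed records. Files for which `is_uniform` returns False have not been aligned yet and should go back through the aligner rather than being cut to the shortest sequence.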

ADD COMMENTlink written 6 months ago by Mensur Dlakic6.6k