Question: sorting an alignment in FASTA format by the tip order of a phylogenetic tree
amartinez.ull wrote 3 months ago:

Dear all, I am trying to sort the sequences in a FASTA file by the tip order of a phylogenetic tree that contains the same sequences in a different order. Do you know of a quick way to do this in R or Biopython?

So far, I have only managed to do it by first converting the FASTA file into a .csv, but this is not very efficient...

Thanks a lot, Alejandro

alignment • 152 views
modified 3 months ago by shenwei356 • written 3 months ago by amartinez.ull

Hard to answer without the actual phylogenetic tree. I would say that you first need to build a list of your FASTA headers in the order in which they appear in the tree (this is actually the hard part). Then, with Biopython, you will be able to sort your FASTA sequences according to that list.

written 3 months ago by Bastien Hervé
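A rough sketch of the "hard part" described above, for the case where the tree is stored as a Newick string: tip labels are the tokens that directly follow "(" or ",". The function name `newick_tip_order` and the example tree are made up for illustration, and the sketch assumes unquoted labels without spaces. Note that the order in the file may differ from the order in a plotted tree if the plotting tool rotates or ladderizes branches.

```python
import re

def newick_tip_order(newick):
    """Return tip labels in the order they appear in a Newick string.

    Assumption: labels are unquoted and contain no whitespace,
    parentheses, commas, colons, or semicolons.
    """
    # A tip label directly follows "(" or ","; internal-node labels and
    # support values follow ")" and branch lengths follow ":", so the
    # pattern skips them.
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)

# Hypothetical example tree, not taken from the question
tree = "((seq3:0.1,seq1:0.2):0.05,seq2:0.3);"
print(newick_tip_order(tree))  # ['seq3', 'seq1', 'seq2']
```

Writing the returned labels one per line gives an ID list that can then be used to reorder the FASTA file.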

Thanks for the answer. I will try to be more precise... I have this fasta file:


>seq2 cgcgcggc

>seq3 tctctctc


and I want to sort it according to a text file containing this:





Do you know how to do it? Thanks a lot for your time!

modified 3 months ago by Pierre Lindenbaum • written 3 months ago by amartinez.ull
shenwei356 wrote 3 months ago:

Use a FASTA index.

If the ID list is not long, simply pass the IDs on the command line.

samtools faidx seqs.fasta $(paste -s -d " " ids.txt) > result.fasta

or, with seqkit:

seqkit faidx seqs.fasta $(paste -s -d " " ids.txt) > result.fasta

For a large number of IDs:

cat ids.txt | parallel -k seqkit faidx seqs.fasta {} > result.fasta
written 3 months ago by shenwei356
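Since the question asked for a Python route, here is a minimal standard-library sketch of the same idea: load the records into a dictionary keyed by ID, then write them out in the order given by the ID list. The function names `read_fasta` and `reorder_fasta` are hypothetical, and the sketch assumes that each ID is the first word of its header line and that the whole file fits in memory. Biopython's `SeqIO` could handle the parsing and writing instead.

```python
def read_fasta(path):
    """Parse a FASTA file into {id: (header_line, [sequence_lines])}.

    Assumption: the ID is the first whitespace-delimited word after ">".
    """
    records = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                parts = line[1:].split()
                current = parts[0] if parts else ""
                records[current] = (line, [])
            elif line and current is not None:
                records[current][1].append(line)
    return records

def reorder_fasta(fasta_path, ids_path, out_path):
    """Write the records of fasta_path to out_path in the order of ids_path."""
    records = read_fasta(fasta_path)
    with open(ids_path) as handle:
        ids = [line.strip() for line in handle if line.strip()]
    # Fail early rather than silently dropping sequences
    missing = [seq_id for seq_id in ids if seq_id not in records]
    if missing:
        raise KeyError(f"IDs not found in FASTA: {missing}")
    with open(out_path, "w") as out:
        for seq_id in ids:
            header, seq_lines = records[seq_id]
            out.write(header + "\n")
            for seq_line in seq_lines:
                out.write(seq_line + "\n")
```

With the file names used above, the call would be `reorder_fasta("seqs.fasta", "ids.txt", "result.fasta")`. Sequences whose IDs are absent from the list are left out of the result, as with the `faidx` commands.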