Problem with BUSCO alignments: more than one set of sequences for the same locus
12 weeks ago
silviaas • 0

Hi everyone,

I am working on a phylogeny based on BUSCOs extracted from low-coverage genome assemblies. The coverage of my genomes ranges from 2X to 10X. As expected, the recovery of complete single-copy BUSCOs is very variable and rather low in most cases (~5-68%).

I generated a list of complete single-copy BUSCOs for each terminal based on the .tsv output files and extracted the corresponding sequences directly from the single_copy_busco_sequences output folder. When I checked the individual locus alignments, I found that ~30% of them contain more than one set of different sequences. In some cases the alignment contains only a couple of "weird" sequences. In other cases the alignment consists of two or more different sets of sequences. I attach some alignments below as examples. I am sure the sequences are wrong because they affect a random set of taxa that are not closely related.
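As a rough illustration of the selection step described above, here is a minimal sketch that reads the text of one BUSCO full table and keeps only the IDs marked `Complete`. It assumes the usual tab-separated layout with `#` comment lines, the BUSCO ID in the first column and the status in the second; the function name and the IDs in the example are hypothetical.

```python
def complete_single_copy_ids(table_text):
    """Return the set of BUSCO IDs whose status column is exactly 'Complete'.

    Assumes a tab-separated table with '#' comment lines, the BUSCO ID in
    column 1 and the status in column 2 (assumed layout, check your files).
    """
    ids = set()
    for line in table_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1] == "Complete":
            ids.add(fields[0])
    return ids


# Made-up table content, for illustration only.
example = "\n".join([
    "# Busco id\tStatus\tSequence\tGene Start\tGene End",
    "1001at7742\tComplete\tscaffold_1\t100\t900",
    "1002at7742\tDuplicated\tscaffold_2\t50\t700",
    "1002at7742\tDuplicated\tscaffold_9\t10\t650",
    "1003at7742\tFragmented\tscaffold_3\t5\t300",
    "1004at7742\tMissing",
])
print(sorted(complete_single_copy_ids(example)))
```

Matching the status string exactly means that duplicated and fragmented hits are left out of the per-terminal list.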

I wanted to ask if anyone has experienced this issue before, and what the reason could be. The only explanation I can think of is that, because the coverage is low, when the proper gene is absent from the assembly the best hit may be a wrongly assigned sequence. But even in that case, I wouldn't expect to get so many misassigned sequences, or sequences so different for the same BUSCO.

Finally, I tried to find an automatic strategy to clean the alignments, i.e. to remove the "weird" sequences from problematic alignments, or to discard the problematic alignments entirely. Nothing I tried worked, and the only solution I have found is to remove the bad alignments manually.
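To make the kind of automatic filter meant above concrete, here is a minimal sketch, not a tested solution: it flags sequences whose mean pairwise p-distance to the rest of the alignment is far above the median. The function names, the median/MAD threshold, and the parameters `k` and `min_gap` are all assumptions chosen for illustration, and only `-` and `?` are treated as missing data.

```python
from statistics import median

MISSING = set("-?")  # assumption: only these characters mark missing data


def p_distance(a, b):
    """Proportion of differing sites among columns where both sequences have a residue."""
    compared = diffs = 0
    for x, y in zip(a, b):
        if x in MISSING or y in MISSING:
            continue
        compared += 1
        if x.upper() != y.upper():
            diffs += 1
    return diffs / compared if compared else None


def flag_outliers(alignment, k=3.0, min_gap=0.1):
    """Return names whose mean distance to all other sequences is unusually high.

    alignment: dict mapping sequence name -> aligned sequence.
    Threshold: median + max(k * MAD, min_gap), an arbitrary illustrative rule.
    """
    mean_dist = {}
    for name, seq in alignment.items():
        ds = [p_distance(seq, other) for n, other in alignment.items() if n != name]
        ds = [d for d in ds if d is not None]
        if ds:
            mean_dist[name] = sum(ds) / len(ds)
    if len(mean_dist) < 3:
        return []  # too few sequences to estimate a typical distance
    med = median(mean_dist.values())
    mad = median(abs(d - med) for d in mean_dist.values())
    threshold = med + max(k * mad, min_gap)
    return sorted(n for n, d in mean_dist.items() if d > threshold)


# Toy alignment: four similar sequences and one unrelated one.
aln = {
    "taxonA": "ATGGCTAAAGGTCTG",
    "taxonB": "ATGGCTAAAGGTCTG",
    "taxonC": "ATGGCAAAAGGTCTG",
    "taxonD": "ATGGCTAAAGGACTG",
    "taxonE": "CGTTAGCCTTACGAA",
}
print(flag_outliers(aln))
```

A rule like this can only catch a few divergent sequences among a consistent majority. When an alignment splits into two or more sets of similar size, no sequence stands out against the median, so those alignments would still need a different criterion or to be dropped as a whole.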

I would appreciate any insight or suggestion about my problem and how I could solve it.

Thank you in advance.

Alignment 1

Alignment 2

Alignment 3

problem phylogenomics BUSCO alignment • 176 views
