Question

BUSCO result interpretation

6.7 years ago

popayekid55 ▴ 110

Dear all,

I have assembled an algal genome and predicted ~9k genes using Augustus. The genes were evaluated using BUSCO. Only 132 genes were assigned to the BUSCO categories (C, S, D, F and M). When I checked C. reinhardtii, around 300 genes were assigned to those categories. But in the article they report that ~80% (I do not recall the exact number) of the genes are complete.

My question is: how should I interpret the BUSCO output?

Thank you

gene genome • 6.6k views

ADD COMMENT • link updated 16 months ago by Ram 44k • written 6.7 years ago by popayekid55 ▴ 110

Somewhat related issue:

A: Is BUSCO really better than CEGMA for genome assembly quality evaluation?

ADD REPLY • link 6.7 years ago by lieven.sterck 15k

I recently was in a similar situation and in the end figured out that BUSCO is not really suited (adequate) for estimating completeness in algal genomes, mainly due to biases in their core gene set.

ADD REPLY • link 6.7 years ago by lieven.sterck 15k

Did you opt for any other approach to address this concern?

ADD REPLY • link 6.7 years ago by popayekid55 ▴ 110

Yes, the approach mentioned in that Veeckman et al. paper. Not sure if that is a public dataset though; they are colleagues of mine here in the lab, so I could easily get hold of it :)

Combined with more classical approaches, such as stats on the fraction of RNA-seq reads mapped (and the genes covered by them, etc.).

ADD REPLY • link 6.7 years ago by lieven.sterck 15k
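As a rough illustration of the "classical" RNA-seq checks mentioned above, here is a minimal Python sketch. It assumes you already have the number of mapped and total reads from your aligner, plus a per-gene read count table from a counting tool; the function names, gene IDs, and the `min_reads` threshold are hypothetical and only serve the example.

```python
def mapping_rate(mapped_reads, total_reads):
    """Fraction of RNA-seq reads that mapped to the assembly."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


def fraction_genes_supported(read_counts, min_reads=1):
    """Fraction of predicted genes covered by at least `min_reads` reads.

    `read_counts` maps a gene ID to its read count (hypothetical input;
    the threshold is an assumption, not a recommended value).
    """
    if not read_counts:
        raise ValueError("read_counts is empty")
    supported = sum(1 for count in read_counts.values() if count >= min_reads)
    return supported / len(read_counts)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    counts = {"g1": 10, "g2": 0, "g3": 3, "g4": 0}
    print(f"mapping rate: {mapping_rate(90, 100):.1%}")
    print(f"genes with read support: {fraction_genes_supported(counts):.1%}")
```

A low mapping rate would hint at missing sequence in the assembly, while a low fraction of supported genes would rather point at the gene prediction (or at genes not expressed in the sampled condition).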

score 0 · Answer 1 · 2018-03-12

6.7 years ago

charles.bridges ▴ 70

Not sure if CheckM will work for you, but I've found it extremely useful for bacterial genomes.

ADD COMMENT • link 6.7 years ago by charles.bridges ▴ 70

I think CheckM is primarily for bacterial and archaeal genomes.

ADD REPLY • link 6.6 years ago by drikaul ▴ 20

Hi,

What is the best output for BUSCO? What does a low number mean? For example: C:18.4%[S:18.4%,D:0.0%],F:10.8%,M:70.8%,n:332

These are my assembly stats:

Total length (>= 1000 bp): 1653501
Contigs: 8
Total length: 1653501
Largest contig: 1488929

Thanks

ADD REPLY • link 4.5 years ago by bioinform_1 • 0
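For readers landing on this thread: the one-line summary quoted above reads as C = complete (split into S = single-copy and D = duplicated), F = fragmented, M = missing, and n = the number of BUSCO groups searched. A minimal Python sketch for turning that line into approximate counts follows; the function name and the regular expression are assumptions written for this summary format, not part of BUSCO itself.

```python
import re

# Matches the one-line summary, e.g.
# C:18.4%[S:18.4%,D:0.0%],F:10.8%,M:70.8%,n:332
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line):
    """Parse a BUSCO one-line summary into percentages and rounded counts."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a BUSCO summary line: {line!r}")
    n = int(match.group("n"))
    result = {"n": n}
    for key in "CSDFM":
        percent = float(match.group(key))
        # Counts are back-calculated from rounded percentages, so approximate.
        result[key] = {"percent": percent, "approx_count": round(percent * n / 100)}
    return result


if __name__ == "__main__":
    summary = parse_busco_summary("C:18.4%[S:18.4%,D:0.0%],F:10.8%,M:70.8%,n:332")
    for key in "CSDFM":
        print(key, summary[key])
```

For the line quoted above this works out to roughly 61 complete, 36 fragmented, and 235 missing out of 332 searched BUSCO groups, so a low C value means most of the expected conserved genes were not recovered.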

Hi bioinform_1,

Just to mention: it's not good practice to add new questions to an existing (old) thread. You're better off opening a new thread with your question.

I'll be happy to reply with advice/answers when you open a new thread ;)

thx.

ADD REPLY • link 4.5 years ago by lieven.sterck 15k