scaffold vs chromosome level assembly
2.4 years ago
jaqx008 ▴ 110

Hello all. I have two genomes from related species. One is assembled to scaffold level and the other is reported on NCBI as assembled to chromosome level. Is it safe to state in an article that the chromosome-level assembly is superior to the scaffold-level one? Do I need to demonstrate this by running BUSCO, or is it safe to draw this conclusion without it?

Thank you

assembly chromosom scaffold • 1.5k views
2.4 years ago
Michael 54k

"Superior" is up to you to define and depends on your criteria. Ideally, if all other parameters are the same, and probably in most cases, one would choose a chromosome-level assembly over a scaffold-level assembly. However, many things can go wrong even with a recent chromosome-level assembly based on long reads, and what that term means is often left to the authors to decide.

For example, I know of a published genome where the authors claim chromosome scale, but completeness measured by BUSCO and by mapping RNA-seq reads is only ~60%. Would you even call that chromosome scale? Contributing to that poor result were improper filtering and, probably, that the authors assumed a given number of chromosomes and threw away any unplaced scaffolds beyond that. Another example is an apparently quite nice, not yet published assembly of a related organism that has high completeness and contiguity but was so affected by bacterial contamination that NCBI twice had to remove a number of contigs.

In addition, although sequencing technology has made big advances, an assembly can still contain errors, and there is natural variation, different strains, etc. A single genome sequence can only be seen as a frozen model of the genomic inventory present in a species. Further, you have to work with what is available: if related species have different assembly levels, then that is what you get.

In conclusion, yes, you should do your own comparisons, or at least compare multiple published measurements, before drawing conclusions about fitness for a given purpose. Use BUSCO, map transcripts back to the assembly, and check other core gene sets to assess completeness, and evaluate contiguity as well. When using BUSCO, keep in mind that it may disadvantage parasite genomes, or more generally genomes undergoing accelerated evolution.
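As a rough illustration of such a comparison, here is a minimal Python sketch that computes N50 as a simple contiguity measure and pulls the percentages out of a BUSCO one-line summary. The function names are made up for this example, and the summary format is assumed to be the usual `C:..%[S:..%,D:..%],F:..%,M:..%,n:...` string from BUSCO's short summary file, so check it against your own output. It is not a substitute for a proper QC tool.

```python
import re


def read_lengths(fasta_lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def parse_busco_line(line):
    """Extract percentages (C, S, D, F, M, ...) from an assumed BUSCO summary line."""
    pairs = re.findall(r"([A-Z]):(\d+(?:\.\d+)?)%", line)
    return {key: float(value) for key, value in pairs}
```

To use it, one could call `n50(read_lengths(open("assembly.fasta")))` for each genome (the file name is a placeholder) and compare the result alongside the `C` and `M` values from `parse_busco_line`. A high N50 combined with a low `C` value is exactly the kind of "chromosome scale but incomplete" case described above.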


Thank you Michael, this is very helpful.
