How to choose the best assembly based on the number of Ns in the sequence, and how to find telomere and centromere markers
3 months ago
Théo • 0


I have 3 FASTA files of a fungal genome assembly, produced by 3 different assemblers. The FASTA files contain some N characters, which represent the many transposable elements in my genomes.

I want to choose the best assembly among the 3 files based on their N content.

Is there a rule saying that the assembly with the fewest Ns is the best?

Is there a cutoff value for the number of Ns?

I also have another question: how can I identify centromeres and telomeres if they are masked because all repeated regions have been replaced with Ns?

Do I need to check the RepeatMasker options?

Thanks for your answers.

centromere assembly telomere genome • 233 views
3 months ago
liorglic ▴ 880

There are several measures of assembly quality, e.g. contiguity (N50, N90), total assembly size, BUSCO score, and the % of Ns in the assembly. If all other stats are similar, then the assembly with the lowest % of Ns should generally be favored.
You can try a tool such as QUAST, which will calculate many assembly stats for you so you can easily compare your results.
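
If you just want a quick look at two of these stats before running a full tool, here is a minimal Python sketch that computes N50 and the % of Ns from FASTA text. It is not what QUAST does internally; the function names (`read_fasta`, `n50`, `assembly_stats`) are made up for this illustration, and it assumes a plain, uncompressed FASTA file.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {sequence_name: sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def n50(lengths: List[int]) -> int:
    """Largest length L such that sequences of length >= L
    cover at least half of the total assembly length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def assembly_stats(records: Dict[str, str]) -> Dict[str, float]:
    """Sequence count, total length, N50 and % of N characters."""
    lengths = [len(seq) for seq in records.values()]
    total = sum(lengths)
    # Count both 'N' and soft-masked 'n'.
    n_count = sum(seq.upper().count("N") for seq in records.values())
    return {
        "sequences": len(lengths),
        "total_length": total,
        "n50": n50(lengths),
        "percent_n": 100.0 * n_count / total if total else 0.0,
    }
```

To use it on a real file, open the file and pass the handle straight in, e.g. `assembly_stats(read_fasta(open("assembly.fasta")))`, where `assembly.fasta` is a placeholder for one of your three assemblies. Run it on each file and compare the numbers side by side.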

