I understand the meaning of N50 contig length, but lately I've been coming across assembly statistics that report the N50 contig number. Can someone explain what is meant by N50 contig number and how it relates to assembly quality?
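N50 length is the length of the contig that, when contigs are added up from longest to shortest, first brings the running total to at least half of the total assembly length. N50 contig number (i.e. N50) is just the rank of that contig in the sorted list.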
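Example contig lengths (bp), sorted from longest to shortest: 91 77 70 69 62 56 45 29 16 4

N50 vs N50 length

Technically N50, as opposed to N50 length, refers to the ordinal of the last contig that pushes the running total past half of the assembly. In this example it is 4: the contigs total 519 bp, the three largest sum to 238 bp (short of the 259.5 bp halfway mark), and adding the 4th largest contig (69 bp) brings the total to 307 bp. Unfortunately, the two numbers move in opposite directions: a better assembly has a longer N50 length but a lower N50. Some papers refer to N50 length as L50, while most have simply followed the lazy convention of dropping "length" from "N50 length". I think it is important to include units with your N50 to minimize confusion.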
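The example above can be checked with a short script. This is only a minimal sketch: the function name `n50_stats` is made up for illustration, and it assumes the "at least half of the total length" convention (some tools may use a strict "more than half" instead).

```python
def n50_stats(contig_lengths):
    """Return (N50 length, N50 contig number) for a list of contig lengths.

    Assumption: the N50 contig is the first one whose cumulative length,
    counting from the longest contig down, reaches at least half the total.
    """
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, rank
    raise ValueError("need at least one contig")


contigs = [91, 77, 70, 69, 62, 56, 45, 29, 16, 4]
n50_length, n50_number = n50_stats(contigs)
print(n50_length, n50_number)  # 69 4
```

Reporting both values together, with units on the length, avoids the ambiguity described above.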
Recently there seems to be a nice change in nomenclature:
You can speak of L50 and N50:
- L50 is the old N50 contig length.
- N50 is the number of contigs >= L50.
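As a rough illustration of the counting definition in the list above, the sketch below counts contigs at least as long as a given length. The helper name `count_at_least` is hypothetical, and the threshold of 69 bp is taken from the ten-contig example earlier in the thread. Note that when several contigs are tied at the threshold length, this count can be larger than the rank of the contig that crosses the halfway mark, so the two definitions are not always identical.

```python
def count_at_least(contig_lengths, threshold):
    """Count contigs whose length is >= threshold (the 'number of contigs' definition)."""
    return sum(1 for length in contig_lengths if length >= threshold)


# Ten-contig example from this thread; 69 bp is the length called L50 in the list above.
contigs = [91, 77, 70, 69, 62, 56, 45, 29, 16, 4]
print(count_at_least(contigs, 69))  # 4

# Tied lengths: the halfway mark (20 of 40 bp) is reached at the 2nd contig,
# but all four contigs are >= 10 bp, so the count is 4 rather than 2.
tied = [10, 10, 10, 10]
print(count_at_least(tied, 10))  # 4
```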
At first I was very confused about the change of the meaning of N50 but now I am used to the names. It is very helpful to distinguish the different meanings.
I'm guessing that N50 contig number is the number of contigs with length >= N50 length. Presumably, as an assembly improves, one would like the total number of contigs to decrease and the proportion of contigs longer than the N50 length to increase.