BBmap statistic values
19 months ago
Dũng • 0

Hello everyone, I have a naive question about the statistics reported by the BBTools package.

A       C       G       T       N       IUPAC   Other   GC      GC_stdev
0.2781  0.2185  0.2201  0.2833  0.0000  0.0000  0.0000  0.4386  0.0742

Main genome scaffold total:             654980
Main genome contig total:               654980
Main genome scaffold sequence total:    440.298 MB
Main genome contig sequence total:      440.298 MB      0.000% gap
Main genome scaffold N/L50:             78917/966
Main genome contig N/L50:               78917/966
Main genome scaffold N/L90:             476199/274
Main genome contig N/L90:               476199/274
Max scaffold length:                    285.103 KB
Max contig length:                      285.103 KB
Number of scaffolds > 50 KB:            75
% main genome in scaffolds > 50 KB:     1.33%


Minimum         Number          Number          Total           Total           Scaffold
Scaffold        of              of              Scaffold        Contig          Contig
Length          Scaffolds       Contigs         Length          Length          Coverage
--------        --------------  --------------  --------------  --------------  --------
    All                654,980         654,980     440,298,315     440,298,315   100.00%
     50                654,980         654,980     440,298,315     440,298,315   100.00%
    100                651,786         651,786     440,017,984     440,017,984   100.00%
    250                557,284         557,284     417,980,258     417,980,258   100.00%
    500                191,879         191,879     295,948,179     295,948,179   100.00%
   1 KB                 75,404          75,404     216,699,604     216,699,604   100.00%
 2.5 KB                 22,484          22,484     137,539,271     137,539,271   100.00%
   5 KB                  8,403           8,403      89,136,159      89,136,159   100.00%
  10 KB                  2,650           2,650      49,849,853      49,849,853   100.00%
  25 KB                    428             428      17,517,791      17,517,791   100.00%
  50 KB                     75              75       5,861,757       5,861,757   100.00%
 100 KB                     10              10       1,496,255       1,496,255   100.00%
 250 KB                      1               1         285,103         285,103   100.00%

+ Main genome scaffold sequence total: 440.298 MB: Can I take this as my genome size, or do I need to use another tool to calculate the genome size?
+ Main genome scaffold N/L50: 78917/966: Is the first value (78917) the N50 length?

Many thanks in advance! Stephen..

BBmap stats.sh • 1.2k views

Hello Dũng,

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

Thank you!

19 months ago

No, there is no such thing as N50 length (to my best knowledge).

contig N/L50: 78917/966 means that you have 78917 contigs in total (n=78917). If you sort them by length in a decreasing order, you will need the top 966 longest contigs out of the 78917 total to get to a combined sequence length of 50%. The L50 and L90 values are quantiles of an empirical cumulative distribution function (eCDF) describing the length distribution of your contigs.

Since the number of scaffolds and contigs is identical (and the L50/L90 values are not satisfactory), you will now need to proceed to scaffolding using Hi-C data. (Consult the Genome Assembly Cookbook for further information.)


Yes, there is: N50 denotes the length, and L50 the number of contigs.

https://en.wikipedia.org/wiki/N50,_L50,_and_related_statistics

From those numbers, the L50 is 78917 (i.e. you need the 78917 longest contigs to reach 50% of the genome/assembly) and the N50 is 966 (the length of the L50-th contig, the shortest one in that set).

It seems, though, that this output uses the two labels the wrong way round (they should be switched).

And yes, it would have made much more sense to use L50 for the length and N50 for the number :)
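
For anyone who wants to check these values by hand, here is a minimal sketch of the usual definition (N50 = a length, L50 = a count). The function name `n50_l50` and the toy contig lengths are made up for illustration; this is not how stats.sh is implemented internally. As a sanity check against the table in the question: contigs of at least 1 KB add up to 216,699,604 bp, just under half of 440,298,315 bp, so the halfway point is reached by contigs slightly shorter than 1 KB, which agrees with a length of 966.

```python
def n50_l50(lengths, fraction=0.5):
    """Return (Nxx, Lxx) for a list of contig lengths.

    Nxx: length of the contig at which the running total, taken over
         contigs sorted from longest to shortest, first reaches
         `fraction` of the total assembly length.
    Lxx: how many contigs were needed to get there.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Toy example (hypothetical lengths, total = 290)
contigs = [80, 70, 50, 40, 30, 20]
print(n50_l50(contigs))        # N50 and L50
print(n50_l50(contigs, 0.9))   # N90 and L90
```

With real data you would pass in the lengths of all sequences in the assembly FASTA; the larger of the two numbers in the "N/L50" line of the output above should then correspond to the count and the smaller one to the length.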
