Question: Comparing assembly size, contiguity and BUSCO stats across the genomes of multiple isolates
nagarsaggi wrote (13 months ago):

I have SPAdes assemblies of 109 samples of a plant-pathogenic fungus, and I have run BUSCO on all of the isolates. I want to compare the size and contiguity of each assembly with the size of its input data. How do I calculate the assembly stats for each isolate and extract them in tabular form? I also want to compare the assembly sizes with the BUSCO stats (complete, partial and duplicated BUSCOs), so how do I extract the BUSCO stats from the "short summary file" into a table with one row per isolate?

jean.elbers wrote (13 months ago):

You could combine a bit of bash scripting with the bbstats.sh or statswrapper.sh scripts from BBTools/BBMap (https://sourceforge.net/projects/bbmap/) to get the assembly statistics. Note that these scripts swap the N50 and L50 values relative to their usual definitions, and likewise N90 and L90. Extracting the BUSCO stats is more of a text-manipulation job for the GNU core utilities, Perl, sed, awk, etc. If you post an example of the output and of the table you want, perhaps someone can help you write a quick script to produce it.
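
As a starting point, here is a minimal Python sketch of both steps (not a BBTools replacement). It computes size and contiguity from a FASTA file, using the usual definitions (N50 is a length, L50 is a contig count), and pulls the counts out of a BUSCO short summary. Two things are assumptions: the function and column names are made up for illustration, and the count lines in the summary are assumed to look like `286   Complete BUSCOs (C)`, which can differ between BUSCO versions, so check the labels against one of your own files.

```python
import csv
import re
from pathlib import Path

# Assumed labels of the count lines in a BUSCO short summary file.
# Adjust these if your BUSCO version words them differently.
BUSCO_LABELS = {
    "Complete BUSCOs (C)": "complete",
    "Complete and single-copy BUSCOs (S)": "single_copy",
    "Complete and duplicated BUSCOs (D)": "duplicated",
    "Fragmented BUSCOs (F)": "fragmented",
    "Missing BUSCOs (M)": "missing",
    "Total BUSCO groups searched": "total",
}
COLUMNS = ["isolate", "total_bp", "contigs", "longest_bp", "N50", "L50"]
COLUMNS += list(BUSCO_LABELS.values())


def contig_lengths(fasta_text):
    """Return the length of every sequence in FASTA-formatted text."""
    lengths = []
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths):
    """Total size, contig count, longest contig, N50 (length) and L50 (count)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = l50 = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running * 2 >= total:  # reached half of the assembly size
            n50, l50 = length, count
            break
    return {
        "total_bp": total,
        "contigs": len(ordered),
        "longest_bp": ordered[0] if ordered else 0,
        "N50": n50,
        "L50": l50,
    }


def parse_busco_summary(text):
    """Extract the BUSCO counts from the text of a short summary file."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)\s+(.+?)\s*$", line)
        if match and match.group(2) in BUSCO_LABELS:
            counts[BUSCO_LABELS[match.group(2)]] = int(match.group(1))
    return counts


def write_table(samples, out_path):
    """Write one tab-separated row per (isolate, fasta_path, summary_path)."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=COLUMNS, delimiter="\t", restval="NA"
        )
        writer.writeheader()
        for isolate, fasta_path, summary_path in samples:
            row = {"isolate": isolate}
            # Reads the whole assembly into memory; fine for fungal genomes.
            row.update(assembly_stats(contig_lengths(Path(fasta_path).read_text())))
            row.update(parse_busco_summary(Path(summary_path).read_text()))
            writer.writerow(row)
```

You would call it with a list of tuples, for example `write_table([("isolate1", "isolate1/contigs.fasta", "isolate1/short_summary.txt")], "stats.tsv")`, where the paths are placeholders for wherever your SPAdes and BUSCO outputs actually live. Any BUSCO field that is not found in a summary is written as `NA`.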
