Question: How to annotate a eukaryotic genome with the Comparative Annotation Toolkit (CAT) using BUSCO output?
15 months ago, kabir.deb0353 wrote:

Hello, I'm very new to eukaryotic genome annotation. Searching online, I found that the Comparative Annotation Toolkit (CAT) is considered a very good tool for this. I installed CAT with Anaconda in a separate python=2.7 environment using the commands below; BUSCO is installed in the base environment.

$ conda create -n catpy27 python=2.7 pip
$ conda activate catpy27
$ conda install -c bioconda comparative-annotation-toolkit

Now I don't understand what to do next. Could anyone familiar with this kind of tool suggest a good tutorial or manual for eukaryotic genome annotation using CAT?


Hi, thank you very much for your prompt reply. I already have BUSCO output, which includes the AUGUSTUS results as well as the BUSCO ortholog genes. What I can't work out is how to relate that BUSCO output (ortholog genes, AUGUSTUS output, etc.) to the Comparative Annotation Toolkit (CAT).

Reply written 15 months ago by kabir.deb0353

How many genomes are you annotating, and are there high-quality reference genomes/annotations that CAT can use?

One way to relate your BUSCO output to CAT is to reuse the AUGUSTUS parameters that BUSCO retrains. Run BUSCO on the assembly with the --long option, which turns on AUGUSTUS optimization during retraining. Then pass the retrained parameters to CAT as a custom species by specifying --augustus-species CUSTOM_SPE. Note that you need to move the files from the retraining_parameters directory in the BUSCO results into the species directory under your AUGUSTUS config path and rename them, replacing the prefix with your species name; for example, augustus/config/species/CUSTOM_SPE/CUSTOM_SPE_exon_probs.pbl and augustus/config/species/CUSTOM_SPE/CUSTOM_SPE_intron_probs.pbl.
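The copy-and-rename step described above could be sketched roughly as follows. This is only an illustration, not an official BUSCO or CAT procedure: the function name install_busco_params, the example prefix BUSCO_myrun and all paths are hypothetical, and the actual prefix of the files in retraining_parameters depends on your BUSCO run. The sketch also rewrites the old prefix inside any .cfg files, on the assumption that they refer to the .pbl files by name; check your own files before relying on that.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: copy BUSCO's retrained AUGUSTUS files into
# <augustus config>/species/<species>/ and swap the file-name prefix.
# Usage: install_busco_params SRC_DIR OLD_PREFIX SPECIES AUGUSTUS_CONFIG_DIR
install_busco_params() {
  local src_dir="$1" old_prefix="$2" species="$3" config_dir="$4"
  local dest="$config_dir/species/$species"
  local f base

  mkdir -p "$dest"

  # Copy every file that starts with the old prefix, renaming it on the way,
  # e.g. BUSCO_myrun_exon_probs.pbl -> CUSTOM_SPE_exon_probs.pbl
  for f in "$src_dir/$old_prefix"*; do
    [ -e "$f" ] || continue
    base="$(basename "$f")"
    cp "$f" "$dest/${species}${base#"$old_prefix"}"
  done

  # Assumption: .cfg files may mention the old prefix (for example in the
  # names of the .pbl files), so replace it there as well.
  for f in "$dest"/*.cfg; do
    [ -e "$f" ] || continue
    sed "s/$old_prefix/$species/g" "$f" > "$f.tmp"
    mv "$f.tmp" "$f"
  done
}

# Example call (all values are placeholders for your own run):
# install_busco_params run_myrun/augustus_output/retraining_parameters \
#     BUSCO_myrun CUSTOM_SPE "$AUGUSTUS_CONFIG_PATH"
```

After this, --augustus-species CUSTOM_SPE should resolve to the new directory, provided CAT's AUGUSTUS uses the same config path.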

If you run CAT via Docker, you need to rebuild the image so that it includes your custom parameter files. See the discussion: Custom Augustus training parameters for --augustus-species....

Reply written 15 months ago by AK
15 months ago, AK wrote:

First, have a look at the CAT repository and its section "Running the pipeline", then use the test_data to get it working. From experience, I would suggest running CAT via Docker.

Once the test data works, you will see that CAT needs a whole-genome alignment in .hal format (e.g. vertebrates.hal in the test data) before it can run. You can produce that alignment with Cactus, which has its own test data; you will need a species tree for it.
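As a rough sketch of what Cactus expects: its input is a plain-text "seqFile" whose first line is the species tree in Newick format, followed by one "name path" line per genome. The helper below only writes such a file; the function name write_seqfile, the species names, branch lengths and FASTA paths are all made up for illustration, and the exact cactus command line may differ between Cactus versions, so treat the commented invocation as an assumption to check against the Cactus documentation.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a Cactus seqFile.
# Usage: write_seqfile OUT_FILE NEWICK_TREE name=path [name=path ...]
write_seqfile() {
  local out="$1" tree="$2"
  shift 2
  local pair

  # First line: the species tree in Newick format.
  printf '%s\n' "$tree" > "$out"

  # Remaining lines: genome name, a space, then the path to its FASTA file.
  # Names must match the leaf names used in the tree.
  for pair in "$@"; do
    printf '%s %s\n' "${pair%%=*}" "${pair#*=}" >> "$out"
  done
}

# Example (placeholder species, branch lengths and paths):
# write_seqfile seqFile.txt '((speciesA:0.1,speciesB:0.1):0.05,speciesC:0.2);' \
#     speciesA=/data/speciesA.fa speciesB=/data/speciesB.fa speciesC=/data/speciesC.fa
#
# Then, assuming the basic Cactus invocation (job store, seqFile, output HAL):
# cactus ./jobStore seqFile.txt my_genomes.hal
```

The resulting .hal file is what you would then pass to CAT in place of vertebrates.hal from the test data.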

If you can't get the test data working for CAT or Cactus, have a look at the alternative method, Multi-Genome Annotation with AUGUSTUS; the instructions there might be easier to follow.
