Question

How to analyse a eukaryotic genome with the Comparative Annotation Toolkit (CAT) using BUSCO output?

4.8 years ago

kabir.deb ▴ 80

Hello, I'm very new to eukaryotic genome annotation. From searching online, I found that the Comparative Annotation Toolkit (CAT) is a very good tool for this. I installed CAT with Anaconda in a separate python=2.7 environment using the following commands; BUSCO is installed in the base environment.

$ conda create -n catpy27 python=2.7 pip
$ conda activate catpy27
$ conda install -c bioconda comparative-annotation-toolkit

Now I don't understand what to do next. If anyone is familiar with this kind of tool, please suggest a good tutorial or manual for eukaryotic genome annotation using CAT.

alignment genome sequence sequencing • 2.0k views

Hi, thank you very much for your prompt reply. I already have BUSCO output, which contains the AUGUSTUS results as well as the BUSCO ortholog genes. What I can't work out is how to relate my BUSCO output (ortholog genes, AUGUSTUS output, etc.) to the Comparative Annotation Toolkit (CAT).

ADD REPLY • link 4.8 years ago by kabir.deb ▴ 80

How many genomes are you annotating, and are there high-quality reference genomes/annotations available for CAT?

One way to relate your BUSCO output to CAT is to reuse the AUGUSTUS parameters that BUSCO retrains during its run. Add the --long option when you run BUSCO on an assembly so that AUGUSTUS is optimized during retraining. Then reuse the retrained parameters as a custom species for CAT by specifying --augustus-species CUSTOM_SPE. Note that you need to move the files from the retraining_parameters directory in the BUSCO results to the species directory in your AUGUSTUS config path and rename them, replacing the prefix with your species name; for example, make them augustus/config/species/CUSTOM_SPE/CUSTOM_SPE_exon_probs.pbl and augustus/config/species/CUSTOM_SPE/CUSTOM_SPE_intron_probs.pbl.
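The move-and-rename step could be sketched roughly as below. This is only a sketch under assumptions: install_busco_params is a hypothetical helper name, the BUSCO file prefix (something like BUSCO_myrun_1234567) has to be read off your own retraining_parameters directory, and the prefix is assumed to contain only letters, digits and underscores. The sketch also rewrites the old prefix inside the copied files, on the assumption that the parameters .cfg file refers to the other files by name.

```shell
#!/bin/sh
# Hypothetical helper (not part of BUSCO, AUGUSTUS or CAT).
# Usage: install_busco_params SRC_DIR OLD_PREFIX SPECIES AUGUSTUS_CONFIG_PATH
# Copies SRC_DIR/OLD_PREFIX* into AUGUSTUS_CONFIG_PATH/species/SPECIES/,
# renaming each file so that OLD_PREFIX becomes SPECIES.
install_busco_params() (
    src_dir="$1"
    old_prefix="$2"      # assumed to contain only letters, digits, underscores
    species="$3"
    config_path="$4"
    dest_dir="$config_path/species/$species"

    mkdir -p "$dest_dir" || exit 1

    for f in "$src_dir/$old_prefix"*; do
        [ -f "$f" ] || continue
        base=$(basename "$f")
        # Strip the old prefix from the file name and prepend the species name.
        new_name="${species}${base#"$old_prefix"}"
        # Copy the file, also replacing the old prefix inside its contents
        # (assumption: the .cfg file names the other parameter files).
        sed "s/${old_prefix}/${species}/g" "$f" > "$dest_dir/$new_name" || exit 1
    done
)

# Example call (paths and prefix are placeholders, adjust to your run):
# install_busco_params path/to/retraining_parameters BUSCO_myrun_1234567 \
#     CUSTOM_SPE "$AUGUSTUS_CONFIG_PATH"
```

After this, the species directory should contain files such as CUSTOM_SPE_exon_probs.pbl, and CUSTOM_SPE can be given to --augustus-species.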

If you run CAT via Docker, you need to rebuild the image so that it includes your custom parameter files. See discussions: Custom Augustus training parameters for --augustus-species....

ADD REPLY • link 4.8 years ago by AK ★ 2.2k

score 1 · Answer 1 · 2019-07-13

First, have a look at the repo: https://github.com/ComparativeGenomicsToolkit/Comparative-Annotation-Toolkit and its section "Running the pipeline", then use the test_data to get it working. From experience, I would suggest running CAT via Docker.

Once you get the test data working, you'll see that you need a whole-genome alignment in .hal format (e.g. vertebrates.hal in the test data) before running CAT. You can produce this alignment by running Cactus. Cactus also comes with test data, and you'll need a species tree for it.
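As a rough illustration of where the species tree goes: Cactus reads a plain-text sequence file whose first line is the Newick tree, followed by one "name path" line per genome. The sketch below writes such a file; write_seqfile is a hypothetical helper, and the species names, branch lengths and FASTA paths are placeholders, not values from this thread.

```shell
#!/bin/sh
# Hypothetical helper: write a Cactus-style sequence file.
# Usage: write_seqfile OUT_FILE NEWICK_TREE NAME=FASTA_PATH [NAME=FASTA_PATH ...]
write_seqfile() (
    out="$1"
    tree="$2"
    shift 2

    # First line: the species tree in Newick format.
    printf '%s\n' "$tree" > "$out" || exit 1

    # Remaining lines: genome name and path to its FASTA, separated by a space.
    for pair in "$@"; do
        printf '%s %s\n' "${pair%%=*}" "${pair#*=}" >> "$out" || exit 1
    done
)

# Example (placeholder names and paths):
# write_seqfile seqFile.txt '((speciesA:0.1,speciesB:0.1):0.05,speciesC:0.2);' \
#     speciesA=genomes/speciesA.fa speciesB=genomes/speciesB.fa speciesC=genomes/speciesC.fa
# cactus ./jobStore seqFile.txt output.hal
```

The genome names in the tree must match the names on the following lines, and the resulting .hal file is what CAT takes as its alignment input.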

If, unfortunately, you can't get the test data working for CAT or Cactus, have a look at an alternative method: Multi-Genome Annotation with AUGUSTUS. The instructions there might be easier to follow.