Question: Get BUSCO gene descriptions
3.9 years ago by
pbigbig200
United States
pbigbig200 wrote:

Hi everyone,

I am planning to design primers (for Sanger sequencing) to assess a de novo genome assembly. The primers could be chosen arbitrarily, but I would prefer the sequenced regions to be meaningful, so I ran BUSCO with the eukaryote set (~400 single-copy orthologs) on the assembly. The run reported ~60% complete single-copy BUSCOs, but how can I find the names and descriptions of the orthologs in the eukaryote set? The results contain only alignments and numeric codes for the matches. I would really appreciate any help.

Thank you very much in advance!

busco de novo assembly • 2.2k views
modified 6 months ago • written 3.9 years ago by pbigbig200

I'm also very interested in this. Have you found an answer?

written 3.6 years ago by twooldridge0

Sadly not yet. As a workaround, I took the orthologs' FASTA sequences from the BUSCO results, BLASTed them against the RefSeq database to get the best-hit accession ID for each, and then looked up the descriptive titles for that list of IDs with Batch Entrez (http://www.ncbi.nlm.nih.gov/sites/batchentrez).
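
For anyone who wants to script the best-hit step of this workaround, here is a minimal Python sketch. It assumes the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`), where column 1 is the query ID, column 2 the subject ID, and column 12 the bit score. The function names and the example IDs are made up for illustration and are not part of BUSCO or BLAST.

```python
import csv
import io


def best_hits(blast_tab):
    """Return {query_id: (subject_id, bitscore)}, keeping the top-scoring hit per query.

    blast_tab is any iterable of lines in BLAST tabular format (-outfmt 6).
    """
    best = {}
    for row in csv.reader(blast_tab, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def accession(subject_id):
    """Strip pipe-delimited wrappers such as 'ref|NP_000001.1|'; plain IDs pass through."""
    parts = [p for p in subject_id.split("|") if p]
    return parts[-1] if parts else subject_id


# Hypothetical example rows (IDs and scores are invented for illustration).
example = io.StringIO(
    "busco_0001\tref|NP_000001.1|\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-90\t310\n"
    "busco_0001\tref|NP_000002.1|\t70.0\t200\t60\t0\t1\t200\t1\t200\t1e-50\t190\n"
    "busco_0002\tXP_000003.1\t92.0\t150\t12\t0\t1\t150\t1\t150\t1e-80\t280\n"
)

# One accession per line, ready to save as a text file of IDs.
for query, (subject, score) in best_hits(example).items():
    print(accession(subject))
```

The printed list can be saved to a text file and uploaded to Batch Entrez to retrieve the descriptive titles. Picking by bit score is one reasonable choice; selecting by lowest e-value would be an equally valid variant.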

written 3.6 years ago by pbigbig200
6 months ago by
thackl2.8k
MIT
thackl2.8k wrote:

I just came across the same issue and came up with a solution. Most BUSCO data sets are generated from OrthoDB, so you can query OrthoDB via its API with the BUSCO IDs and pull the gene names and descriptions. I've posted a short R snippet that automates this and produces a nice table: https://thackl.github.io/BUSCO-gene-descriptions
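
As a rough illustration of the first half of that approach (this is not the linked R snippet), the Python sketch below collects the IDs of complete BUSCOs from a BUSCO full table and turns them into lookup URLs. It assumes the full table is tab-separated with the BUSCO ID in column 1 and the status in column 2, and that header lines start with `#`. The URL template is deliberately left as a caller-supplied placeholder: the real OrthoDB endpoint and the shape of its JSON response should be taken from the OrthoDB API documentation, and `fetch_json` simply returns whatever the server sends back.

```python
import json
import urllib.request


def complete_busco_ids(full_table_lines):
    """Return BUSCO IDs whose status is 'Complete', in file order, without repeats."""
    ids = []
    seen = set()
    for line in full_table_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header/comment lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1] == "Complete" and fields[0] not in seen:
            seen.add(fields[0])
            ids.append(fields[0])
    return ids


def lookup_urls(busco_ids, url_template):
    """Build one URL per BUSCO ID; url_template must contain '{busco_id}'."""
    return {b: url_template.format(busco_id=b) for b in busco_ids}


def fetch_json(url, timeout=30):
    """Fetch a URL and decode the body as JSON (response layout is not assumed here)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Hypothetical table rows and a placeholder URL template, for illustration only.
table = [
    "# Busco id\tStatus\tContig\tStart\tEnd\tScore\tLength\n",
    "ID_0001\tComplete\tscaffold_1\t100\t900\t512.3\t266\n",
    "ID_0002\tMissing\n",
    "ID_0003\tComplete\tscaffold_7\t5000\t6400\t801.0\t466\n",
]
urls = lookup_urls(complete_busco_ids(table), "https://example.org/group?id={busco_id}")
for busco_id, url in urls.items():
    print(busco_id, url)
```

Keeping URL construction separate from the network call makes it easy to check the IDs first and to swap in the correct endpoint once it has been confirmed.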

written 6 months ago by thackl2.8k

Oh great! Thank you very much! Although the post is from a long time ago, I think this is still very useful for other de novo genome projects.

written 6 months ago by pbigbig200

Yeah, I was hoping you had moved on by now ;)

written 6 months ago by thackl2.8k
15 months ago by
william.imart0 wrote:

If you load the FASTA sequence into IGV, you can view the entire genome alongside the genes it codes for. From there you can look up the name and function of each of these genes.

written 15 months ago by william.imart0