Question: Best Genome Assembler and Genome Annotation tools and pipelines
14 months ago, margab wrote:

I want to assemble and annotate the kiwi (bird) genome. What are the best genome assembly tool and genome annotation pipeline I can use?

We have 11 libraries with several insert sizes, built from Apteryx mantelli genomic DNA: 83 billion base pairs (Gb) were sequenced from small insert-size libraries and 120 Gb from large-insert mate-pair Illumina libraries. The kiwi's genome size is about 1.6 Gb. The assembled contigs and scaffolds cover approximately 96% of the complete genome, with an average sequence coverage of 35.85-fold after correction.
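
As a rough sanity check on these numbers, raw fold-coverage is simply total sequenced bases divided by genome size. A minimal sketch follows; the function name is illustrative only, and the 35.85-fold figure quoted above is after correction, so it is expected to be well below the raw value.

```python
def fold_coverage(total_bases_gb, genome_size_gb):
    """Raw sequencing depth: total sequenced bases divided by genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome size must be positive")
    return total_bases_gb / genome_size_gb


# Figures taken from the post above (all in Gb).
small_insert_gb = 83.0
mate_pair_gb = 120.0
genome_gb = 1.6

raw = fold_coverage(small_insert_gb + mate_pair_gb, genome_gb)
print(f"raw coverage: {raw:.1f}x")
```

Mate-pair libraries mostly contribute scaffolding information rather than contig-building depth, so the two library types are often considered separately when judging whether coverage is sufficient.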

The tools I have found so far are MaSuRCA, Platanus, ALLPATHS-LG and ABySS for genome assembly, and the BRAKER2, MAKER and CAT pipelines for genome annotation. For the de novo gene prediction and annotation we can also provide 47.5 Gb of transcript sequence data from kiwi embryonic tissue, together with the de novo gene predictions and protein evidence from three well-annotated bird species.

Thank you in advance.

My goal is to re-assemble and re-annotate the kiwi genome from the sequencing data provided by this article: DOI 10.1186/s13059-015-0711-4. I want to use new tools and pipelines in order to increase the efficiency of the assembly and the annotation.


I'm a little confused about your question: it sounds like you already have a lot done (e.g. you seem to already have an assembly). So what is the goal of looking for other software? What have you used so far, by the way? Are you not satisfied with the current results?

written 14 months ago by lieven.sterck

My goal is to re-assemble and re-annotate the kiwi genome from the sequencing data provided by this article: DOI 10.1186/s13059-015-0711-4. I want to use new tools and pipelines in order to increase the efficiency of the assembly and the annotation. In this paper they used SOAPdenovo2, MAKER and AUGUSTUS.

written 14 months ago by margab
14 months ago, colindaven (Hannover Medical School) wrote:

There is no "best" assembler for all datasets. I would recommend SOAPdenovo2 to start with. ABySS is also good, though in my experience it produced shorter but highly accurate contigs. ALLPATHS-LG requires particular data, I believe, so I haven't used it.

Two comments:

  • The mate-pairs are critical to your analysis. Please check for and remove duplicates from these data; mate-pair libraries are known to have a very high duplicate content.
  • Long reads are far, far better than short reads for assembly. Why aren't you using them? An assembly would be greatly improved by adding a couple of MinION or PromethION/PacBio runs from a decent service provider to make long and unfragmented contigs.
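
To illustrate what duplicate removal means for paired reads, here is a minimal sketch that keeps only the first occurrence of each exact (read 1, read 2) sequence pair. It assumes the FASTQ records have already been parsed into tuples; the function name and record layout are hypothetical, and dedicated deduplication tools handle large files, near-duplicates and sequencing errors far better than this.

```python
def dedupe_pairs(pairs):
    """Keep the first occurrence of each exact (seq1, seq2) read pair.

    `pairs` is an iterable of ((name1, seq1, qual1), (name2, seq2, qual2)).
    """
    seen = set()
    unique = []
    for r1, r2 in pairs:
        key = (r1[1], r2[1])  # the sequences of both mates identify a duplicate
        if key in seen:
            continue
        seen.add(key)
        unique.append((r1, r2))
    return unique


# Toy input: pair "b" is an exact duplicate of pair "a"; pair "c" shares
# read 1 with them but has a different mate, so it is kept.
pairs = [
    (("a/1", "ACGT", "IIII"), ("a/2", "TTGA", "IIII")),
    (("b/1", "ACGT", "IIII"), ("b/2", "TTGA", "IIII")),
    (("c/1", "ACGT", "IIII"), ("c/2", "GGCA", "IIII")),
]
kept = dedupe_pairs(pairs)
duplicate_rate = 1 - len(kept) / len(pairs)
print(len(kept), f"{duplicate_rate:.0%}")
```

Keying on both mates together matters: two pairs that share only one read are not duplicates, because the second mate still carries independent linking information.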

Also, a kiwi genome was already sequenced and assembled, in a highly fragmented fashion, several years ago; maybe these data are useful.


I will use the data from the paper you mentioned (DOI 10.1186/s13059-015-0711-4) and try to improve the results of the assembly and the annotation using other tools and pipelines, or improved versions of these if there are any (that's why I didn't mention SOAPdenovo2).

written 14 months ago by margab
14 months ago, brian.fristensky (University of Manitoba, Canada) wrote:

For de novo genome assembly, you may wish to try the BIRCH system, which runs many of the popular assembly steps: read quality checking, trimming, error correction, and de novo assembly using SOAPdenovo2, SPAdes or ABySS. It also generates reports on assembly quality using QUAST. All of these steps can be done using our BioLegato graphical interface. See the tutorial at http://home.cc.umanitoba.ca/~psgendb/birchhomedir/BIRCHDEV/public_html/tutorials/bioLegato/genome_assembly/genome.html, and a video demonstrating how BioLegato makes it easy to do these tasks at https://www.youtube.com/watch?v=56T05sOcODI.
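
One of the headline numbers in assembly quality reports is N50, which is handy when comparing the output of different assemblers on the same data. A minimal sketch of how it is computed from a list of contig lengths follows; the function name is illustrative, not part of any tool mentioned here.

```python
def n50(lengths):
    """Length of the contig at which the running total, going from the
    longest contig to the shortest, first reaches half the assembly size."""
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly of 300 bp in total: 100 + 80 = 180 >= 150, so N50 is 80.
contigs = [100, 80, 50, 40, 30]
print(n50(contigs))
```

A higher N50 only indicates better contiguity; it says nothing about correctness, so it is best read alongside misassembly and completeness metrics.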
