Question

PacBio Genome assembly

0

6.1 years ago

julienlevy • 0

Hello,

I am doing a genome assembly. I have run Canu and I am not sure what to look for in its output. I got 198.87x coverage. (1) How do I evaluate my Canu output?

And these are my read lengths (which seem small?):

Read length histogram (one '*' equals 48989.4 reads):
--        0   4999 3429258 **********************************************************************
--     5000   9999 2856365 **********************************************************
--    10000  14999 2424194 *************************************************
--    15000  19999 919354 ******************
--    20000  24999 341740 ******
--    25000  29999 120067 **

I am going to do a polishing step with pbalign. I also have a transcriptome available that I am planning to BLAST against my assembly.

(2) What is the best tool for using the transcriptome to improve the assembly / annotate the genes? (3) What should I do next?

thanks

Assembly pacbio genome alignment gene • 1.6k views

ADD COMMENT • link updated 6.1 years ago by GenoMax 142k • written 6.1 years ago by julienlevy • 0

0

Some threads to consider looking through:
What can I do after my Pacbio genome assembly ?
Polish PacBio assembly with latest PacBio tools : an affordable solution for everyone

Not sure what the size of the genome you are working with is, but MAKER or MAKER-P can be used for annotation.

ADD REPLY • link 6.1 years ago by GenoMax 142k

0

Thanks, the genome is 450 Gb.

ADD REPLY • link 6.1 years ago by julienlevy • 0

1

Are you sure? Is it really 150x the human genome? Isn't it 450 Mb?

ADD REPLY • link 6.1 years ago by h.mon 35k

0

You know, H. sapiens is not really the big dude when it comes to genome size :P

ADD REPLY • link 6.1 years ago by cschu181 ★ 2.8k

0

How come? We are the most complex organism ever created, we surely have the biggest genome, with the most genes, don't we?

On a serious note, though, I just googled for the biggest genome and was flabbergasted to discover genomes in the range of 150-250 billion base pairs. I was stuck at the loblolly pine's 22 Gb genome.

ADD REPLY • link 6.1 years ago by h.mon 35k

0

But 200x coverage of a 450Gb genome is... a whole lot of sequencing data o.O

And bigger than Paris japonica, the biggest genome currently known. So either OP made a mistake or is working on the biggest, meanest, most impressive genome ever.
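
A rough way to sanity-check these numbers is the relation coverage = total bases / genome size. The Python sketch below is only an approximation: it assumes the histogram in the question lists all of the reads, and it treats every read as having the midpoint length of its bin. The function names are made up for illustration and are not part of Canu.

```python
# Read-length bins copied from the histogram in the question:
# (bin start, bin end, number of reads).
BINS = [
    (0, 4999, 3429258),
    (5000, 9999, 2856365),
    (10000, 14999, 2424194),
    (15000, 19999, 919354),
    (20000, 24999, 341740),
    (25000, 29999, 120067),
]
REPORTED_COVERAGE = 198.87  # coverage quoted in the question


def estimate_total_bases(bins):
    """Approximate total sequenced bases, treating each read as its bin midpoint."""
    return sum(count * (start + end) / 2 for start, end, count in bins)


def implied_genome_size(total_bases, coverage):
    """Genome size consistent with a yield and a coverage (size = bases / coverage)."""
    return total_bases / coverage


def bases_needed(genome_size, coverage):
    """Sequencing yield required to reach a coverage on a genome of a given size."""
    return genome_size * coverage


if __name__ == "__main__":
    total = estimate_total_bases(BINS)
    size = implied_genome_size(total, REPORTED_COVERAGE)
    print(f"Estimated yield:         {total / 1e9:.1f} Gb")
    print(f"Implied genome size:     {size / 1e6:.0f} Mb")
    print(f"Yield needed for 450 Gb: {bases_needed(450e9, REPORTED_COVERAGE) / 1e12:.1f} Tb")
```

Under those assumptions the histogram adds up to roughly 87 Gb of reads, which at 198.87x points to a genome of about 440 Mb, whereas 198.87x of a 450 Gb genome would need close to 90 Tb of sequence.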

ADD REPLY • link 6.1 years ago by WouterDeCoster 47k