Question

How many short-reads are enough to do a valid bacterial genome assembly using Velvet and to perform a good annotation?

0

6.5 years ago

bioinformatics_bel ▴ 20

I need to know how many short reads are enough to do a valid bacterial genome assembly using Velvet and to perform a good annotation. Is it correct to use only a single pair of .fastq files? The second part of the question: how do I use the Velvet assembly output for genome annotation? The GeneMarkS software takes a FASTA file as input and nothing else.

genome assembly • 1.3k views

ADD COMMENT • link updated 6.5 years ago by Kevin Blighe 87k • written 6.5 years ago by bioinformatics_bel ▴ 20

score 0 · Answer 1 · 2017-10-25

0

6.5 years ago

Kevin Blighe 87k

I don't believe that you require a huge number of reads, certainly not more than one would normally get from a sequencing run. With assembly, the key, as always, is to have long reads that can capture as much variation in the [Edit:] genome as possible. Read length directly affects sensitivity and assembled genome size, as one would expect: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3988101/
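To put a rough number on "how many reads", one common back-of-the-envelope approach is to work from average depth of coverage, C = N × L / G (number of reads times read length, divided by genome size). The sketch below assumes a 5 Mb bacterial genome, 100 bp reads, and a 50x target; all three figures are illustrative assumptions, not values from this thread, and the function names are hypothetical.

```python
import math


def coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth of coverage: C = N * L / G."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(genome_size_bp: int, read_length_bp: int, target_coverage: float) -> int:
    """Smallest number of reads N such that N * L / G >= target_coverage."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


# Illustrative assumptions: 5 Mb genome, 100 bp reads, 50x target depth.
genome_size = 5_000_000
read_length = 100
target = 50

n = reads_needed(genome_size, read_length, target)
print(f"{n:,} reads (i.e. {n // 2:,} pairs) give about "
      f"{coverage(n, read_length, genome_size):.0f}x coverage")
```

For paired-end data, each pair contributes two reads, so a single pair of .fastq files holding that many reads in total would reach the same depth.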

Technically, you can assemble a [Edit:] genome from one sample, but Velvet is designed to accept multiple samples combined, and these can have varying read lengths and insert sizes. The chosen k-mer size is important, too, and longer reads allow a larger k-mer.
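As a rough illustration of how k-mer size interacts with read length, the Velvet manual relates k-mer coverage to base coverage as Ck = C × (L − k + 1) / L. The sketch below computes that and builds (but does not run) a `velveth`/`velvetg` command pair for one paired-end library. The file names, output directory, insert length, and k = 31 are hypothetical placeholders, and the flags assume a recent Velvet 1.2.x build, so check them against the help output of your installed version.

```python
import shlex


def kmer_coverage(base_coverage: float, read_length: int, k: int) -> float:
    """k-mer coverage as given in the Velvet manual: Ck = C * (L - k + 1) / L."""
    if k % 2 == 0:
        raise ValueError("Velvet works with odd k-mer lengths")
    if k > read_length:
        raise ValueError("k cannot exceed the read length")
    return base_coverage * (read_length - k + 1) / read_length


def velvet_commands(out_dir, k, fastq_1, fastq_2, insert_length):
    """Build command lines for one paired-end library (nothing is executed here)."""
    velveth = ["velveth", out_dir, str(k),
               "-shortPaired", "-fastq", "-separate", fastq_1, fastq_2]
    velvetg = ["velvetg", out_dir,
               "-ins_length", str(insert_length),
               "-exp_cov", "auto", "-cov_cutoff", "auto"]
    return velveth, velvetg


# Hypothetical inputs: 50x base coverage, 100 bp reads, k = 31, 300 bp inserts.
print(kmer_coverage(50, 100, 31))
for cmd in velvet_commands("assembly_k31", 31,
                           "reads_1.fastq", "reads_2.fastq", 300):
    print(shlex.join(cmd))
```

Velvet writes the assembled contigs to `contigs.fa` inside the output directory. That file is plain FASTA, which is the input format that gene finders such as GeneMarkS expect.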

I don't believe that Velvet does any annotation. Trinity does assembly and annotation, and Rockhopper is another tool that will identify novel transcripts that you can then 'annotate' indirectly via a BLASTx search.

Kevin

ADD COMMENT • link 6.5 years ago by Kevin Blighe 87k

0

Thank you.

ADD REPLY • link 6.5 years ago by bioinformatics_bel ▴ 20

0

Apologies, I realised that you are referring to genome assembly, not transcriptome assembly. The same idea applies, however, with respect to read lengths. Also, programs like Rockhopper and Trinity, which I mentioned above, are designed for transcriptome assembly.

For genome assembly methods, including annotation, I would encourage you to read this: https://academic.oup.com/bioinformatics/article/30/19/2709/2422249/Evaluation-and-validation-of-de-novo-and-hybrid

Best of luck

Kevin

ADD REPLY • link 6.5 years ago by Kevin Blighe 87k