Question: Hybrid Assembly From Solid, Illumina Data, And Ion Torrent Data
7.0 years ago, lin.barnum wrote:

Which assembly programs would be good for doing an assembly using data from Illumina, SOLiD and Ion Torrent?

Also, converting from color space to base space is generally not a good idea, because a single color error propagates down the rest of the read. But is it okay to convert from base space to color space, since that conversion is unambiguous? I could then possibly use the color-space assembler from Velvet.
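To make the asymmetry concrete, here is a minimal sketch of SOLiD-style two-base encoding. It assumes the usual convention in which each color is determined by a pair of adjacent bases (identical bases give color 0) and the read starts with a known primer base; the function names and the default primer `T` are chosen for illustration only.

```python
# Two-base (color-space) encoding sketch.
# Assumption: bases are numbered A=0, C=1, G=2, T=3 and the color for a
# pair of adjacent bases is the XOR of their numbers.
BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}

def encode(seq, primer="T"):
    """Base space -> color space. Unambiguous: each color depends only
    on two adjacent, known bases."""
    prev = primer
    colors = []
    for base in seq:
        colors.append(str(CODE[prev] ^ CODE[base]))
        prev = base
    return primer + "".join(colors)

def decode(cs_read):
    """Color space -> base space. Each decoded base depends on the
    previous decoded base, so one wrong color corrupts everything after it."""
    prev = cs_read[0]
    out = []
    for color in cs_read[1:]:
        prev = BASES[CODE[prev] ^ int(color)]
        out.append(prev)
    return "".join(out)

def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

read = "ACGTACGTAC"
colors = encode(read)
# Introduce a single color error at the third color call.
bad = colors[:3] + str((int(colors[3]) + 1) % 4) + colors[4:]
print(decode(colors) == read)          # round trip is lossless
print(mismatches(decode(bad), read))   # every base from the error onward is wrong
```

The round trip without errors recovers the read exactly, while a single miscalled color changes every base from that position to the end of the read.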

assembly solid illumina • 2.6k views
written 7.0 years ago by lin.barnum

It sounds reasonable that sequencing the same genome with multiple technologies should always be a good thing when aiming for a de novo assembly. However, that's not always the case, and it may have been better to optimise the use of a single technology. To assess whether this is the case, there are several important considerations that should be taken into account before any sequencing is carried out. The aim is to ensure each technology adds something useful to the final result. The main factors are accuracy, read length, insert size and, sometimes, coverage of difficult-to-sequence regions.

To help answer your question, you should post an estimate of your genome size and ploidy. Additionally, state the read lengths for each sequencing run of each technology you have, and the approximate depth of coverage obtained based on the estimated genome size. Are any of the runs paired-end, and if so, what is the estimated insert size?
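For the depth of coverage, a rough estimate is (number of reads × read length) / genome size. A minimal sketch follows; the function name and the numbers in the example are hypothetical and only illustrate the arithmetic.

```python
def depth_of_coverage(n_reads, read_length, genome_size):
    """Approximate average depth: total sequenced bases / genome size.

    All three arguments are in reads or bases; for paired-end runs,
    count both mates as separate reads.
    """
    return n_reads * read_length / genome_size

# Hypothetical run: 150 million reads of 100 bp on a 500 Mb genome.
print(depth_of_coverage(150_000_000, 100, 500_000_000))
```

This ignores reads lost to filtering or trimming, so the usable depth will be somewhat lower than the raw figure.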

Armed with that information an assembly strategy can be devised. I have to say that combining Illumina, SOLiD and Ion Torrent for de novo assembly is not a commonly seen strategy and may not be ideal, but this will depend on the exact nature of the data you have.

modified 7.0 years ago • written 7.0 years ago by Nick Loman

"not commonly seen" is a nice expression. I never saw it and would probably never think of doing it: all these technologies are more or less "short read" atm. Ion + Illumina could make sense by mixing Ion 200+bp reads with Illumina 100-150bp, one cancelling the artifacts of the other. I do not see the added value of SOLiD in there, I'd mix in something longer (454 or PacBio).

written 6.9 years ago by Bach

It is a haploid genome less than a Gb in size. The Illumina data is paired-end, with a 300 bp insert library and a 600 bp insert library. The SOLiD data is mate-paired with a 1.5 kb insert. The Ion Torrent data is minor in quantity compared to Illumina and SOLiD, so I don't plan to rely on it much. There is about 30x coverage with Illumina and 20x or so with SOLiD.

modified 6.9 years ago • written 6.9 years ago by lin.barnum

Is this genomic? transcriptomic? Are you doing a de novo assembly or reference assembly? Base-space to color-space is fine.

written 7.0 years ago by Damian Kao

This is genomic DNA. I am trying to do a de novo assembly.

modified 7.0 years ago • written 7.0 years ago by lin.barnum
7.0 years ago, Istvan Albert (University Park, USA) wrote:

Note that when you convert to colorspace you have far fewer options for tools and workflows. This becomes a hindrance, since the hard parts of an assembly are scaffolding and finishing it.

I would also evaluate the data first by converting the SOLiD data to basespace, then filtering and trimming it back very restrictively by quality. See what happens. Alternatively, you could try mapping to one or more related genomes and converting to letterspace based on the mapping.
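As a rough illustration of what restrictive trimming could look like, here is a minimal sketch that cuts each read at the first base below a quality threshold and then drops reads that end up too short. The function names, the Phred cutoff of 20 and the minimum length of 25 are assumptions made for the example, not recommended settings; in practice a dedicated trimming tool would do this.

```python
def trim_at_first_low_quality(seq, quals, min_q=20):
    """Cut the read at the first base whose Phred quality is below min_q.

    seq is the base string, quals the per-base Phred scores (list of int).
    Returns the trimmed (seq, quals) pair.
    """
    for i, q in enumerate(quals):
        if q < min_q:
            return seq[:i], quals[:i]
    return seq, quals

def keep_read(seq, min_len=25):
    """Discard reads that are too short to be useful after trimming."""
    return len(seq) >= min_len

# Toy read: quality drops at the fifth base, so only four bases survive.
seq, quals = trim_at_first_low_quality("ACGTACGT", [30, 30, 30, 30, 10, 30, 30, 30])
print(seq, keep_read(seq))
```

Comparing read counts and lengths before and after such a filter gives a quick sense of how much usable SOLiD data is left.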

written 7.0 years ago by Istvan Albert