Hybrid Assembly From Solid, Illumina Data, And Ion Torrent Data
12.0 years ago
lin.barnum ▴ 230

Which assembly programs would be good for doing an assembly using data from Illumina, SOLiD and Ion torrent?

Also, I understand it is generally not a good idea to convert from colorspace to basespace, since any error propagates down the rest of the read. But is it okay to convert from basespace to colorspace, since that conversion is unambiguous? I could then possibly use the colorspace assembler from Velvet.
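To make the asymmetry concrete, here is a minimal sketch of the standard SOLiD di-base encoding, in which each color is the XOR of the 2-bit codes of two adjacent bases. The function names and the example read are made up for illustration and do not come from any particular tool.

```python
# 2-bit codes for the di-base scheme: color = code[a] XOR code[b].
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def to_colorspace(bases):
    """Encode a base-space read as its first base plus one color per adjacent pair."""
    colors = [str(BASE_CODE[a] ^ BASE_CODE[b]) for a, b in zip(bases, bases[1:])]
    return bases[0] + "".join(colors)


def to_basespace(cs_read):
    """Decode a color-space read; every base depends on the previously decoded base."""
    bases = [cs_read[0]]
    for color in cs_read[1:]:
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ int(color)])
    return "".join(bases)


read = "ATGGCA"                      # hypothetical read
encoded = to_colorspace(read)        # "A31031"

# Simulate a single wrong color call at the second color.
corrupted = encoded[:2] + str((int(encoded[2]) + 1) % 4) + encoded[3:]

print(to_basespace(encoded))         # ATGGCA
print(to_basespace(corrupted))       # ATCCGT
```

Encoding is deterministic and lossless. Decoding chains each base to the previous one, so a single miscalled color changes every base after it. That is why going from colorspace to basespace is the risky direction.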

assembly solid illumina • 4.0k views

It sounds reasonable that sequencing the same genome with multiple technologies should always be a good thing when aiming for a de novo assembly. However, that's not always the case, and it may have been better to optimise the use of a single technology. To assess whether this is the case, several important considerations should be taken into account before any sequencing is carried out. The aim is to ensure each technology adds something useful to the final result. The main factors are accuracy, read length, insert size and, sometimes, coverage of difficult-to-sequence regions.

To help answer your question, you should post an estimate of your genome size and ploidy. Additionally, state the read lengths for each sequencing run of each technology you have, and the approximate depth of coverage obtained based on the estimated genome size. Are any of the runs paired-end, and if so, what is the estimated insert size?
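As a rough guide, depth of coverage can be estimated as total sequenced bases divided by the estimated genome size. The sketch below assumes that simple formula. The read count, read length and genome size are placeholders, not values from this thread.

```python
def depth_of_coverage(n_reads, read_length, genome_size):
    """Approximate depth: total sequenced bases divided by genome size (in bases)."""
    return n_reads * read_length / genome_size


# Hypothetical numbers, for illustration only.
genome_size = 500_000_000   # 500 Mb genome
n_reads = 150_000_000       # total reads in the run (both mates counted)
read_length = 100           # bp per read

print(f"{depth_of_coverage(n_reads, read_length, genome_size):.1f}x")
```

This ignores reads lost to filtering, trimming or contamination, so the usable depth after quality control is usually somewhat lower.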

Armed with that information an assembly strategy can be devised. I have to say that combining Illumina, SOLiD and Ion Torrent for de novo assembly is not a commonly seen strategy and may not be ideal, but this will depend on the exact nature of the data you have.


"Not commonly seen" is a nice way of putting it. I have never seen it and would probably never think of doing it: all these technologies are more or less "short read" at the moment. Ion + Illumina could make sense, mixing Ion 200+ bp reads with Illumina 100-150 bp reads so that one cancels out the artifacts of the other. I do not see the added value of SOLiD in there; I'd mix in something longer (454 or PacBio).


It is a haploid genome less than 1 Gb in size. The Illumina data is paired-end, with a 300 bp insert library and a 600 bp insert library. The SOLiD data is mate-paired with a 1.5 kb insert. The Ion Torrent data is minor in quantity compared to the Illumina and SOLiD data, so I don't plan to rely on it much. There is about 30x coverage with Illumina and 20x or so with SOLiD.


Is this genomic? transcriptomic? Are you doing a de novo assembly or reference assembly? Base-space to color-space is fine.


This is genomic DNA. I am trying to do a de novo assembly.

12.0 years ago

Note that when you convert to colorspace you have far fewer options for tools and workflows. This becomes a hindrance, since the hard parts of an assembly are scaffolding and finishing it.

I would also evaluate the data first by converting the SOLiD data to basespace, then filtering and trimming it back very restrictively by quality. See what happens. Alternatively, you could also try to map to one or more related genomes and convert to letterspace based on the mapping.
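As one possible reading of "trimming very restrictively", the sketch below cuts a base-space read at the first base whose Phred score falls below a threshold and discards the read if too little remains. The function name, the thresholds and the Phred+33 offset are all assumptions for illustration. Dedicated trimming tools use more refined strategies.

```python
def trim_by_quality(seq, qual, min_q=20, min_len=30, offset=33):
    """Cut the read at the first base with Phred score < min_q.

    Returns (seq, qual) for the kept prefix, or None if it is shorter than min_len.
    Assumes Phred+33 encoded quality strings (an assumption; check your data).
    """
    for i, ch in enumerate(qual):
        if ord(ch) - offset < min_q:
            seq, qual = seq[:i], qual[:i]
            break
    if len(seq) < min_len:
        return None
    return seq, qual


# Hypothetical read: 'I' encodes Q40, '#' encodes Q2.
print(trim_by_quality("ACGTACGTAC", "IIIIII#III", min_len=5))
print(trim_by_quality("ACGTACGTAC", "IIIIII#III", min_len=8))
```

Cutting at the first low-quality base is deliberately harsh. That suits converted SOLiD reads, where everything after an error is suspect anyway.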
