Question: Should I use MiSeq or HiSeq to generate data for assembling the blowfly genome?
leven001 (United States) wrote, 4.4 years ago:

Hey!

I'm working with a blowfly genome (genome size about 650 Mb). I have already used Ion Torrent PGM for sequencing, but it only yielded about 2-3x coverage and around 4M usable reads (size selection target: 400 bp, actually around 250 bp, single-end reads). I'm looking to sequence my samples on an Illumina platform, but I don't know whether to use MiSeq or HiSeq. I am using the sequencing data for de novo assembly (using CLCbio, V7), since there are no closely related genomes available (the closest annotated genome would be Drosophila). Later on, I plan to use the assemblies to locate genes, microsatellites, transposable elements, etc.

What would be more useful: more coverage or longer reads? Any input would be great since I'm new to the bioinformatics field.

 


Someone correct me if I'm wrong, but I would assume longer reads would be more informative for de novo assembly.

Reply written 4.4 years ago by Katie D'Aco

Yes, that is true. On the other hand, higher coverage helps a lot, so it is a tradeoff.

Reply written 4.4 years ago by Istvan Albert
matted (Boston, United States) wrote, 4.4 years ago:

I assume you will make new libraries, and therefore aren't limited by the short fragment sizes you had before.

I don't agree with the other post saying that the MiSeq has higher error rates. The "official" word, other publications, and my own experience all indicate that MiSeq is actually better in terms of per-base accuracy (see e.g. here or here informally).

The tradeoff is read length (MiSeq wins) versus total coverage (HiSeq wins, for fixed cost). For assembly, and particularly for looking at microsatellites and transposons, I would definitely favor longer reads, since a read that spans a repeat is what lets the assembler resolve it.

For concreteness, you could get 25M 300+300 PE read pairs from the MiSeq for $1800. That's 15 Gb in total, or 8.3M bases per dollar, and one lane would give you 23X coverage of a 650 Mb genome.

On a HiSeq, you could get 200M 100+100 PE read pairs (though maybe some cores do longer?) for $2500. That's 40 Gb in total, or 16M bases per dollar, and one lane would give you 62X coverage.
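
The arithmetic behind those two estimates can be reproduced with a small sketch like the one below. It assumes the read counts above are read pairs (that is the reading under which the quoted figures work out) and takes the 650 Mb genome size from the question; the function and variable names are made up for illustration.

```python
# Rough yield estimate for a paired-end run.
# Assumption: the quoted read counts are read PAIRS, so each contributes 2 * read_len bases.
GENOME_SIZE = 650e6  # blowfly genome size in bases, taken from the question


def run_stats(read_pairs, read_len, price_usd, genome_size=GENOME_SIZE):
    """Return total bases, bases per dollar, and raw coverage for one run or lane."""
    total_bases = read_pairs * 2 * read_len
    return {
        "total_bases": total_bases,
        "bases_per_dollar": total_bases / price_usd,
        "coverage": total_bases / genome_size,
    }


# Numbers from the post; prices and yields vary by provider and over time.
miseq = run_stats(read_pairs=25e6, read_len=300, price_usd=1800)
hiseq = run_stats(read_pairs=200e6, read_len=100, price_usd=2500)

for name, s in (("MiSeq", miseq), ("HiSeq", hiseq)):
    print(
        f"{name}: {s['total_bases'] / 1e9:.0f} Gb, "
        f"{s['bases_per_dollar'] / 1e6:.1f}M bases/$, "
        f"{s['coverage']:.0f}X coverage"
    )
```

Swapping in a quote from your own sequencing core is just a matter of changing the three arguments.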

You can evaluate based on your budget and scientific goals, but personally I would do a MiSeq run.  You can start to get good assembly results at ~20X coverage, though you can increase that later as you want to close gaps and get longer contigs.

Caveats: these prices may not be the same for all providers and the read lengths, read totals, and prices are always changing... so this snapshot will probably become out-of-date soon.


Good luck getting fragments longer than 600 bp with Nextera, which is commonly used for MiSeq libraries. The paired-end reads are very likely to overlap, so the figure of 8.3M bases per dollar would have to account for the fact that some of the sequenced bases will be redundant.
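
As a rough illustration of that correction, here is a minimal sketch that counts only the distinct bases each read pair can cover. It assumes a single fixed fragment length per library (real libraries have a size distribution), and the 400 bp value is a hypothetical example borrowed from the size-selection target in the question; the function names are made up for illustration.

```python
def unique_bases_per_pair(fragment_len, read_len):
    """Distinct fragment bases covered by one read pair.

    The two mates read inward from opposite ends of the fragment, so together
    they can never cover more than the fragment itself.
    """
    return min(fragment_len, 2 * read_len)


def effective_bases_per_dollar(read_pairs, read_len, fragment_len, price_usd):
    """Bases per dollar after discounting the overlap between mates."""
    return read_pairs * unique_bases_per_pair(fragment_len, read_len) / price_usd


# Hypothetical 400 bp fragments; yields and prices are the ones quoted in the answer.
miseq_400 = effective_bases_per_dollar(25e6, 300, 400, 1800)
miseq_600 = effective_bases_per_dollar(25e6, 300, 600, 1800)
hiseq_400 = effective_bases_per_dollar(200e6, 100, 400, 2500)

print(f"MiSeq, 400 bp fragments: {miseq_400 / 1e6:.1f}M unique bases/$")
print(f"MiSeq, 600 bp fragments: {miseq_600 / 1e6:.1f}M unique bases/$")
print(f"HiSeq, 400 bp fragments: {hiseq_400 / 1e6:.1f}M unique bases/$")
```

Under this simplification the MiSeq figure only reaches the full 8.3M bases per dollar when fragments are at least 600 bp long, while 100+100 reads on 400 bp fragments do not overlap at all.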

Reply written 4.4 years ago by Adrian Pelin
Adrian Pelin (Canada) wrote, 4.4 years ago:

I would pick HiSeq. I had some issues with MiSeq, namely higher error rate and uneven coverage of single copy regions.


Especially in light of your short fragment sizes, there is little benefit to be had from the longer MiSeq reads: your reads will overlap.

Reply written 4.4 years ago by Istvan Albert