Should I use MiSeq or HiSeq to generate data for assembling the blowfly genome?
10.1 years ago
leven001 • 0

Hey!

I'm working with a blowfly genome (genome size about 650 Mb). I have already used Ion Torrent PGM for sequencing, but it only yielded about 2-3x coverage and around 4M usable reads (size selection: 400 bp, actual read length around 250 bp, single-end reads). I'm looking to sequence my samples on an Illumina platform, but I don't know whether to use MiSeq or HiSeq. I am using the sequencing data for de novo assembly (using CLCbio, V7), since there are no closely related genomes available (the closest annotated genome would be Drosophila). Later on, I plan to use the assemblies to locate genes, microsatellites, transposable elements, etc.

What would be more useful: more coverage or longer reads? Any input would be great since I'm new to the bioinformatics field.

genome Assembly sequencing • 9.8k views

Someone correct me if I'm wrong, but I would assume longer reads would be more informative for de novo assembly.


Yes, that is true. On the other hand, higher coverage also helps a lot, so it is a tradeoff.

10.1 years ago
matted 7.8k

I assume you will make new libraries, and therefore aren't limited by the short fragment sizes you had before.

I don't agree with the other post saying that the MiSeq has higher error rates. The "official" word, other publications, and my own experience all indicate that MiSeq is actually better in terms of per-base accuracy (see e.g. here or here informally).

The tradeoff is read length (MiSeq wins) versus total coverage (HiSeq wins, for fixed cost). For assembly, and particularly for looking at microsatellites and transposons, I would definitely favor longer reads.

For concreteness, you could get 25M 300+300 PE reads from the MiSeq for $1800. That's 8.3M bases per dollar, and one lane would give you 23X coverage.

On a HiSeq, you could get 200M 100+100 PE reads (though maybe some cores do longer?) for $2500. That's 16M bases per dollar, and one lane would give you 62X coverage.
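
The arithmetic behind these two estimates can be reproduced with a short Python sketch. The function names are illustrative, the 650 Mb genome size is taken from the question, and the read counts and prices are simply the ones quoted above, so treat the numbers as a snapshot and not as a quote from any provider.

```python
GENOME_SIZE_BP = 650_000_000  # blowfly genome size stated in the question


def total_bases(read_pairs, read_length):
    """Bases from a paired-end run: two reads of read_length per pair."""
    return read_pairs * 2 * read_length


def bases_per_dollar(read_pairs, read_length, price_usd):
    """Raw sequenced bases obtained per dollar spent."""
    return total_bases(read_pairs, read_length) / price_usd


def coverage(read_pairs, read_length, genome_size_bp=GENOME_SIZE_BP):
    """Average depth: total sequenced bases divided by genome size."""
    return total_bases(read_pairs, read_length) / genome_size_bp


# (read pairs, read length in bp, price in USD), as quoted in this answer
runs = {
    "MiSeq 2x300": (25_000_000, 300, 1800),
    "HiSeq 2x100": (200_000_000, 100, 2500),
}

for name, (pairs, length, price) in runs.items():
    rate = bases_per_dollar(pairs, length, price) / 1e6
    depth = coverage(pairs, length)
    print(f"{name}: {rate:.1f}M bases per dollar, {depth:.0f}X coverage")
```

Swapping in a different genome size, read count, or price shows how quickly the comparison shifts as providers change their offerings.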

You can evaluate based on your budget and scientific goals, but personally I would do a MiSeq run. You can start to get good assembly results at ~20X coverage, though you can increase that later as you want to close gaps and get longer contigs.

Caveats: these prices may not be the same for all providers and the read lengths, read totals, and prices are always changing... so this snapshot will probably become out-of-date soon.


Good luck getting fragments bigger than 600 bp with Nextera, which is commonly used for MiSeq libraries. The paired-end reads are very likely to overlap, so the figure of 8.3M bases per dollar would have to account for the fact that some bases will be redundant.
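
To put a rough number on that redundancy, here is a minimal Python sketch. It assumes every fragment has the same length and that a read pair contributes at most one copy of each fragment base, so the non-redundant bases per pair are min(fragment length, 2 x read length). The function names are hypothetical, the fragment lengths are made-up examples, and the 25M pairs and $1800 are the MiSeq figures quoted in the answer above. Real libraries have a distribution of fragment sizes, so this is only an approximation.

```python
def unique_bases_per_pair(read_length, fragment_length):
    """Non-redundant bases from one read pair.

    If the fragment is shorter than the two reads combined, the reads
    overlap and only the fragment itself is covered once.
    """
    return min(fragment_length, 2 * read_length)


def effective_bases_per_dollar(read_pairs, read_length, fragment_length, price_usd):
    """Bases per dollar after discounting the overlapping portion of each pair."""
    return read_pairs * unique_bases_per_pair(read_length, fragment_length) / price_usd


# Assumed example fragment lengths for a 2x300 MiSeq run (25M pairs, $1800)
for fragment_length in (300, 400, 500, 600):
    rate = effective_bases_per_dollar(25_000_000, 300, fragment_length, 1800)
    print(f"{fragment_length} bp fragments: {rate / 1e6:.1f}M non-redundant bases per dollar")
```

Under these assumptions the 8.3M figure only holds once fragments reach 600 bp; shorter fragments reduce the non-redundant yield proportionally.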

10.1 years ago
Adrian Pelin ★ 2.6k

I would pick HiSeq. I had some issues with MiSeq, namely a higher error rate and uneven coverage of single-copy regions.


Especially in light of your short fragment sizes, there is little benefit to be had from the longer MiSeq reads, since your reads will overlap.
