Question: How do I decide how many Ion Torrent reads to use for contig assembly with the MIRA assembler?
2.2 years ago, DanielC wrote:

Dear Friends,

I am running the MIRA assembler on an Ion Torrent-sequenced bacteriophage FASTQ file. The file contains about 1800000 reads with an average read length of 300, and the reference genome is unknown. To run the program efficiently, I have split the FASTQ file into subsets of reads (for example, "fastq1.fastq" has 10000 reads). I am now experimenting to find out which subset size gives the best resulting contigs. Ideally, the best result would be a single contig. Given the information above, could you please tell me how many reads I should use to get the best assembly? Thanks much!

modified 2.2 years ago by WouterDeCoster • written 2.2 years ago by DanielC

Since you are working with a phage (assuming your DNA is pure phage), you will have a large amount of data that heavily oversamples the genome. Excessive coverage tends to hurt assembly quality rather than help it. You can either keep adding reads incrementally, as you are doing, or use a normalization method that intelligently looks at the entire dataset at once.
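
To get a feel for how much oversampling is involved, a rough coverage estimate helps. The sketch below uses the read count and read length from the question. The genome size (50 kb) and the target depth (100x) are purely illustrative assumptions, since the reference is unknown and the thread does not give a target; substitute your own estimates.

```python
import math

def estimated_coverage(n_reads, mean_read_len, genome_size):
    """Average depth = total sequenced bases / genome size."""
    return n_reads * mean_read_len / genome_size

def reads_for_target(target_cov, mean_read_len, genome_size):
    """Number of reads needed to reach a target average depth."""
    return math.ceil(target_cov * genome_size / mean_read_len)

# Values from the question.
N_READS = 1_800_000
MEAN_READ_LEN = 300

# ASSUMPTIONS (not from the thread): a hypothetical 50 kb phage genome
# and an illustrative 100x target depth.
ASSUMED_GENOME_SIZE = 50_000
ASSUMED_TARGET_COV = 100

full_cov = estimated_coverage(N_READS, MEAN_READ_LEN, ASSUMED_GENOME_SIZE)
needed = reads_for_target(ASSUMED_TARGET_COV, MEAN_READ_LEN, ASSUMED_GENOME_SIZE)

print(f"Estimated depth with all reads: {full_cov:.0f}x")
print(f"Reads for ~{ASSUMED_TARGET_COV}x: {needed}")
```

Under these assumed numbers the full dataset is several orders of magnitude deeper than the target, which is why only a small fraction of the reads would be needed.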

modified 2.2 years ago • written 2.2 years ago by GenoMax

Thanks, GenoMax! If I do normalization, should I run it on the main FASTQ file with 180000 reads, or on the subset files generated from it (10000 reads, 20000 reads, etc.)? I would really appreciate your suggestion on this and the rationale behind the choice. Thanks much!

written 2.2 years ago by DanielC

Do the normalization on the entire dataset.
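
For intuition about why normalization wants to see all the reads, here is a toy sketch of the general digital-normalization idea: stream through the reads, and keep a read only if the median count of its k-mers (among reads kept so far) is still below a target. This is an illustration only, not the algorithm of any particular tool; the function names, `k`, and `target` values are hypothetical, and real tools such as BBNorm or khmer use far more memory-efficient counting and handle sequencing errors and reverse complements.

```python
from statistics import median

def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def normalize(reads, k=4, target=3):
    """Toy median-based normalization (illustrative, hypothetical helper).

    A read is kept only while the median count of its k-mers,
    counted over previously kept reads, is below `target`.
    """
    counts = {}
    kept = []
    for read in reads:
        ks = kmers(read, k)
        if not ks:
            continue  # read shorter than k, nothing to evaluate
        if median(counts.get(km, 0) for km in ks) < target:
            kept.append(read)
            for km in ks:
                counts[km] = counts.get(km, 0) + 1
    return kept

# Ten copies of one read plus a single unrelated read.
example_reads = ["ACGTTGCA"] * 10 + ["GGGGCCCCAAAA"]
kept_reads = normalize(example_reads, k=4, target=3)
print(len(kept_reads), "of", len(example_reads), "reads kept")
```

In the example, the heavily duplicated read is capped at the target depth while the rare read is retained. Taking only the first 10000 or 20000 reads of the file cannot make that distinction, because the rare regions may not appear in the subset at all.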

written 2.2 years ago by GenoMax