Question: De novo assembly
15 months ago by
aida.shahraki0 wrote:

I am interested in de novo assembly of Illumina reads from an insect with a genome size of about 300 Mbp. Can anyone recommend an assembler I should use? Is there a manual?

sequencing assembly • 665 views
modified 15 months ago by Federico Lopez20 • written 15 months ago by aida.shahraki0

Best de novo assembler for insect genome?

Minia is supposed to be a good choice if you only have access to limited compute resources.

modified 15 months ago • written 15 months ago by genomax67k

Please provide some more details about your data: the coverage, whether these are actually genomic reads, the paired-end (PE) insert size, etc.

written 15 months ago by Michael Dondrup46k

Hi

When you want to perform an assembly, a few parameters should be considered: 1) the library type (SE/PE/mate-pair), 2) the insert size, 3) the read length, 4) the sequencing platform, and 5) the quality of your raw reads.

If you have low-coverage data, choose an assembler that works well at low coverage.

You can use one of the k-mer-based de Bruijn graph assemblers, such as Velvet or SOAPdenovo, which are very popular and robust. They tend to give you a better N50 statistic.

All of them are command-line tools and simple to use.
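
For reference, the N50 statistic mentioned above is the contig length at which the contigs of that length or longer together cover at least half of the total assembly. A minimal sketch of the calculation follows; the function name and the example contig lengths are made up for illustration, and assembly evaluation tools report the same number for you.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk the contigs from longest to shortest, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half of the
        # total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths, total = 300, half = 150.
contigs = [100, 80, 50, 40, 30]
print(n50(contigs))  # prints 80
```

N50 only measures contiguity, so it is worth comparing assemblers on gene content as well.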

modified 15 months ago • written 15 months ago by pinninti1991reddy30

Thank you very much. I have not received the data yet, so I don't know the error rate. The reads are 150 bp paired-end at 100X coverage, from an Illumina HiSeq platform (non-human sample).
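
For a rough sense of how much data that is, here is a minimal sketch of the usual coverage arithmetic (coverage = number of reads × read length / genome size). The function name is made up for illustration, and the 300 Mbp genome size is only the estimate given in the question.

```python
import math


def reads_for_coverage(genome_size_bp, coverage, read_length_bp):
    """Number of reads needed to reach a given average coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


genome_size_bp = 300_000_000  # estimated genome size from the question
coverage = 100                # 100X
read_length_bp = 150          # 150 bp reads

total_reads = reads_for_coverage(genome_size_bp, coverage, read_length_bp)
read_pairs = total_reads // 2  # paired-end: two reads per fragment

print(f"{total_reads:,} reads")      # prints 200,000,000 reads
print(f"{read_pairs:,} read pairs")  # prints 100,000,000 read pairs
```

That is about 30 Gbp of sequence in total, which matters when judging whether an assembler will fit in the available memory.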

written 15 months ago by aida.shahraki0
15 months ago by
Federico Lopez20 wrote:

You usually need to run a few different assemblers and see what works best with your data. If you have 2x150bp reads from a single PCR-free library based on gel-free fragment selection, you could try DISCOVAR de novo (DDN). Although the DDN authors recommend 250-base reads, reads as short as 150 bases may work. Other options are SPAdes, SGA, ABySS 2, Meraculous2, and MaSuRCA.

In my experience you can get medium-sized insect genome assemblies with good gene content and contiguity by correcting reads with BFC, assembling contigs with SPAdes (turning off its error correction module, BayesHammer), scaffolding with SGA, and fixing errors with Pilon. If your insect genome is actually much larger than 300 Mbp, using SPAdes is probably not a good idea. Platanus is another option, especially if the genome is highly heterozygous, although in my experience you get very poor results with a single paired-end library; you would need reads from at least one mate-pair library. ALLPATHS-LG is another alternative if the paired-end reads overlap and you have at least one mate-pair library. If you can also sequence long reads, you could try a hybrid assembly with SPAdes or other assemblers, too.
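
A minimal sketch of how the SPAdes step above might be invoked with its error correction turned off, assuming the reads were already corrected with BFC. The file names, thread count, and memory limit are placeholders, not values from this thread; `--only-assembler` is the SPAdes option that skips its read error correction stage. The snippet only builds and prints the command rather than running it.

```python
import shlex


def spades_command(reads_1, reads_2, out_dir, threads=16, memory_gb=250):
    """Build a SPAdes command line that skips read error correction."""
    return [
        "spades.py",
        "--only-assembler",      # skip BayesHammer; reads are pre-corrected
        "-1", reads_1,           # forward reads
        "-2", reads_2,           # reverse reads
        "-t", str(threads),      # number of threads (placeholder)
        "-m", str(memory_gb),    # memory limit in GB (placeholder)
        "-o", out_dir,           # output directory
    ]


# Hypothetical file names for the BFC-corrected paired-end reads.
cmd = spades_command("corrected_R1.fq.gz", "corrected_R2.fq.gz", "spades_out")
print(shlex.join(cmd))
```

To actually run it, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where SPAdes is installed.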

modified 15 months ago • written 15 months ago by Federico Lopez20
15 months ago by
brs111110 wrote:

SPAdes (http://bioinf.spbau.ru/spades) might be a better option.

written 15 months ago by brs111110