Question: E. coli Illumina paired-end assembly using a reference genome?
9 weeks ago, lg248 wrote:

Hi, I'm a new bioinformatician and wondering if what I want is possible or even logical.

I sequenced a (slightly) mutant E. coli; the genome should be approx 4.6GB. I used SPAdes to do a de novo assembly from my paired-end reads, but I end up with 2000 contigs, the largest of which is 25 kb. My sequenced bacteria should be very similar to a published genome. Is there a way to use this published genome to help build my contigs/scaffolds, rather than relying only on SPAdes de novo assembly? Reading the SPAdes manual didn't really clear it up for me.

spades.py -1 Merged_MG1655_runs1and2_R1.fastq.gz -2 Merged_MG1655_runs1and2_R2.fastq.gz -o Merged_MG1655_runs1and2_spades_output --only-assembler

This is the command I was using. Thanks very much; any help is much appreciated.


What you are looking for is reference-assisted genome assembly. There are a few suggestions in this thread: "Tools and parameters for reference assisted eukaryotic genome assembly using a draft genome as reference". These programs do need a de novo assembly, so you are on the right track.

IDBA-Hybrid is another example.

written 9 weeks ago by genomax

If the assembly is that fragmented with standard SPAdes, there may be a problem with the data. I would also suggest aligning the reads to the related published reference sequence; this is always helpful in my experience. You can also map your assembled contigs to the reference sequence, call structural variants, and then inspect those contigs and calls very carefully.
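
The read-alignment step suggested above could be sketched roughly as follows. This is only an illustration, assuming bwa and samtools are installed; the function name map_reads, its exit codes, the thread count, and the output prefix are made up for this example and are not from the thread.

```shell
# Map paired-end reads to a reference and write a sorted, indexed BAM.
# Arguments: reference FASTA, R1 FASTQ, R2 FASTQ, output prefix.
# map_reads and its exit codes are hypothetical, chosen for illustration.
map_reads() {
    ref=$1; r1=$2; r2=$3; prefix=$4

    # Fail early (status 2) if any input file is missing.
    for f in "$ref" "$r1" "$r2"; do
        [ -f "$f" ] || { echo "missing input: $f" >&2; return 2; }
    done

    # Fail (status 3) if the required tools are not on PATH.
    for tool in bwa samtools; do
        command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found" >&2; return 3; }
    done

    # Build the BWA index for the reference.
    bwa index "$ref" || return 1

    # Align the read pairs (4 threads is an arbitrary choice) and sort by coordinate.
    bwa mem -t 4 "$ref" "$r1" "$r2" | samtools sort -o "$prefix.sorted.bam" - || return 1

    # Index the BAM for viewers, then print mapping summary statistics.
    samtools index "$prefix.sorted.bam" || return 1
    samtools flagstat "$prefix.sorted.bam"
}
```

With the files from the question this would be called as: map_reads reference.fasta Merged_MG1655_runs1and2_R1.fastq.gz Merged_MG1655_runs1and2_R2.fastq.gz MG1655_vs_ref (reference.fasta being a placeholder for the published genome). If flagstat reports that a large share of reads do not map, that may point to contamination or some other problem with the data.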

I would actually expect roughly 100-200 or more contigs, but not 2000, from an Illumina paired-end de novo sequencing project.
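
To compare an assembly against numbers like these, the contig count, largest contig, total length, and N50 can be computed directly from the contigs FASTA. The sketch below uses only awk and sort; the function name contig_stats and its output format are made up for this example.

```shell
# Summarise a FASTA assembly: contig count, largest contig, total length, N50.
# contig_stats is a hypothetical helper; it needs only POSIX awk and sort.
contig_stats() {
    # First awk: print one sequence length per contig (multi-line records supported).
    # sort: order the lengths from largest to smallest.
    # Second awk: accumulate lengths until half of the total is reached (N50).
    awk '
        /^>/ { if (len > 0) print len; len = 0; next }
             { gsub(/[ \t\r]/, ""); len += length($0) }
        END  { if (len > 0) print len }
    ' "$1" | sort -rn | awk '
        { lens[NR] = $1; total += $1 }
        END {
            if (NR == 0) { print "contigs=0"; exit }
            half = total / 2
            for (i = 1; i <= NR; i++) {
                run += lens[i]
                if (run >= half) { n50 = lens[i]; break }
            }
            printf "contigs=%d largest=%d total=%d n50=%d\n", NR, lens[1], total, n50
        }
    '
}
```

For the run in the question, SPAdes writes its contigs to contigs.fasta in the output directory, so the call would be: contig_stats Merged_MG1655_runs1and2_spades_output/contigs.fasta. A total length far from the expected genome size is another hint that something is off with the data.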

modified 9 weeks ago • written 9 weeks ago by colindaven

I have used bwa mem to map my contigs back onto my reference genome and looked at the result in Artemis. What do you mean by calling structural variants, and which program would you recommend for this?

written 9 weeks ago by lg248

4.6GB

Probably Mb?

written 9 weeks ago by WouterDeCoster