Question

E. coli Illumina paired-end assembly using a reference genome?

0

5.9 years ago

lg248 • 0

Hi, I'm a new bioinformatician and wondering if what I want is possible or even logical.

I sequenced a (slightly) mutant E. coli strain; the genome should be approx. 4.6GB. I used SPAdes to do a de novo assembly from my paired-end reads but ended up with 2000 contigs, the largest of which is 25 kb. My sequenced bacteria should be very similar to a published genome. Is there a way to use this published genome to help build my contigs/scaffolds, rather than relying only on SPAdes de novo assembly? Reading the SPAdes manual didn't really clear it up for me.

spades.py -1 Merged_MG1655_runs1and2_R1.fastq.gz -2 Merged_MG1655_runs1and2_R2.fastq.gz -o Merged_MG1655_runs1and2_spades_output --only-assembler

That is the command I was using. Thanks very much, any help is much appreciated.

genome sequencing Assembly reference • 1.3k views

ADD COMMENT • link 5.9 years ago by lg248 • 0

1

What you are looking for is reference-assisted genome assembly. There are a few suggestions in this thread: "Tools and parameters for reference assisted eukaryotic genome assembly using a draft genome as reference". These programs do need a de novo assembly, so you are on the right track.

IDBA-Hybrid is another example.

ADD REPLY • link 5.9 years ago by GenoMax 152k

1

If the assembly is that bad with standard SPAdes, there may be a problem with the data. I would also suggest aligning the reads to the related published reference sequence; this is always helpful in my experience. You can also map your assembled contigs to the reference sequence, then call structural variations, and finally inspect these contigs and calls very carefully.
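
To make the read-alignment suggestion concrete, here is a minimal sketch in Python that builds the usual bwa/samtools command lines for mapping paired-end reads to a reference. It assumes bwa and samtools are installed and on the PATH. The function names and the file names reference.fasta and reads_vs_reference.bam are hypothetical placeholders, not something from the original post. Running the script as-is only prints the commands; run_mapping() is what would actually execute them.

```python
import subprocess


def build_mapping_commands(reference, reads_r1, reads_r2, out_bam, threads=4):
    """Return the command lines (as argv lists) for a basic read-mapping run."""
    return {
        # Build the bwa index for the published reference genome.
        "index_ref": ["bwa", "index", reference],
        # Align the paired-end reads; bwa mem writes SAM to stdout.
        "align": ["bwa", "mem", "-t", str(threads), reference, reads_r1, reads_r2],
        # Sort the alignments read from stdin ("-") into a BAM file.
        "sort": ["samtools", "sort", "-o", out_bam, "-"],
        # Index the sorted BAM so viewers can load it.
        "index_bam": ["samtools", "index", out_bam],
        # Summary of how many reads mapped and how many were properly paired.
        "stats": ["samtools", "flagstat", out_bam],
    }


def run_mapping(cmds):
    """Execute the commands, piping bwa mem into samtools sort."""
    subprocess.run(cmds["index_ref"], check=True)
    align = subprocess.Popen(cmds["align"], stdout=subprocess.PIPE)
    subprocess.run(cmds["sort"], stdin=align.stdout, check=True)
    align.stdout.close()
    if align.wait() != 0:
        raise RuntimeError("bwa mem failed")
    subprocess.run(cmds["index_bam"], check=True)
    subprocess.run(cmds["stats"], check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    commands = build_mapping_commands(
        "reference.fasta",  # hypothetical file name for the published genome
        "Merged_MG1655_runs1and2_R1.fastq.gz",
        "Merged_MG1655_runs1and2_R2.fastq.gz",
        "reads_vs_reference.bam",  # hypothetical output name
    )
    for step, argv in commands.items():
        print(step, ":", " ".join(argv))
```

A low mapping rate or a low properly-paired rate in the flagstat output would be one hint that the problem lies in the data rather than in the assembler.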

I would actually expect you to get 100-200+ contigs, not 2000, from an Illumina paired-end de novo sequencing project.
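
To compare an assembly against that expectation, a small sketch like the following could report the contig count, largest contig and N50 from a FASTA file. It is plain Python with no external libraries; the function names are made up for illustration, and the path to the contigs file is whatever your assembler wrote.

```python
import sys


def contig_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length of the contig at which half of the total assembly size is reached."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def summarize(lengths):
    """Collect the basic assembly statistics in one dictionary."""
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "largest": max(lengths, default=0),
        "n50": n50(lengths),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python assembly_stats.py path/to/contigs.fasta
    with open(sys.argv[1]) as handle:
        print(summarize(contig_lengths(handle)))
```

If total_bp is far from the expected genome size, or the N50 is only a few kb, that supports the idea that something is off with the input data.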

ADD REPLY • link 5.9 years ago by colindaven 7.7k

0

I have used bwa mem to map my contigs back onto my reference genome and looked at the result in Artemis. What do you mean by calling structural variations, and which program would you recommend for this?

ADD REPLY • link 5.9 years ago by lg248 • 0

1

4.6GB

Probably Mb?

ADD REPLY • link 5.9 years ago by WouterDeCoster 48k