Question: Chloroplast genome: Need to close 2 gaps (~2Kbp), any suggestions?
Asked 22 months ago by lizgzara (10), USA/San Bruno/San Francisco State University:

Hi all,

I currently have one contig containing two gaps that together total approximately 1,906 bp. The contig was built by running my data through multiple assemblers and stitching the results together by hand. This is a chloroplast genome, and I am trying to find a way to close these gaps and finalize my assembly. I have a large amount of paired-end MiSeq data (~300 bp) and some PacBio data, although the PacBio data are less reliable because of shallow sequencing and errors.

Are there any programs, techniques, protocols, or papers you would recommend? This project is my master's thesis, and I would like to start running analyses on the complete chloroplast genome once I close these two gaps.

So far I have tried BLASTing my reads against the regions flanking the gaps to extend the sequence into the gaps, a sort of "crawl-assembly" method, but it hasn't worked. I also tried BLASTing the PacBio reads against the single contig (since PacBio reads are long enough that they might span a gap), but that hasn't worked either.
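For anyone who wants to see the "crawl" idea spelled out, here is a minimal sketch of a greedy k-mer extension from one gap flank. It is not what any of the tools mentioned in this thread do internally; the function names (`revcomp`, `extend_right`) and the parameters `k`, `min_support`, and `max_extension` are made up for illustration, and it assumes the reads are already loaded as plain strings.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extend_right(flank, reads, k=21, min_support=1, max_extension=2000):
    """Greedily extend `flank` to the right, one base at a time.

    For every k-mer seen in the reads (both strands), count which base
    follows it. Then repeatedly look up the last k bases of the growing
    sequence and append the most common following base. Extension stops
    when the k-mer is unseen, support is too low, or the top two
    candidate bases are tied (an ambiguous branch, e.g. a repeat).
    """
    if len(flank) < k:
        raise ValueError("flank must be at least k bases long")

    # Index: k-mer -> Counter of the base that follows it in the reads.
    followers = {}
    for read in reads:
        for strand in (read, revcomp(read)):
            for i in range(len(strand) - k):
                kmer = strand[i:i + k]
                followers.setdefault(kmer, Counter())[strand[i + k]] += 1

    contig = flank
    for _ in range(max_extension):
        votes = followers.get(contig[-k:])
        if not votes:
            break  # no read continues past this point
        ranked = votes.most_common(2)
        best_base, best_count = ranked[0]
        if best_count < min_support:
            break  # too few reads agree on the next base
        if len(ranked) > 1 and ranked[1][1] == best_count:
            break  # tie between two bases: do not guess
        contig += best_base
    return contig
```

With real MiSeq data you would probably want a larger `min_support` so that single sequencing errors cannot steer the walk. Chloroplast inverted repeats can also make a walk like this take the wrong branch or stop early, so any extension should be checked by mapping reads back to it.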

Any suggestions would be greatly appreciated!

modified 22 months ago • written 22 months ago by lizgzara (10)

Have you tried running Canu or another long-read assembler on the PacBio data? You can then use the MiSeq data to polish errors in the long-read assembly. I have had good results using ABySS at a range of k-mer values to assemble a mitochondrial genome. If these methods fail, you should try conventional PCR, cloning, and Sanger sequencing to close the ~2 kbp of gaps.

written 22 months ago by mark.ziemann (1.1k)
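The "range of k-mer values" suggestion could be scripted along these lines. This is only a sketch: it builds `abyss-pe` command lines using the `name=`, `k=`, and `in=` parameters from the ABySS README and prints them rather than running anything. The file names, the output prefix, and the chosen k values are placeholders, and depending on the installed ABySS version additional parameters (such as a Bloom filter memory budget) may be needed.

```python
import shlex


def abyss_sweep_commands(read1, read2, k_values, prefix="chloroplast"):
    """Build one abyss-pe command (as an argument list) per k value.

    Each run gets its own `name=` so output files from different
    k values do not overwrite each other. Nothing is executed here.
    """
    commands = []
    for k in k_values:
        commands.append([
            "abyss-pe",
            f"name={prefix}_k{k}",
            f"k={k}",
            f"in={read1} {read2}",
        ])
    return commands


if __name__ == "__main__":
    # Placeholder inputs: substitute your own FASTQ paths and k range.
    for cmd in abyss_sweep_commands("reads_R1.fastq", "reads_R2.fastq",
                                    range(41, 102, 20)):
        # shlex.join quotes the in= argument so the shell sees it as one word.
        print(shlex.join(cmd))
```

After the runs finish, the contigs from each k can be compared (for example by mapping them onto the gapped contig) to see whether any of them spans a gap.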

Hi @mark.ziemann, yes, I did run my PacBio data through Canu, and my MiSeq data were assembled via reference-guided and de novo assembly in Geneious. I then overlaid the contigs from the Canu and Geneious results, and that is how I ended up with the one contig with two gaps. I am wondering whether there is any other program that could use the large amount of high-quality MiSeq paired-end data to close the gaps. Of course, conventional PCR/cloning/Sanger sequencing would work, but I am trying to find a faster, less expensive way to close them. Right now I am looking through the literature for papers that had to close gaps at the end of their assembly. Thank you for your help!

written 22 months ago by lizgzara (10)

Have you tried GapFiller?

written 22 months ago by Sej Modha (4.1k)

@Sej Modha, no I haven't but I will give it a try! Thank you so much for your recommendation :)

written 22 months ago by lizgzara (10)