A couple of months ago, I sequenced a few BACs on a MiSeq and assembled the paired-end reads with SOAPdenovo, but the assembly came out fragmented into many scaffolds. I now have a reference genome of the same cultivar and am trying to pull out my region of interest (about 3.2 Mb).
Here are the two approaches I have tried:
I mapped my paired-end reads to the whole genome using BWA. With this approach, reads mapped to only 19 scaffolds of the whole genome.
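For reference, a count like this could be reproduced from the tab-separated output of `samtools idxstats` (reference name, length, mapped reads, unmapped reads). This is only a minimal sketch: the function name, the `min_reads` threshold, and the example rows are made up for illustration and are not my real data.

```python
def scaffolds_with_mapped_reads(idxstats_lines, min_reads=1):
    """Return reference names with at least min_reads mapped reads.

    Expects lines in `samtools idxstats` format:
    name <TAB> length <TAB> mapped <TAB> unmapped
    """
    hits = []
    for line in idxstats_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and the "*" row (reads with no reference).
        if len(fields) < 4 or fields[0] == "*":
            continue
        name, mapped = fields[0], int(fields[2])
        if mapped >= min_reads:
            hits.append(name)
    return hits


# Hypothetical example rows, not real results.
example = [
    "scaffold_1\t50000\t1200\t3\n",
    "scaffold_2\t42000\t0\t0\n",
    "scaffold_3\t9000\t4\t0\n",
    "*\t0\t0\t57\n",
]
print(len(scaffolds_with_mapped_reads(example)))
print(len(scaffolds_with_mapped_reads(example, min_reads=10)))
```

Raising `min_reads` shows how many scaffolds are supported by more than a handful of reads.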
I also aligned my SOAPdenovo assembly against the whole genome with blastn (evalue: 1000, word size: 40, percent identity: 100%). With this approach, more than 1500 scaffolds of the whole genome got BLAST hits.
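To check whether short hits might be inflating this number, the hits could be filtered by alignment length before counting scaffolds. Below is a minimal sketch that assumes the standard 12-column BLAST tabular output (`-outfmt 6`); the function name, thresholds, and example rows are hypothetical.

```python
def scaffolds_with_hits(blast_lines, min_identity=100.0, min_length=0):
    """Return the set of subject IDs with at least one qualifying hit.

    Expects BLAST tabular lines with the default columns:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore
    """
    subjects = set()
    for line in blast_lines:
        # Skip blank lines and comment lines (present in -outfmt 7).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        sseqid = fields[1]
        pident = float(fields[2])
        length = int(fields[3])
        if pident >= min_identity and length >= min_length:
            subjects.add(sseqid)
    return subjects


# Hypothetical example rows, not real results.
example_hits = [
    "bac_scaf1\tref_A\t100.00\t45\t0\t0\t1\t45\t900\t944\t1e-15\t84.2\n",
    "bac_scaf1\tref_B\t100.00\t5000\t0\t0\t1\t5000\t100\t5099\t0.0\t9234\n",
    "bac_scaf2\tref_C\t100.00\t60\t0\t0\t10\t69\t300\t359\t2e-24\t111\n",
]
print(len(scaffolds_with_hits(example_hits)))
print(sorted(scaffolds_with_hits(example_hits, min_length=1000)))
```

Comparing the counts at a few different `min_length` values would show how much of the total comes from short matches.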
My question is: why do the two approaches give such different results? Is there a problem with my mapping? Which approach is better?
Or is there another approach I should try? Please share your experience!
EDIT: I am also thinking about a reference-guided re-assembly of my paired-end reads.