Nucmer settings for mapping scaffolds to a reference
6.7 years ago
paul.bible ▴ 30

Hi, I have a genome of a wild plant subspecies in the form of about 200K scaffolds of various sizes, from a few thousand bp to over 50 kb. I am trying to assemble these into chromosomes using the chromosome sequences of the nearest domesticated relative.

I am using nucmer from the MUMmer package (http://mummer.sourceforge.net/manual/) and trying to get the settings right. I plan to use the tiling from nucmer -> show-tiling to construct each chromosome, likely using Biopython.

My questions are:
1) Which settings would be best for this task, in particular '-c' [min cluster] and '-l' [min match]?

Is it safe (meaningful) to concatenate all scaffolds in the order of the tiling from nucmer, reverse complementing when needed of course?

2) Are there any other programs designed to construct chromosomes from scaffolds given a reference? This seems like a routine/common task but I have not found much information on this specific problem.

3) After constructing the new chromosome, what is the best way to call SNPs?
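
For reference, the concatenation I have in mind for question 1 could be sketched roughly as below. This is only a sketch under assumptions: it relies on show-tiling's default 8-column output as described in the MUMmer manual (reference start, reference end, gap to next contig, contig length, coverage, percent identity, orientation, contig ID, grouped under ">reference" header lines). The function names, the in-memory `scaffolds` dictionary, and the `min_gap` padding for overlapping or adjacent scaffolds are my own placeholders, not anything MUMmer provides.

```python
def reverse_complement(seq):
    """Reverse complement a DNA string (A/C/G/T/N only; other symbols pass through unchanged)."""
    table = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")
    return seq.translate(table)[::-1]


def parse_tiling(lines):
    """Parse show-tiling output (assumed default 8-column layout).

    Returns {reference_id: [(ref_start, gap_to_next, orientation, contig_id), ...]}.
    """
    tiling = {}
    ref = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line, e.g. ">Chr01 1000 bases"
            ref = line[1:].split()[0]
            tiling[ref] = []
            continue
        fields = line.split()
        start, gap = int(fields[0]), int(fields[2])
        orientation, contig_id = fields[6], fields[7]
        tiling[ref].append((start, gap, orientation, contig_id))
    return tiling


def build_pseudochromosome(tiles, scaffolds, min_gap=100):
    """Join scaffolds in reference order, padding between them with Ns.

    A negative gap means neighbouring scaffolds overlap on the reference;
    this sketch does not merge overlaps, it just inserts min_gap Ns
    (a hypothetical choice, adjust as needed).
    """
    ordered = sorted(tiles)  # sorts by reference start
    parts = []
    for i, (start, gap, orientation, contig_id) in enumerate(ordered):
        seq = scaffolds[contig_id]
        if orientation == "-":
            seq = reverse_complement(seq)
        parts.append(seq)
        if i < len(ordered) - 1:
            parts.append("N" * max(gap, min_gap))
    return "".join(parts)
```

With Biopython, `scaffolds` could be filled from the scaffold FASTA (for example via `SeqIO.to_dict`) and the sequences converted to strings first. Scaffolds that show-tiling leaves out of the tiling are simply not placed by this approach.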

Thanks!

Assembly alignment SNP • 3.0k views
6.7 years ago
paul.bible ▴ 30

Ultimately I went with lastz for the alignment, following this paper's supplements. They aligned closely related rhesus species. The lastz process was very time consuming, taking over a week for some of the chromosomes, but I guess it's better to get it right.

This is the command I ended up using.

lastz Chr01.fasta scaffolds.fasta M=254 K=4500 L=3000 Y=15000 C=2 T=2 --format=axt > chr1.axt
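
To get from the .axt file to a placement for each scaffold, something like the following could work. It is a rough sketch, assuming the axt output follows the usual UCSC layout in which each alignment block starts with a 9-field summary line (block number, reference name, reference start, reference end, query name, query start, query end, strand, score) followed by two sequence lines. `AxtBlock`, `parse_axt_headers` and `best_block_per_scaffold` are names made up for this example, and keeping only the single highest-scoring block per scaffold is a simplification, not what the paper did.

```python
from collections import namedtuple

AxtBlock = namedtuple(
    "AxtBlock", "ref ref_start ref_end query q_start q_end strand score"
)


def parse_axt_headers(lines):
    """Yield one AxtBlock per alignment summary line, skipping sequence lines."""
    for line in lines:
        if line.startswith("#"):
            continue  # comment lines
        f = line.split()
        # Summary lines have exactly 9 whitespace-separated fields,
        # starting with the block number and with a strand in field 8.
        if len(f) != 9 or not f[0].isdigit() or f[7] not in "+-":
            continue
        yield AxtBlock(f[1], int(f[2]), int(f[3]),
                       f[4], int(f[5]), int(f[6]), f[7], int(f[8]))


def best_block_per_scaffold(blocks):
    """Keep the highest-scoring block for each query scaffold."""
    best = {}
    for block in blocks:
        current = best.get(block.query)
        if current is None or block.score > current.score:
            best[block.query] = block
    return best


def load_best_blocks(path):
    """Convenience wrapper: read an .axt file and return the best block per scaffold."""
    with open(path) as handle:
        return best_block_per_scaffold(parse_axt_headers(handle))
```

Calling `load_best_blocks("chr1.axt")` would then give, per scaffold, the reference start and strand needed to order and orient it along Chr01.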