I would like to align the contigs from the recent assembly of NA12878 to the latest human genome reference sequence (hg19). I have considered using BWA-SW, BLAT and LASTZ. I would greatly prefer SAM/BAM output because it will simplify my downstream analysis. However, BWA-SW prefers query sequences in the 1-2 Mb range, while this assembly has contigs in the tens of megabases. LASTZ, on the other hand, is not well suited to aligning against many chromosomes at once. BLAT is awkward because PSL-to-BAM conversion is imperfect.
Has anyone done this?
If you were to do this, what tool would you use or how would you go about it?
Gnerre et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci USA (2011) vol. 108 (4) pp. 1513-8
Probably you want to try this:
I would probably split long contigs into 1 Mbp chunks and use BWA-SW (I have wanted to do this myself but have not had time). By the way, do they really get contigs in the tens of Mbp? How long are the scaffolds/supercontigs?
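To make the chunking idea concrete, here is a minimal sketch in plain Python (no Biopython). The function names, the 1 Mbp default and the `name:start-end` naming scheme are my own assumptions for illustration, not anything BWA-SW requires. The chunks do not overlap, so an alignment that spans a chunk boundary will be split in two.

```python
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a FASTA stream."""
    name, parts = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(parts)
            # Keep only the first word of the header as the contig name.
            name = line[1:].split()[0]
            parts = []
        else:
            parts.append(line)
    if name is not None:
        yield name, "".join(parts)


def split_contig(name: str, seq: str,
                 chunk_size: int = 1_000_000) -> Iterator[Tuple[str, str]]:
    """Cut one contig into consecutive, non-overlapping chunks.

    Each chunk is named "<contig>:<start>-<end>" (1-based, inclusive) so the
    position within the original contig can be recovered after alignment.
    """
    for start in range(0, len(seq), chunk_size):
        end = min(start + chunk_size, len(seq))
        yield f"{name}:{start + 1}-{end}", seq[start:end]


def write_chunks(in_handle: TextIO, out_handle: TextIO,
                 chunk_size: int = 1_000_000, width: int = 60) -> int:
    """Write every contig as chunked FASTA; return the number of chunks."""
    n_chunks = 0
    for name, seq in read_fasta(in_handle):
        for chunk_name, chunk_seq in split_contig(name, seq, chunk_size):
            out_handle.write(f">{chunk_name}\n")
            for i in range(0, len(chunk_seq), width):
                out_handle.write(chunk_seq[i:i + width] + "\n")
            n_chunks += 1
    return n_chunks
```

With hypothetical file names, you would call `write_chunks(open("contigs.fa"), open("chunks.fa", "w"))` and then run something like `bwa bwasw hg19.fa chunks.fa > chunks.sam`. Query coordinates in the SAM output are then relative to each chunk, and the start offset stored in the chunk name lets you shift them back onto the original contig.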
Perhaps also try this:
Just read the NA12878 paper. The contig N50 is 24kb. I would certainly map contigs rather than supercontigs.
Aaron, have you tried Mugsy (the one described by the link above)? As I read the paper just now, it may need tens of CPU days to align two human assemblies. For a 1000g request, I have mapped the NA12878 contigs using BWA-SW.