Speeding up NUCmer for de novo contig to reference alignment
8.2 years ago
lkw222 ▴ 30

I have a de novo assembled mammalian genome (~300,000 contigs) that I want to align to a closely related existing reference genome using NUCmer from the MUMmer package. Subsequently, given the Coords output of the NUCmer alignment, I want to use something like OSLay to order and stitch together the contigs.

The problem is that NUCmer looks like it will take a very long time to run. I have split the reference genome into single chromosomes and am aligning all of my contigs to each chromosome (32 chromosomes, 32 jobs running at once). However, even the job for the smallest chromosome has been running for over a day and has not finished. Any ideas on how to speed up the alignment?

I'm using:

nucmer --prefix bbu_vs_bt_ref_chr1 ./bt_ref_chr1.fa ./WB_2.0.fa
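
For anyone reproducing the per-chromosome setup described above, here is a minimal Python sketch that builds one nucmer command per reference chromosome and runs a capped number of them at a time. The file and prefix names simply extend the chr1 pattern from the command above, which is an assumption about how the other chromosome files are named; `nucmer_command`, `command_line`, `run_all` and `max_jobs` are hypothetical names made up for this illustration.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor

QUERY = "./WB_2.0.fa"  # contig file taken from the command in the post


def nucmer_command(chrom, query=QUERY):
    """Return the nucmer argument list for one reference chromosome.

    Names follow the chr1 pattern from the post; using the same pattern
    for the other chromosomes is an assumption.
    """
    return [
        "nucmer",
        "--prefix", f"bbu_vs_bt_ref_{chrom}",
        f"./bt_ref_{chrom}.fa",
        query,
    ]


def command_line(chrom):
    """Return the same command as a single shell-quoted string, for logging."""
    return shlex.join(nucmer_command(chrom))


def run_all(chroms, max_jobs=4):
    """Run one nucmer job per chromosome, at most max_jobs at a time.

    Returns a dict mapping chromosome name to the process exit code.
    """
    def run(chrom):
        return chrom, subprocess.run(nucmer_command(chrom)).returncode

    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        return dict(pool.map(run, chroms))
```

Calling `command_line("chr1")` gives back the command shown above. Capping `max_jobs` is only a convenience for machines that cannot hold all jobs in memory at once; it does not make any single alignment faster.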
sequencing Assembly alignment

Hello, were you able to solve this problem? I am having the same trouble with my genome. I have 16 chromosomes in the reference and 16 chromosomes in the query. I have also tried reducing the reference to a single chromosome against all 16 query chromosomes, but nucmer still takes more than a day per chromosome (the run has not completed yet). Any help would be appreciated.

8.2 years ago
sst ▴ 20

Never tried it, but http://bioinformatics.oxfordjournals.org/content/31/4/509.long claims to be an efficient drop-in replacement for MUMmer.


Just to improve the Google-ability of this page: the paper in question here is "E-MEM: efficient computation of maximal exact matches for very large genomes".

8.2 years ago
ALchEmiXt ★ 1.9k
There is also a GPU-optimized version you could try. How large is your reference, and how large are your contigs? Monitor memory usage and increase the available memory if needed; it needs to build a large suffix tree.
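
To answer the size questions quickly, something like the following Python sketch could be used. It reads FASTA text and reports the number of records, total bases, longest record and N50. The function names `fasta_lengths` and `summarize` are made up for this illustration, and it assumes plain, uncompressed FASTA.

```python
def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            header = line[1:].split()
            name, length = (header[0] if header else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def summarize(lengths):
    """Return record count, total bases, longest record and N50."""
    lengths = sorted(lengths, reverse=True)
    total = sum(lengths)
    running, n50 = 0, 0
    for length in lengths:
        running += length
        # N50: length of the record at which half the total is reached.
        if running * 2 >= total:
            n50 = length
            break
    return {
        "records": len(lengths),
        "total_bp": total,
        "longest": lengths[0] if lengths else 0,
        "n50": n50,
    }
```

For a file on disk, open it and pass the file handle to `fasta_lengths`, then feed the lengths to `summarize`, once for the reference and once for the contigs.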