BatMeth looks to be a very impressive tool for mapping BS-Seq data in base space and colorspace: http://genomebiology.com/2012/13/10/R82
Figure 2 shows a comparison with some other aligners: http://genomebiology.com/2012/13/10/R82/figure/F2
BatMeth has a lower false-positive rate than the aligners it is compared against, with good speed and a good true-positive rate.
They evaluate on real data in base space and colorspace, and on simulated data (generated with a simulator that is not their own).
The method is outlined in figure 4: http://genomebiology.com/2012/13/10/R82/figure/F4
Briefly, the process for a single read is:
- convert the reference for the + and - strands
- convert the read for the + and - strands
- check the 4 possible mappings of the read
- exclude the read if it maps to more than N possible locations
- compute the number of mismatches and report the read's status as unique or not (only unique reads are used in the calculation).
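
The steps above can be sketched in Python. This is only a rough illustration of the generic four-way bisulfite mapping scheme, not BatMeth's actual implementation. It assumes that "convert" means in silico C->T conversion for the + strand and G->A conversion for the - strand (expressed in forward coordinates), and it uses a brute-force Hamming scan where a real mapper would use an index. The names `map_read`, `max_hits` and `max_mismatches` are made up for the example.

```python
from typing import List, Tuple

Hit = Tuple[str, str, int, int]  # (reference strand, read conversion, position, mismatches)


def c_to_t(seq: str) -> str:
    """In silico bisulfite conversion as seen on the + strand."""
    return seq.replace("C", "T")


def g_to_a(seq: str) -> str:
    """The same conversion on the - strand, written in + strand coordinates."""
    return seq.replace("G", "A")


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_hits(read: str, ref: str, max_mismatches: int) -> List[Tuple[int, int]]:
    """Brute-force scan; stands in for the index lookup of a real mapper."""
    hits = []
    for pos in range(len(ref) - len(read) + 1):
        mm = hamming(read, ref[pos:pos + len(read)])
        if mm <= max_mismatches:
            hits.append((pos, mm))
    return hits


def map_read(read: str, reference: str,
             max_hits: int = 10, max_mismatches: int = 1) -> Tuple[str, List[Hit]]:
    # Convert the reference for + and -, and the read for + and -.
    refs = {"+": c_to_t(reference), "-": g_to_a(reference)}
    reads = {"C2T": c_to_t(read), "G2A": g_to_a(read)}

    # Check the 4 possible (reference, read) combinations.
    hits: List[Hit] = []
    for strand, ref in refs.items():
        for conv, converted in reads.items():
            for pos, mm in find_hits(converted, ref, max_mismatches):
                hits.append((strand, conv, pos, mm))

    # Exclude the read if it maps to more than N (= max_hits) locations.
    if len(hits) > max_hits:
        return "excluded", []
    if not hits:
        return "unmapped", []

    # Keep the hits with the fewest mismatches and report uniqueness.
    best = min(h[3] for h in hits)
    best_hits = [h for h in hits if h[3] == best]
    status = "unique" if len(best_hits) == 1 else "non-unique"
    return status, best_hits
```

For example, `map_read("TTGGATTA", "ACGTTCGGATTACAGGCCTA", max_mismatches=0)` treats the read as a fully converted copy of the reference window `TCGGATTA` and finds a single + strand hit, so the read would be kept as unique.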
Since this is similar or identical to other BS-Seq mappers, it's not clear to me where they gain in accuracy. It must be that:
- they discard low-complexity reads (based on Shannon entropy)
- they discard reads that are not unique.
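
To make the low-complexity filter concrete, here is a minimal sketch of a Shannon-entropy check on a read. The function names and the `min_entropy` cutoff of 1.0 bits are assumptions for illustration; the cutoff BatMeth actually uses is not given here.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy of the base composition, in bits per base."""
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def is_low_complexity(seq: str, min_entropy: float = 1.0) -> bool:
    """True if the read falls below the (hypothetical) entropy cutoff."""
    return shannon_entropy(seq) < min_entropy
```

A homopolymer such as `AAAA` has an entropy of 0 bits, while a read with all four bases in equal proportion reaches the maximum of 2 bits. Bisulfite-converted reads are mostly three-letter, so their entropy sits lower than that of untreated reads, which is worth keeping in mind when picking a cutoff.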