7.0 years ago by
Washington University School of Medicine, St. Louis, USA
If you want to align RNA-seq reads with an aligner designed for DNA mapping, you can create a custom reference that contains the chromosomes plus a database of exon-exon junction sequences. This deals with the issue of mapping reads across exon-exon junctions, but it limits your detection to the junctions that you define in your database. Several groups have used this approach to align RNA-seq reads with BWA and other aligners that are not 'splice-aware'.
If your reads are long (>75 bp), the simplest solution is probably just to use TopHat to align your reads against a standard reference genome. It will do a better job than simply running SOAP2 against the standard reference genome, because SOAP2 on its own cannot place reads that span an exon-exon junction. In addition to TopHat, there are many other 'splice-aware' aligners designed with RNA-seq reads in mind. These include: SpliceMap, MapSplice, hmmSplicer, Supersplat, SOAPsplice, etc.
Of course, there are also many older splice-aware aligners that would produce useful results but are too slow to be practical for the number of reads typical of an RNA-seq experiment. These include: BLAT, Exonerate, Spidey, Splign, etc. They might still be useful for evaluating the next-gen splice-aware aligners on a small test data set.
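To make the exon-exon junction database idea above more concrete, here is a minimal Python sketch of how the junction sequences could be built for one gene. Everything in it is an assumption for illustration rather than anyone's published pipeline: the function names `junction_sequences` and `to_fasta` are hypothetical, exon coordinates are assumed to be 0-based and half-open on the forward strand, and the flank on each side is set to read length minus one so that a read can only align to a junction entry if it actually crosses the junction.

```python
from itertools import combinations


def junction_sequences(chrom_name, chrom_seq, exons, read_length):
    """Yield (name, sequence) for every pair of non-overlapping exons.

    exons: list of (start, end) tuples, 0-based half-open (an assumption).
    All pairwise combinations are generated, so exon-skipping junctions
    are included, not only junctions between neighbouring exons.
    """
    flank = read_length - 1  # a read must cross the junction to fit
    for (s1, e1), (s2, e2) in combinations(sorted(exons), 2):
        if e1 >= s2:
            continue  # overlapping or abutting exons: no intron to splice out
        left = chrom_seq[max(s1, e1 - flank):e1]    # end of upstream exon
        right = chrom_seq[s2:min(e2, s2 + flank)]   # start of downstream exon
        # The name records the donor end and acceptor start so alignments
        # can be traced back to genome coordinates later.
        yield f"{chrom_name}:{e1}-{s2}", left + right


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


if __name__ == "__main__":
    # Toy chromosome: three 10 bp exons separated by 5 bp introns.
    toy = "ACGTACGTAC" + "NNNNN" + "TTGGCCAATT" + "NNNNN" + "GATTACAGAT"
    toy_exons = [(0, 10), (15, 25), (30, 40)]
    print(to_fasta(junction_sequences("chrT", toy, toy_exons, read_length=5)))
```

The resulting FASTA would be appended to the chromosome sequences before indexing with the DNA aligner. Note that reads aligned to a junction entry still need their coordinates converted back to the genome afterwards, and that restricting the pairs to annotated transcripts instead of all combinations keeps the database smaller.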