I have genomic resequencing medical exome data (~4500) sequenced on an Ion Torrent. For alignment, besides TMAP, any thoughts on samtools, bwa-mem, bowtie2, Novoalign, or other alignment software? Thank you :).
I've never dealt with Ion data, but from what I've read, indels are its main sequencing errors (or, if not the main ones, at least common, unlike in Illumina data). So you have to use a mapper that allows for indels, maybe tweak the parameters to lower the gap penalties, and most probably realign afterwards. There are tools, such as PyroTools, and also protocols that are specific to 454 / Ion data. This paper compares mappers on Ion data, though against bacterial genomes (spoiler: there is no clear "overall best mapper"). Finally, BBMap seems like a good fit as well.
I have extensive experience with Ion Torrent data, and it is true that the reads show a high rate of homopolymer errors, as suggested by h.mom. For some samples, 30% of reads (reference RNA-seq) require an indel to align against the reference genome. The Ion Proton system can be thought of as a fancy pH meter: it detects the release of protons and decides which nucleotide has been added based on the number of protons released. In regions with repeats of the same nucleotide (for example AAAAA), the signal is sometimes hard to resolve, so the instrument over- or underestimates the real count. As a result, such reads need an insertion or a deletion to align, depending on whether the sequencer over- or underestimated the number of bases. I would avoid changing the scoring scheme of the alignment. Instead, you should increase the allowed edit distance, both because of the homopolymer errors and because the reads are longer (around 150 bp). You should also increase the maximum number of insertions or deletions allowed in a read, as well as the length of the largest gap allowed. This has helped me.
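To make the homopolymer point concrete, here is a minimal Python sketch that checks whether a read differs from a reference only in the lengths of its homopolymer runs. The function names and the toy sequences are my own assumptions for illustration; this is not part of TMAP or any other aligner, and it compares whole strings rather than doing a real alignment.

```python
from itertools import groupby


def run_length_encode(seq):
    """Split a sequence into (base, run_length) pairs, one per homopolymer run."""
    return [(base, len(list(run))) for base, run in groupby(seq)]


def homopolymer_length_differences(read, reference):
    """Report homopolymer runs whose length differs between read and reference.

    Returns a list of (base, reference_length, read_length) tuples when the two
    sequences have the same order of runs, or None when they differ in some
    other way (for example a substitution that changes the run order).
    """
    read_runs = run_length_encode(read)
    ref_runs = run_length_encode(reference)
    # Same order of bases once every run is collapsed to a single base?
    if [base for base, _ in read_runs] != [base for base, _ in ref_runs]:
        return None
    return [
        (base, ref_len, read_len)
        for (base, ref_len), (_, read_len) in zip(ref_runs, read_runs)
        if ref_len != read_len
    ]


# Hypothetical toy sequences: the reference has a run of five A's.
reference = "GCTAAAAATCGG"
undercalled = "GCTAAAATCGG"    # one A too few: aligns with a deletion
overcalled = "GCTAAAAAATCGG"   # one A too many: aligns with an insertion
substituted = "GCTACAAATCGG"   # a substitution, not a homopolymer error

print(homopolymer_length_differences(undercalled, reference))  # [('A', 5, 4)]
print(homopolymer_length_differences(overcalled, reference))   # [('A', 5, 6)]
print(homopolymer_length_differences(substituted, reference))  # None
```

Neither miscalled read can be placed on the reference without a gap, which is why the limits on indels per read and on gap length matter more for this kind of data than the mismatch settings do.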