Question: Ion Torrent alignment software
bioguy24 (Chicago) wrote, 4.5 years ago:

I have genomic resequencing medical exome data (~4500) sequenced on an Ion Torrent. For alignment, besides TMAP, any thoughts on samtools, bwa-mem, bowtie2, Novoalign, or other alignment software? Thank you :).

alignment ngs • 3.3k views
modified 4.5 years ago by h.mon • written 4.5 years ago by bioguy24

My very first thought is that SAMtools sucks at performing alignments (it is not an aligner; it processes the SAM/BAM files that an aligner produces).

written 4.5 years ago by h.mon
OK, any other thoughts, publications, or user experiences? Thank you :)
written 4.5 years ago by bioguy24

Bowtie2 or bwa-mem.

written 4.5 years ago by lh3

Thank you :)

written 4.5 years ago by bioguy24
h.mon (Brazil) wrote, 4.5 years ago:

I've never dealt with Ion data, but from what I've read, its main sequencing errors are indels (if not the main errors, they are at least common, unlike in Illumina data). So you have to use a mapper that allows for indels, perhaps tweak the parameters to decrease the gap penalty, and most probably realign afterwards. There are tools, such as PyroTools, and also protocols specific to 454 / Ion data. This paper compares mappers on Ion data, though against bacterial genomes (spoiler: there is no clear "overall best mapper"). Finally, BBMap seems like a good fit as well.
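
To make the point about indel-aware mapping concrete, here is a minimal Python sketch. It is not how any real mapper works internally; it only compares an ungapped comparison against a plain edit distance that allows gaps. The sequences and function names are made up for illustration.

```python
def ungapped_mismatches(ref, read):
    """Count mismatches when read and reference are compared position by
    position with no gaps allowed (zip stops at the shorter sequence)."""
    return sum(1 for r, q in zip(ref, read) if r != q)


def edit_distance(ref, read):
    """Levenshtein distance: substitutions, insertions and deletions each
    cost 1. This is a simplified stand-in for a gapped alignment score."""
    prev = list(range(len(read) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, q in enumerate(read, 1):
            cost = 0 if r == q else 1
            curr.append(min(prev[j] + 1,          # deletion from the read
                            curr[j - 1] + 1,      # insertion in the read
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


# Hypothetical example: the read undercalls a 5-base homopolymer as 4 bases.
ref = "ACGTAAAAAGCTTGCA"
read = "ACGTAAAAGCTTGCA"
print(ungapped_mismatches(ref, read))
print(edit_distance(ref, read))
```

Without gaps, every base downstream of the miscalled homopolymer is shifted and most of them show up as mismatches, while a gapped comparison explains the same read with a single deletion.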


Thank you :).

written 4.5 years ago by bioguy24
Ashutosh Pandey (Philadelphia) wrote, 4.5 years ago:

I have extensive experience dealing with Ion Torrent data, and it is true that reads show a high rate of homopolymer errors, as suggested by h.mon. For some samples, 30% of reads (reference RNA-seq) require an indel to align against the reference genome. The Ion Proton system can be thought of as a fancy pH meter that detects the release of protons and decides which nucleotide has been added based on the number of protons released. In regions with repeats of the same nucleotide (for example AAAAA), it is sometimes hard for it to resolve the signal, and it over- or underestimates the real count. As a result, such reads need to be aligned with an insertion or a deletion, depending on whether the sequencer over- or underestimated the number of bases. I would avoid changing the scoring scheme of the alignment. For alignment, you should increase the allowed edit distance, because of the homopolymer errors and also because the reads are longer (around 150 bp). You should also increase the maximum number of insertions or deletions allowed in a read, as well as the length of the largest gap allowed. This has helped me.
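
A small Python sketch of the homopolymer issue described above, using run-length encoding. The function names and sequences are hypothetical, and this is only an illustration of the error pattern, not part of any Ion Torrent tool. If the read and the reference collapse to the same sequence of bases once runs are compressed, they differ only in homopolymer lengths.

```python
from itertools import groupby


def runs(seq):
    """Run-length encode a sequence: 'AAAC' -> [('A', 3), ('C', 1)]."""
    return [(base, len(list(group))) for base, group in groupby(seq)]


def homopolymer_length_diffs(ref, read):
    """Return (base, ref_length, read_length) for every run whose length
    differs, or None if the sequences differ in more than run lengths."""
    ref_runs, read_runs = runs(ref), runs(read)
    if [b for b, _ in ref_runs] != [b for b, _ in read_runs]:
        return None  # a substitution or non-homopolymer indel is involved
    return [(b, n_ref, n_read)
            for (b, n_ref), (_, n_read) in zip(ref_runs, read_runs)
            if n_ref != n_read]


# Hypothetical undercall: five A's in the reference read as four.
print(homopolymer_length_diffs("ACGTAAAAAGC", "ACGTAAAAGC"))
```

An undercalled run like this one has to be aligned with a deletion in the read, and an overcalled run with an insertion.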


Ashutosh Pandey, do you mind if I email you offline to discuss this a bit more? The lab has been running Ion Torrent for about 2 years now and we are moving towards exome next. Thank you :).

modified 4.5 years ago • written 4.5 years ago by bioguy24

Please feel free to email me at ashutoshmits at gmail. We have mostly used it for RNA-seq, so the tricks I described above worked because our goal was to increase the mapping efficiency. I am not sure how allowing more errors during alignment to increase the alignment rate would affect downstream variant calling results. In our case we only use uniquely aligned reads for quantification of expression. The good part about longer reads is that even if you are a little liberal with the alignment, you may still be able to align reads uniquely. I think you will have to perform thorough filtering on your VCF files. I may or may not be making sense right now, but we can talk about it over email.
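
As a rough illustration of checking how many confidently placed reads needed an indel, here is a Python sketch that works on plain SAM text lines. It assumes that a MAPQ threshold is an acceptable proxy for "uniquely aligned"; that is an assumption, since aligners differ in how they assign MAPQ, and the threshold of 20 and the function names are arbitrary choices for the example.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel(cigar):
    """True if the CIGAR string contains an insertion (I) or deletion (D)."""
    return any(op in "ID" for _, op in CIGAR_OP.findall(cigar))


def summarize(sam_lines, min_mapq=20):
    """Return (reads_kept, reads_with_indel) for mapped reads whose MAPQ is
    at least min_mapq. The MAPQ cutoff is an assumed proxy for uniqueness."""
    kept = 0
    with_indel = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag, mapq, cigar = int(fields[1]), int(fields[4]), fields[5]
        if flag & 0x4:
            continue  # read is unmapped
        if mapq < min_mapq:
            continue  # placement too ambiguous to keep
        kept += 1
        if has_indel(cigar):
            with_indel += 1
    return kept, with_indel


# Made-up records, for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t200\t60\t20M1D30M\t*\t0\t0\t*\t*",
    "r3\t0\tchr1\t300\t3\t25M2I23M\t*\t0\t0\t*\t*",
]
print(summarize(example))
```

Comparing these counts before and after loosening the alignment parameters gives a quick sense of how many extra reads are being rescued through indels, which is useful context before filtering variant calls.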

written 4.5 years ago by Ashutosh Pandey