Question

Indel Detection For 454 Resequencing

7

13.8 years ago

User 59 13k

I do some work for a small diagnostics company that has a requirement for small indel detection in 454 data. Those of you familiar with Roche's pipeline will be aware that AVA (Amplicon Variant Analyzer) blissfully ignores indels <3bp. The 454 is also subject to issues with homopolymer runs.

I'd like to try some alternatives to AVA that focus more on indel than SNP discovery (although SNP discovery is still useful).

So far my non-exhaustive list of possibilities is:

Does anyone have any experience of these packages for indel calling with 454 data? Or any additional suggestions? We have looked at certain commercial packages, but they tend to come up slightly short on features, largely ones of scriptability/automation.
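On the homopolymer issue mentioned above, one common sanity check is to flag any candidate indel that falls inside a homopolymer run, since those are the calls most likely to be 454 artefacts. The following is only a minimal sketch of that idea: the function names `homopolymer_runs` and `in_homopolymer` and the `min_len=3` threshold are illustrative assumptions, not part of AVA or any package discussed here.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for every run of identical bases
    that is at least min_len long. Positions are 0-based.
    min_len=3 is an arbitrary illustrative threshold."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


def in_homopolymer(pos, runs):
    """True if the 0-based position pos lies inside any of the runs."""
    return any(start <= pos < start + length for start, _, length in runs)


if __name__ == "__main__":
    amplicon = "ACGTTTTTGCAAAC"  # made-up sequence for illustration
    runs = homopolymer_runs(amplicon)
    print(runs)
    print(in_homopolymer(5, runs), in_homopolymer(1, runs))
```

A candidate indel whose position is flagged this way would not necessarily be discarded, only treated with more suspicion (for example by requiring support from more reads).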

indel next-gen sequencing • 5.0k views

ADD COMMENT • link updated 5.6 years ago by Ram 43k • written 13.8 years ago by User 59 13k

Ram · Answer 1 · 2010-10-30

I'm the author of a variant detector, FreeBayes, which detects both SNPs and short insertions and deletions using BAM format alignment files. I've posted a note about this in another thread on indel detection.

In short, I strongly recommend you don't use GigaBayes, and instead use FreeBayes, which is a major improvement over GigaBayes in terms of interface, performance, and algorithm. (I need to update our documentation to this effect.)

FreeBayes can handle any insertion or deletion short enough to be spanned by a single read and represented in a single alignment record. If you want to detect long insertions and deletions using 454 reads, you should also look into using Mosaik for your alignment step, as it can be configured to allow very long gapped alignments, although there is obviously a computational penalty for doing so. The insertion and deletion support in FreeBayes is still under development. I'm currently working to resolve some confusion about reporting indels in VCF, as well as some algorithmic considerations.
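To illustrate what "represented in a single alignment record" means in practice: in SAM/BAM, an indel within a read shows up as an `I` or `D` operation in that record's CIGAR string. The sketch below pulls those operations out of a CIGAR string using only the standard library. It is not how FreeBayes works internally; the function name `indels_from_cigar` and the example CIGAR are assumptions made for illustration.

```python
import re

# One CIGAR operation: a length followed by an op code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def indels_from_cigar(ref_start, cigar):
    """Return a list of (kind, ref_pos, length) for each I or D operation.

    ref_start is the 0-based leftmost reference position of the alignment.
    For an insertion, ref_pos is the reference base that follows the
    inserted sequence; for a deletion, it is the first deleted base.
    """
    ref_pos = ref_start
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            found.append(("INS", ref_pos, length))
        elif op == "D":
            found.append(("DEL", ref_pos, length))
        if op in REF_CONSUMING:
            ref_pos += length
    return found


if __name__ == "__main__":
    # Hypothetical record: 5 soft-clipped bases, a 1 bp insertion, a 2 bp deletion.
    print(indels_from_cigar(100, "5S20M1I30M2D10M"))
```

An indel longer than the aligner's maximum gap will never appear as a single `I` or `D` here, which is why the choice of aligner settings matters for long events.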

Ram · Answer 2 · 2010-07-13

To get the ball rolling I actually started with the Variant Identification Pipeline. Whilst it seems a good match from the paper, it suffers from a number of issues.

Firstly, the source code does not work out of the box from download, and I had to make code-level changes to remove hard-coded paths. The configuration file, and its subsequent use by the pipeline, is very sensitive to missing or trailing slashes on paths. It relies on BioPerl modules deprecated in the 1.6.0 release (Bio::Tools::BPlite in this case). In one case it also uses sequence names as primary keys in the back-end database, meaning that you cannot re-run the pipeline on data that you have already run through once, as it complains about primary keys already in use.

So I'm hoping for a tool a little bit more robust than this.

Ram · Answer 3 · 2010-10-28

2

13.5 years ago

David ▴ 20

We are also looking for software capable of detecting not only SNPs but also indels for diagnostic applications. On an 8-sample test run (BRCA1 and BRCA2), both VIP and AVA missed a single-nucleotide insertion. As a next step we will look into the 'segemehl' aligner (the one inconvenience is that it doesn't output SAM format).

ADD COMMENT • link 13.5 years ago by David ▴ 20

0

The company in question, I should point out, eventually went for a commercial solution from BioGene which is working very well in their hands, but not, of course, open source or cheap ;)

ADD REPLY • link updated 5.6 years ago by Ram 43k • written 13.5 years ago by User 59 13k

0

There are a lot of good, and probably better, aligners for 454 reads that support SAM.

ADD REPLY • link 13.5 years ago by lh3 33k

score 2 · Answer 4 · 2010-12-02

2

13.4 years ago

Lhl ▴ 760

SWAP454 is designed for 454 sequencing.

ADD COMMENT • link 13.4 years ago by Lhl ▴ 760