What Are The Open Source Alternatives To GATK?
5
10
9.3 years ago
scottsmith1 ▴ 180

The Broad Institute has chosen to have the next version of GATK use "a mixed open/closed-source model": http://gatkforums.broadinstitute.org/discussion/17/gatk-2-0-announcement

What are the best alternatives for developers in the community that do not want to extend someone else's proprietary system?

gatk • 6.9k views
1
9.3 years ago
vdauwera ★ 1.2k

FYI, the announcement and license model linked in the OP is 6 months out of date. Please see this announcement for the latest (more accurate) details: http://gatkforums.broadinstitute.org/discussion/2091/upcoming-changes-to-the-license-the-retirement-of-gatk-lite-by-v-2-4

Note that the GATK programming framework (engine, infrastructure and utility tools) remains fully open source under the MIT license. Developers are free to write their own tools on top of the GATK and distribute them without any restrictions. Only a subset of analysis tools are actually covered by the license restrictions.

9

To clarify or reword, the most interesting bits of GATK are absolutely not open source, which by most accepted definitions requires free access for all uses (see http://www.fsf.org/about/what-is-free-software or http://opensource.org/faq for more). When you have to brag in your post "That’s right, free as in beer", you're accepting that you are not actually releasing open source software that would be free as in speech.

While the "framework" may be open source, the variant calling routines and other methods that are of the greatest interest to genetics practitioners are still encumbered by the restrictive licensing agreements.

0

That's correct, the key analysis tools are released under a proprietary license, as acknowledged in the announcement and ensuing discussion thread in the forum. My comment on the framework was intended for software developers, in keeping with the focus of the OP, rather than for genetic practitioners, who typically have different concerns. What constitutes the "most interesting bits of GATK" depends largely on whether you are a developer or a user.

5

That license is way too long for me to understand, but I have to say it does not look like an open source license. It looks like some sort of mixture where some things are under MIT, some are under Broad and some parts are fully closed.

From past history of other software packages I know that even trivially simple and clear licenses have endless corner cases that make the legality of various applications extremely murky.

0

The GATK license is indeed not an open source license; however, it only covers a subset of the full package (i.e., not the framework). We are aware of the potential for confusion -- nobody likes reading long licenses full of legalese! So we are considering offering a separate package of the framework source, starting with release 2.4 (when the updated license comes into effect). Developers would be able to download this package with the clear understanding that everything in it is fully open under the MIT license -- no more murkiness there at least.

0

Thanks for following up with details and clarifications. Much appreciated!

0

Glad to help.

FYI, we have just released v2.4 and, as suggested earlier, we have created a separate GitHub repo with the framework source, to make things entirely non-murky for developers.

https://github.com/broadgsa/gatk

1
7.0 years ago
SmallChess ▴ 580

FreeBayes is a good open-source alternative. There is no commercial restriction. The pipeline is simpler.

1
5.0 years ago
vdauwera ★ 1.2k

Update: we are returning GATK to a fully open-source license (BSD 3-clause). For more details, see the press release from Broad and my post on the GATK blog.

0
7.6 years ago

There is a recent duplicate of this question that has very interesting information:

Open Source Compliant Diploid Variant Caller comparable to GATK Haplotype Caller?

0
5.0 years ago

GATK is a pretty big toolset, so it's hard to replace all of it in one fell swoop. But as far as variant-calling goes, you can use BBMap's "callvariants.sh" to get better results (from my tests) in a tiny fraction of the time. The BBMap package is fully open-source.

It's also easy to use - "callvariants.sh in=mapped.sam out=vars.vcf ref=ref.fa ploidy=2". I wrote it partly because alternatives were just too slow to seriously consider as part of a high-performance pipeline, and partly because they yielded incorrect results. I needed a fast program that gave correct variant calls and there just weren't any, so I wrote one myself.
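For anyone who wants to drop that one-liner into a pipeline, here is a minimal sketch of a wrapper script. The file names and the four parameters (in, out, ref, ploidy) are the ones quoted in the post; the variable names and the check for the tool on PATH are assumptions added for illustration and are not part of BBMap.

```shell
#!/usr/bin/env bash
set -euo pipefail

# File names taken from the command quoted in the post; adjust to your data.
in_sam="mapped.sam"
out_vcf="vars.vcf"
ref_fa="ref.fa"

# Build the command as an array so paths with spaces survive quoting.
cmd=(callvariants.sh "in=${in_sam}" "out=${out_vcf}" "ref=${ref_fa}" "ploidy=2")

# Assumption: callvariants.sh is expected on PATH. If it is missing,
# print the command that would have been run instead of failing.
if command -v callvariants.sh >/dev/null 2>&1; then
  "${cmd[@]}"
else
  echo "callvariants.sh not found on PATH; would run: ${cmd[*]}" >&2
fi
```

Building the command as an array keeps the quoting intact, and the fallback branch makes it possible to check the exact invocation before BBMap is installed.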
