Question: Does anyone use Python for variant calling?
mhmtgenc85 (30), Turkey, wrote 2.3 years ago:

I have raw NGS data and would like to take a FASTQ file all the way to a VCF file with a variant calling workflow, using Python for every step. Which tools can I use to process my FASTQ file through to VCF and then annotate my variants? Thanks in advance. By the way, I have to use Python. That is my professor's order :/

snp variant calling ngs python • 1.7k views

As much as I love python, that's a stupid requirement -_-;

Your Professor should be teaching you to find the best variant calling algorithm, not the one with the least number of curly brackets. Does subprocessing other tools count? :P

Reply by John (12k), written 2.3 years ago, modified 2.3 years ago

mypythonvariantcaller.py

import subprocess, shlex
subprocess.call(shlex.split('java -jar GenomeAnalysisTK.jar ... ...'))
Reply by WouterDeCoster (32k), written 2.3 years ago

With all do respect, questioning his professor's IQ level is neither the right way to solve the issue, nor the proper way to work with a boss!

Professors give orders all the time; your job is to assess whether the task is plausible and to prepare scientific arguments for why it would or wouldn't work (as in this case).

Reply by H.Hasani (590), written 2.3 years ago, modified 2.3 years ago

due* respect. While I agree with your second statement, John was not questioning the Prof's IQ - he was merely remarking that it was unlikely the Prof seriously meant to enforce a language constraint on reinventing the wheel for a thoroughly solved challenge - if you look at it closely, that does sound stupid.

Reply by RamRS (17k), written 2.3 years ago

What if this is a "learn python" (the hard way) exercise? Variant calling may just happen to be the end point.

Reply by genomax (55k), written 2.3 years ago

Devon Ryan (84k), Freiburg, Germany, wrote 2.3 years ago:

You wouldn't do everything in python, that'd be a waste of CPU cycles and your time programming. Rather, you'd use something like snakemake to build a convenient python-based pipeline. It's quite likely that this is what your professor meant.

WouterDeCoster (32k), Belgium, wrote 2.3 years ago:

Perhaps Platypus is a solution, a variant caller written (partially) in python: http://www.well.ox.ac.uk/platypus

I don't know what your position in this research is, but blindly following your professor's orders is not good science. Be critical and check alternatives. Most people use GATK AFAIK, so don't make it too hard on yourself by using something exotic.

vchris_ngs (4.5k), Seattle, WA, USA, wrote 2.3 years ago:

I totally agree with John: if the requirement is to learn Python and how to code in it, there is no point in reinventing the wheel. You can write a processing script in Python, but that takes its own time, and your professor should understand that. It will not be novel work, just a processing workflow, and the major part will be subprocesses calling BWA, GATK or downstream variant annotation tools. Devon is also correct about the wasted CPU cycles.

In that case I would look for an existing Python pipeline script that meets my requirements, test it, and show the results to the boss. That is how it will work: you have to learn in depth which tools you need, what you are using at each step of variant calling, and why you use them. That is more important than any processing script in any scripting language, unless your workplace strictly mandates a particular language. So take a look at the links below:

variant_calling_pipeline

gatk_varcall

PHEnix

Enjoy!

WouterDeCoster (32k), Belgium, wrote 2.3 years ago:

In addition, if you are going to do or have to do everything in python, will you write an aligner in python? I see you state that you start with fastq files.

mhmtgenc85 (30), Turkey, wrote 2.3 years ago:

Thanks to you all. I know this situation with my professor is kind of weird. But as you suggest, I could subprocess other tools, try an analysis, and show the results to my professor, which might convince him. So could you suggest tools that are written in Python so that I can start with them?

My workflow will first need an aligner, like BWA, then a tool like samtools to convert the BWA output to BAM and BAI format, then bcftools to produce the VCF, and lastly an annotator like SnpEff and annovar.

I hope you can help me as you did with your previous answers. In the meantime I will try the other links and ideas of yours. Thanks again.
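
The workflow described above (align, sort and index, call variants) can be sketched as a small Python driver that only assembles and launches the external tools. This is a rough sketch, not a tested pipeline: the file names and the helper names `build_steps` and `run_steps` are hypothetical, it assumes `bwa`, `samtools` (1.3 or later) and `bcftools` are on the PATH and that the reference was already indexed with `bwa index`, and it leaves out annotation because the SnpEff or ANNOVAR invocation depends on the database you install.

```python
import shlex
import subprocess


def build_steps(reference, fastq, prefix):
    """Return (description, shell command) pairs for one single-end sample."""
    ref, fq = shlex.quote(reference), shlex.quote(fastq)
    bam = shlex.quote(prefix + ".sorted.bam")
    vcf = shlex.quote(prefix + ".vcf")
    return [
        # bwa mem writes SAM to stdout; samtools sort reads it from stdin ("-").
        ("align and sort", f"bwa mem {ref} {fq} | samtools sort -o {bam} -"),
        # Writes the .bai index next to the sorted BAM file.
        ("index", f"samtools index {bam}"),
        # Pile up reads, then call with the multiallelic caller (-m), variant sites only (-v).
        ("call variants", f"bcftools mpileup -f {ref} {bam} | bcftools call -mv -o {vcf}"),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it through the shell only when dry_run is False."""
    for description, command in steps:
        print(f"[{description}] {command}")
        if not dry_run:
            subprocess.run(command, shell=True, check=True)


if __name__ == "__main__":
    # Hypothetical file names; with dry_run left at True nothing is executed.
    run_steps(build_steps("ref.fa", "sample1.fastq", "sample1"))
```

Keeping the command construction separate from execution means the commands can be printed and checked before anything runs, which also makes it easy to show the professor exactly which tool does which step.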

The ones I suggested above are themselves Python workflows. You can use such a workflow directly and batch-run it from a Python script for all your samples.
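
A batch run over all samples could look roughly like the sketch below. It is only an illustration: `find_samples` and `run_batch` are made-up helper names, the `.fastq` suffix and one-file-per-sample layout are assumptions, and `run_one` stands in for whatever call launches the workflow you picked.

```python
from pathlib import Path


def find_samples(fastq_dir, suffix=".fastq"):
    """Map sample name -> FASTQ path for every file ending in `suffix`."""
    samples = {}
    for path in sorted(Path(fastq_dir).iterdir()):
        if path.is_file() and path.name.endswith(suffix):
            samples[path.name[: -len(suffix)]] = path
    return samples


def run_batch(samples, run_one):
    """Call run_one(name, path) per sample and collect failures instead of stopping."""
    failed = {}
    for name, path in samples.items():
        try:
            run_one(name, path)
        except Exception as error:  # keep going so one bad sample does not block the rest
            failed[name] = str(error)
    return failed
```

Here `run_one` would typically wrap a `subprocess.run(..., check=True)` call to the chosen pipeline, so a non-zero exit status ends up in the returned `failed` dictionary rather than aborting the whole batch.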

Reply by vchris_ngs (4.5k), written 2.3 years ago

Zaag (640), Amsterdam, wrote 2.3 years ago:

Maybe have a look at this (scroll down for a BWA-samtools workflow example)

http://snakemake.bitbucket.org/snakemake-tutorial.html

From the intro: Snakemake offers a definition language that is an extension of Python with syntax to define rules and workflow specific properties
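
To give a feel for what a "rule" buys you, here is a plain-Python illustration of the make-style idea behind rule-based workflows. This is not Snakemake syntax or its implementation; the `Rule` class, `is_outdated` and `plan` are invented for this sketch, and it assumes the rules are listed in dependency order.

```python
import os
from dataclasses import dataclass


@dataclass
class Rule:
    """One workflow step: files it reads, files it writes, and a shell command."""
    name: str
    inputs: list
    outputs: list
    command: str


def is_outdated(rule):
    """True if any output is missing or older than the newest existing input."""
    if not rule.outputs or not all(os.path.exists(p) for p in rule.outputs):
        return True
    input_times = [os.path.getmtime(p) for p in rule.inputs if os.path.exists(p)]
    newest_input = max(input_times, default=0.0)
    oldest_output = min(os.path.getmtime(p) for p in rule.outputs)
    return oldest_output < newest_input


def plan(rules):
    """Return names of rules that need to run, assuming dependency order."""
    stale, to_run = set(), []
    for rule in rules:
        # A rule also reruns when an earlier rule is about to rewrite one of its inputs.
        if is_outdated(rule) or stale.intersection(rule.inputs):
            to_run.append(rule.name)
            stale.update(rule.outputs)
    return to_run
```

With rules for alignment and variant calling, touching the FASTQ file would put both steps back in the plan, while an untouched, finished sample yields an empty plan. A real workflow manager adds much more on top (wildcards, parallel jobs, cluster support), which is the reason to use one rather than roll your own.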
