Question: Is there any sample code with a tutorial on how to align a 1000 bp gyrase gene sequence against a database of them in Python or Java?
Asked 4.5 years ago by vassialk (190), Belarus:

Is there any sample code with a tutorial on how to align a 1000 bp (gyrase) gene sequence against a database of such sequences in Python or Java? I need to write code that aligns an input sequence against a database of known sequences from several classes (100 items per class) and outputs reports with variants and analysis charts. Can Biopython help with this task, or should I look for other libraries or switch to R? Thank you.

sequencing snp alignment sequence • 1.4k views

I'm having a hard time understanding the question...  Can you try to be clearer about what your input sequence and database of sequences look like?

written 4.5 years ago by Jautis (280)

Input: the DNA gyrase gene cut out of tuberculosis Illumina NGS data. Database: the relevant genes with a known resistance status. Thanks.

written 4.5 years ago by vassialk (190)
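
A minimal sketch of the setup described above: a FASTA database whose headers carry a resistance label, grouped by class. The header layout (`>id|status`), the names `read_labeled_fasta` and `group_by_label`, and the toy records are assumptions made for illustration and do not come from the thread. For real files, Biopython's `Bio.SeqIO.parse` reads FASTA; this version uses only the standard library so it runs anywhere.

```python
import io
from collections import defaultdict


def _split_header(header, chunks):
    # Assumed header layout: "record_id|label"; a missing label becomes "unknown".
    record_id, _, label = header.partition("|")
    return record_id.strip(), (label.strip() or "unknown"), "".join(chunks)


def read_labeled_fasta(handle):
    """Yield (record_id, label, sequence) tuples from a FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _split_header(header, chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield _split_header(header, chunks)


def group_by_label(records):
    """Map each label to a list of (record_id, sequence) pairs."""
    groups = defaultdict(list)
    for record_id, label, sequence in records:
        groups[label].append((record_id, sequence))
    return dict(groups)


# Hypothetical toy database; real gyrase references would be about 1000 bp each.
EXAMPLE_FASTA = """>ref1|resistant
ACGTAC
GTTA
>ref2|susceptible
ACGTACGTAA
>ref3
ACGT
"""

database = group_by_label(read_labeled_fasta(io.StringIO(EXAMPLE_FASTA)))
for label, entries in sorted(database.items()):
    print(label, [record_id for record_id, _ in entries])
```

Keeping the label next to each sequence means later steps can report which resistance class the closest references belong to.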

Just a few more clarifying questions:

What is your goal after aligning the sequences, and do you need to use Python? It sounds to me like you already have a sequence you have constructed from your NGS data and now need to do a large-scale multiple alignment, perhaps for distance-based analyses such as phylogenetic trees. Is that correct? Or are you looking to do read mapping, with something like the BWA aligner, against a known set of reference sequences?

written 4.5 years ago by Steven Lakin (1.4k)

I need to find the differences between the input and the database sequences and generate a clear, meaningful report. Thanks.

modified 4.5 years ago • written 4.5 years ago by vassialk (190)
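
One possible reading of "find differences and generate a report" is sketched below. It assumes the query and every database sequence have already been aligned to the same length, which is a simplification and not something stated in the thread. The names `list_differences` and `build_report`, the tab-separated layout, and the toy sequences are all hypothetical.

```python
def list_differences(query, reference):
    """Return (1-based position, reference base, query base) for each mismatch.

    Assumes both sequences are already aligned to the same length.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, query_base)
        for position, (query_base, ref_base) in enumerate(zip(query, reference), start=1)
        if query_base != ref_base
    ]


def build_report(query, database):
    """Build a tab-separated report, closest references first.

    `database` is a list of (record_id, label, aligned_sequence) tuples.
    """
    rows = []
    for record_id, label, sequence in database:
        diffs = list_differences(query, sequence)
        rows.append((len(diffs), record_id, label, diffs))
    rows.sort(key=lambda row: (row[0], row[1]))

    lines = ["reference\tstatus\tn_diff\tdifferences"]
    for n_diff, record_id, label, diffs in rows:
        # Each difference is written as <reference base><position><query base>.
        described = ",".join(f"{ref}{pos}{alt}" for pos, ref, alt in diffs) or "-"
        lines.append(f"{record_id}\t{label}\t{n_diff}\t{described}")
    return "\n".join(lines)


# Hypothetical toy data standing in for the gyrase query and references.
query = "ACGTTCGTAA"
toy_database = [
    ("ref1", "resistant", "ACGTTCGTTA"),
    ("ref2", "susceptible", "ACGTACGTAA"),
    ("ref3", "susceptible", "ACGAACGTAA"),
]

report = build_report(query, toy_database)
print(report)
```

The per-reference difference counts in this table are the kind of numbers that could then be passed to a plotting library for charts.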

So what you want is a variant caller, then? And "a meaningful, nice report" is still super vague.

written 4.5 years ago by Jautis (280)

If you want SNPs, then yes, a variant calling pipeline like this one would be appropriate: http://bcb.io/2013/05/06/framework-for-evaluating-variant-detection-methods-comparison-of-aligners-and-callers/

If you want straight differences (global pairwise differences), then you'll want something like one of the Clustal programs instead. In any case, there are already tools out there that do both of these things, so you shouldn't spend your time recoding them in Python unless you need to do that for a project of some sort.

written 4.5 years ago by Steven Lakin (1.4k)
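
To illustrate what "global pairwise differences" means in practice, here is a small Needleman-Wunsch sketch with a linear gap penalty, followed by a helper that lists substitutions, insertions, and deletions in reference coordinates. It is not the Clustal algorithm or the linked pipeline; the function names `global_align` and `call_differences` and the scoring values (match 1, mismatch -1, gap -2) are assumptions chosen for illustration. For hundreds of 1000 bp references, a compiled implementation such as Biopython's `Bio.Align.PairwiseAligner` would likely be a better fit than this pure-Python loop.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def call_differences(aligned_ref, aligned_query):
    """List (reference position, kind, detail) for every non-matching column."""
    differences = []
    ref_pos = 0
    for ref_base, query_base in zip(aligned_ref, aligned_query):
        if ref_base != "-":
            ref_pos += 1
        if ref_base == query_base:
            continue
        if ref_base == "-":
            differences.append((ref_pos, "insertion", query_base))
        elif query_base == "-":
            differences.append((ref_pos, "deletion", ref_base))
        else:
            differences.append((ref_pos, "substitution", f"{ref_base}>{query_base}"))
    return differences


# Toy example: the query lacks the "C" found at reference position 2.
aligned_ref, aligned_query, alignment_score = global_align("ACGT", "AGT")
print(aligned_ref, aligned_query, alignment_score)
print(call_differences(aligned_ref, aligned_query))
```

The table has one cell per pair of positions, so two 1000 bp sequences need about a million cells; that is fine for a single comparison but adds up across a whole database.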

Thank you, I'll try that. Still, in a case like this I would prefer to write my own code with good libraries so that I can control the process.

written 4.5 years ago by vassialk (190)

I would suggest at least starting with the published pipeline and then modifying it or creating your own if the output doesn't make sense for your questions. 

written 4.5 years ago by Jautis (280)

Thank you. The only way forward is to try several approaches and compare the results. I need Python or Java code that uses the Bio[Language] libraries (Biopython or BioJava).

written 4.5 years ago by vassialk (190)