Is there any sample code with a tutorial on how to align a 1000 bp gyrase gene sequence against a database of them in Python or Java?
7.2 years ago
vassialk ▴ 200

Is there any sample code with a tutorial on how to align a 1000 bp (gyrase) gene sequence against a database of such sequences, in Python or Java? I need to write code that aligns an input sequence against a database of known sequences from several classes (100 items per class) and outputs reports with variants and analysis charts. Can Biopython help with this task, or should I look for other libraries or switch to R? Thank you.

sequence sequencing alignment SNP • 2.1k views

I'm having a hard time understanding the question... Can you try to be clearer about what your input sequence and database of sequences look like?

Input: a DNA gyrase gene sequence cut from tuberculosis Illumina NGS data. Database: the relevant genes with known resistance status. Thanks.

Just a few more clarifying questions:

What is your goal after aligning the sequences, and do you need to use Python? It sounds to me like you already have a sequence you have constructed from your NGS data and now need to do a large-scale multiple alignment, perhaps for distance metrics like phylogenetic trees? Is that correct? Or are you looking to do a read mapping, with something like a BWA aligner to a known set of reference sequences?

I need to find the differences between the input and the database sequences and generate a meaningful, nice report. Thanks.

So what you want is a variant caller, then? And "a meaningful, nice report" is still super vague.

If you want SNPs, then yes, a variant-calling pipeline like this one would be appropriate. If you want straight differences (global pairwise differences), then you'll want something like one of the Clustal programs instead. In any case, tools already exist that do both of these things, so you shouldn't spend your time recoding them in Python unless you need to for a project of some sort.
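
For anyone who does need to do the "global pairwise differences" part in code, here is a minimal sketch of the idea, not a replacement for the existing tools. It is a plain-Python Needleman-Wunsch global alignment with a linear gap penalty, followed by a listing of the positions where the two sequences differ. The function names (`global_align`, `list_differences`) and the scoring values are illustrative assumptions, not anything from a published pipeline. In practice Biopython's `Bio.Align.PairwiseAligner` covers the alignment step and will be much faster.

```python
def global_align(ref, query, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    The scoring values are arbitrary illustrative defaults.
    Returns (aligned_ref, aligned_query, score).
    """
    n, m = len(ref), len(query)
    # score[i][j] = best score for aligning ref[:i] with query[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if ref[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two bases
                score[i - 1][j] + gap,      # gap in query (deletion)
                score[i][j - 1] + gap,      # gap in ref (insertion)
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if ref[i - 1] == query[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                a.append(ref[i - 1])
                b.append(query[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            a.append(ref[i - 1])
            b.append("-")
            i -= 1
        else:
            a.append("-")
            b.append(query[j - 1])
            j -= 1
    return "".join(reversed(a)), "".join(reversed(b)), score[n][m]


def list_differences(aligned_ref, aligned_query):
    """List (ref_position, ref_base, query_base) for every differing column.

    Positions are 1-based in the ungapped reference; an insertion is
    reported at the last reference position before it (0 at the start).
    """
    diffs = []
    pos = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            pos += 1
        if r != q:
            diffs.append((pos, r, q))
    return diffs


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    aligned_ref, aligned_query, s = global_align("ACGTACGT", "ACGTTCGT")
    print(aligned_ref)
    print(aligned_query)
    print(s, list_differences(aligned_ref, aligned_query))
```

The dynamic-programming table is quadratic in sequence length, which is fine for a single 1000 bp gene against a few hundred entries but is the reason dedicated aligners exist for anything larger.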

Thank you, I'll try that, though in this case I would prefer to write my own code with good libraries so that I can control the process.

I would suggest at least starting with the published pipeline and then modifying it or creating your own if the output doesn't make sense for your questions.

Thank you. The only way forward is to try several approaches and compare the results. I need Python or Java code that uses Bio[Language] libraries.
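
As a rough illustration of the reporting side of the question, the sketch below summarizes how far an input sequence is from each class in a database. It assumes the sequences have already been aligned to the same length (for example, taken from a multiple alignment), and that each database entry is a hypothetical `(name, class_label, aligned_sequence)` tuple. The function names, the tab-separated output layout, and the toy data are all made up for the example.

```python
from collections import defaultdict


def count_mismatches(a, b):
    """Count differing columns between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def class_report(query, database):
    """Return one tab-separated summary line per class.

    `database` is an iterable of (name, class_label, aligned_sequence).
    Each line gives the class, the number of entries, the mean number of
    differences from the query, and the closest entry in that class.
    """
    per_class = defaultdict(list)
    for name, label, seq in database:
        per_class[label].append((count_mismatches(query, seq), name))

    lines = []
    for label in sorted(per_class):
        hits = sorted(per_class[label])  # smallest difference first
        mean = sum(d for d, _ in hits) / len(hits)
        best_d, best_name = hits[0]
        lines.append(
            f"{label}\tn={len(hits)}\tmean_diff={mean:.2f}"
            f"\tclosest={best_name} ({best_d})"
        )
    return lines


if __name__ == "__main__":
    # Toy data for illustration; real entries would be ~1000 bp each.
    db = [
        ("s1", "susceptible", "ACGTACGT"),
        ("s2", "susceptible", "ACGTACGA"),
        ("r1", "resistant", "ACGTTCGT"),
        ("r2", "resistant", "ACCTTCGT"),
    ]
    for line in class_report("ACGTACGT", db):
        print(line)
```

Counting raw mismatches is only a crude distance. Whether it says anything about resistance status depends on which specific positions differ, which is what a variant-calling step would report.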
