Question: How to run multiple alignment and SNP-calling of WGS data in .gb and .fasta using Python or Ruby/Java or any free software?
3.9 years ago, fashiondesignrussian (50), Belarus, wrote:

How can I run multiple alignment of WGS data in .gb and .fasta formats using Python or Ruby/Java? Please recommend some packages and tutorials. I could not find a tutorial on multiple alignment and SNP calling using Python or Ruby. All I could do was use the trial versions of DNAStar, RidomSeqSphere and NextGene. Is there any similar free software, and a way to do this with a modern language? Thank you, folks.

sequencing alignment software • 1.9k views
modified 3.9 years ago • written 3.9 years ago by fashiondesignrussian (50)

Google for "biopython", specifically tutorials related to multiple alignments, which I recall it can make. SNP calling is normally a different matter, though you could parse the multiple alignment if you really wanted. There may not be a tutorial for it, so just figure it out.
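To illustrate what "parse the multiple alignment" could look like, here is a minimal sketch in plain Python. It assumes the alignment has already been produced by some MSA tool and saved as aligned FASTA. The function names and the choice to skip gaps and `N` are assumptions made for the example, not part of any library, and this is not a substitute for a real variant caller.

```python
def read_aligned_fasta(text):
    """Parse aligned FASTA text into a {name: sequence} dict (insertion order kept)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def variable_sites(alignment, reference):
    """Yield (column, ref_base, {sample: base}) where a sample differs from the reference.

    Columns are 1-based alignment columns, not genome coordinates.
    Gaps ('-') and ambiguous bases ('N') are skipped (an assumption of this sketch).
    """
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("sequences are not aligned to the same length")
    ref_seq = alignment[reference]
    for i, ref_base in enumerate(ref_seq):
        if ref_base in "-N":
            continue
        diffs = {
            name: seq[i]
            for name, seq in alignment.items()
            if name != reference and seq[i] not in "-N" and seq[i] != ref_base
        }
        if diffs:
            yield i + 1, ref_base, diffs


if __name__ == "__main__":
    # Toy alignment; the record names are made up for the example.
    example = ">ref\nACGTACGT\n>s1\nACGTTCGT\n>s2\nAC-TACGA\n"
    aln = read_aligned_fasta(example)
    for column, ref_base, diffs in variable_sites(aln, "ref"):
        print(column, ref_base, diffs)
```

This only reports mismatching columns. Indels, coverage and base quality, which real SNP callers take into account, are ignored here.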

written 3.9 years ago by Devon Ryan (88k)

I have BioPython and BioRuby, but they are not enough. Can you suggest something more effective?

written 3.9 years ago by fashiondesignrussian (50)

Are there any free analogues of the software I mentioned in my question?

written 3.9 years ago by fashiondesignrussian (50)

The tools you mention are all GUI tools. You say you have BioPython and BioRuby but they are not enough (which is near impossible, seeing how they provide means to work with almost all bioinformatics cmd line tools). Quick question: How much programming experience do you have?

written 3.9 years ago by RamRS (20k)

Yeah, if a GUI is needed then webtools should be used. There are web-based versions for many of the MSA tools.

Edit: Or there's Galaxy, which I presume also provides them.

modified 3.9 years ago • written 3.9 years ago by Devon Ryan (88k)

But why use web-based tools in the first place when it's far more efficient and scalable to learn command line usage?

written 3.9 years ago by RamRS (20k)

It's a question of how many times this needs to be done. If it's just a handful, then there's no point in bothering with any scripting or even the command line. If this needs to be done many times, then absolutely the command line or a specific script is needed.

written 3.9 years ago by Devon Ryan (88k)

Either of those should be sufficient. There are biopython tutorials on creating MSAs. Anything after that you might have to code a bit yourself (or not, it'll depend on what you want to do). Biopython itself is using freely available tools for all of this (biopython is just a convenient wrapper in this case).
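As a rough illustration of the "convenient wrapper" idea, the sketch below calls an external aligner directly with the standard library. MAFFT is used here only as one example of a freely available MSA program (an assumption; the thread does not name a specific tool), and the file names are placeholders. MAFFT prints the alignment on stdout, so the wrapper redirects that into the output file.

```python
import shutil
import subprocess


def mafft_command(input_fasta, executable="mafft"):
    """Build the MAFFT command line; --auto lets MAFFT pick a strategy itself."""
    return [executable, "--auto", str(input_fasta)]


def run_mafft(input_fasta, output_fasta, executable="mafft"):
    """Run MAFFT on input_fasta and write the aligned FASTA to output_fasta."""
    # Fail early with a clear message if the aligner is not installed.
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable} not found on PATH")
    with open(output_fasta, "w") as handle:
        subprocess.run(
            mafft_command(input_fasta, executable),
            stdout=handle,  # the alignment goes to stdout
            check=True,     # raise CalledProcessError on a non-zero exit code
        )
    return output_fasta
```

A call would look like `run_mafft("genes.fasta", "genes.aln.fasta")`, where both file names are hypothetical. The same pattern should work for other command-line aligners once the command list is adjusted.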
 

written 3.9 years ago by Devon Ryan (88k)

Thanks, I know how to use Python and Ruby and the functions from these packages. I have spent 7 years programming and studying computer science, and I almost never ask questions about programming methodology and practice. For WGS data with 4.5 million base pairs, this does not seem to be the best option. Can you propose a better solution?

written 3.9 years ago by fashiondesignrussian (50)

Ah, whole genome changes things completely. Most MSA programs are oriented toward proteins (that's what MSA was originally designed around). I'd be surprised if biopython provided any facilities for things of that magnitude. You'll likely need to write your own wrappers. See this thread for pointers on where to start: Help With Multiple Whole Genome Alignment. Aligning Over 400 Whole Genomes
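A wrapper of this kind might look roughly like the sketch below. It assumes MUMmer (`nucmer` and `show-snps`) is installed and on the PATH, and it swaps a true multiple alignment for pairwise alignments of each genome against one reference, which is a simpler and different technique. The function names, file names and output layout are made up for the example.

```python
import subprocess
from pathlib import Path


def nucmer_commands(reference, query, prefix):
    """Return the two MUMmer command lines for one reference/query pair."""
    # nucmer writes its alignments to <prefix>.delta
    align = ["nucmer", f"--prefix={prefix}", str(reference), str(query)]
    # -C: skip ambiguously mapped alignments, -T: tab-delimited, -H: no header
    snps = ["show-snps", "-C", "-T", "-H", f"{prefix}.delta"]
    return align, snps


def call_snps(reference, queries, out_dir):
    """Align each query genome to the reference and save the SNP table per query."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for query in queries:
        prefix = out_dir / Path(query).stem
        align, snps = nucmer_commands(reference, query, prefix)
        subprocess.run(align, check=True)
        snp_file = Path(f"{prefix}.snps")
        with open(snp_file, "w") as handle:
            # show-snps prints its table on stdout
            subprocess.run(snps, stdout=handle, check=True)
        results[str(query)] = snp_file
    return results
```

Something like `call_snps("reference.fasta", ["isolate1.fasta", "isolate2.fasta"], "snps_out")` would then leave one `.snps` table per isolate in `snps_out`. Merging those tables into a single SNP matrix is left out of the sketch.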

written 3.9 years ago by Devon Ryan (88k)

I have Bowtie, RBowtie, Mummer and Mugsy, but I can't say any of them is good and easy enough to use for my goals. It seems that everything free is complete bullshit, or at least partly so. I need a free analogue of RidomSeqSphere and NextGene. That was my question. I don't like to drive nails with a violin and a flute instead of a heavy hammer.

modified 3.9 years ago • written 3.9 years ago by fashiondesignrussian (50)

Bowtie etc. are short-read aligners; you can't expect them to produce whole-genome MSAs. Please see the thread I linked to.
 

written 3.9 years ago by Devon Ryan (88k)