Question: Simulate SFF File
Lee Katz (Atlanta, GA) wrote, 8.6 years ago:

Hi, I am giving a workshop on genome assembly and would like the students to try genome assembly for themselves. However, it will not be feasible to have tens of students assembling a genome on the order of megabases, because the work will likely run on a single server or on desktop computers, and there will be a time constraint. Is there a way to simulate an SFF file for something smaller, like a plasmid? Or to simulate an SFF file based on a neighborhood of a few operons? Thank you.

assembly simulation • 2.4k views
Istvan Albert (University Park, USA) wrote, 8.6 years ago:

Rather than simulating an SFF file (assuming you mean 454's Standard Flowgram Format), you might be better off simulating sequences. There were some answers on that topic here: how-to-produce-simulated-synthetic-sequences
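
For a plasmid-sized reference, simulating sequences can be as simple as sampling random substrings. The following is a minimal sketch, not any particular tool's method: the function name `simulate_reads` and all of its parameters are hypothetical, read lengths are drawn from a normal distribution, and errors are plain substitutions. Real 454 data is dominated by homopolymer-length errors, which this sketch does not model.

```python
import random


def simulate_reads(reference, n_reads, mean_len=400, sd_len=50,
                   error_rate=0.01, min_len=50, seed=0):
    """Sample reads uniformly from `reference` (forward strand only).

    All parameter names and defaults are illustrative assumptions.
    Returns a list of (read_name, sequence) tuples.
    """
    rng = random.Random(seed)
    reads = []
    for i in range(n_reads):
        # Draw a read length, clamped to [min_len, len(reference)].
        length = max(min_len, int(rng.gauss(mean_len, sd_len)))
        length = min(length, len(reference))
        start = rng.randrange(0, len(reference) - length + 1)
        fragment = reference[start:start + length]
        # Introduce substitution errors at the given per-base rate.
        bases = []
        for base in fragment:
            if rng.random() < error_rate:
                base = rng.choice([b for b in "ACGT" if b != base])
            bases.append(base)
        reads.append((f"read_{i}", "".join(bases)))
    return reads


def write_fasta(reads, path):
    """Write (name, sequence) tuples to a FASTA file."""
    with open(path, "w") as handle:
        for name, seq in reads:
            handle.write(f">{name}\n{seq}\n")
```

For example, `write_fasta(simulate_reads(plasmid_seq, 2000), "reads.fna")` would produce a small read set, assuming `plasmid_seq` holds the reference as a string of A, C, G and T.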


Thank you. Following the links from that BioStar page, I found this one, which shows how to simulate a genome: http://sourceforge.net/apps/mediawiki/dnaa/index.php?title=Whole_Genome_Simulation

(reply written 8.6 years ago by Lee Katz)

Installation required many packages that were not listed in the documentation. After I installed everything, it gave a slew of errors in C, which I cannot debug. I'm not sure this is the way to go.

(reply written 8.6 years ago by Lee Katz)

MetaSim works: http://www-ab.informatik.uni-tuebingen.de/software/metasim/welcome.html

(reply written 8.6 years ago by Lee Katz)
lexnederbragt (Oslo, Norway) wrote, 8.6 years ago:

Have you tried Google? You will find at least this one:

Flowsim, http://blog.malde.org/index.php/flowsim/, paper here: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935434/

(http://lmgtfy.com/?q=454+sff+simulation)

Pawel Szczesny (Poland) wrote, 8.6 years ago:

Maybe you could use real data from a trace archive, such as the SRA database (say, a virus, like this one)? You can download FASTQ files (not SFF files), but as far as I know Newbler can read FASTA files with or without quality information (although you might need to rescale the quality scores when quality information is included).
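
If the download is FASTQ, one way to get FASTA plus quality information is to split each record into a FASTA entry and a matching entry of space-separated integer scores. This is a rough sketch under stated assumptions: the function name `fastq_to_fasta_qual` is made up for illustration, the input uses simple four-line records, and qualities are encoded as Phred+33 (the `offset` parameter). Whether the assembler expects any further rescaling, or a particular naming scheme for the two output files, should be checked against its manual.

```python
def fastq_to_fasta_qual(fastq_path, fasta_path, qual_path, offset=33):
    """Split a four-line-per-record FASTQ file into FASTA and QUAL files.

    `offset` is the assumed ASCII offset of the quality encoding
    (33 for Phred+33). Returns the number of records converted.
    """
    count = 0
    with open(fastq_path) as fq, \
            open(fasta_path, "w") as fa, \
            open(qual_path, "w") as ql:
        while True:
            header = fq.readline().rstrip("\n")
            if not header:
                break  # end of file (or a blank line)
            seq = fq.readline().rstrip("\n")
            fq.readline()  # skip the '+' separator line
            qual = fq.readline().rstrip("\n")
            name = header[1:]  # drop the leading '@'
            # Decode each quality character to an integer score.
            scores = [ord(ch) - offset for ch in qual]
            fa.write(f">{name}\n{seq}\n")
            ql.write(f">{name}\n{' '.join(map(str, scores))}\n")
            count += 1
    return count
```

Calling `fastq_to_fasta_qual("reads.fastq", "reads.fna", "reads.qual")` would write the two files side by side; the file names here are only examples.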

Bach wrote, 8.6 years ago:

The new NCBI SRA format allows you to download their SRA archives and convert them to any of the more widely used vendor formats (SFF, FASTQ, Illumina) via their SRA Toolkit; see http://www.ncbi.nlm.nih.gov/books/NBK49294/ for download and manual.

So, search for "virus" or "plasmid" in the SRA (perhaps something like http://www.ncbi.nlm.nih.gov/sra/SRX025865?report=full), download the corresponding SRA, convert it to SFF and you're done.

Note 1: the 1.0b10 toolkit has one "error" flagged by current gcc, which is quickly fixed.

Note 2: using plasmid or virus libraries as examples for assembly may be counterproductive. These can be really nasty because, most of the time, what was sequenced is not one clean DNA but a mixture, and that can confuse assemblers quite a lot.


The NBK link didn't work - do you mean this one? http://www.ncbi.nlm.nih.gov/books/NBK47528/

(reply written 8.3 years ago by Ketil)