Question: What's a good PacBio CLR read simulator?
20 months ago, pmarijon wrote:

What's a good PacBio CLR read simulator?

We could not use:

  • SiLiCO: doesn't generate quality values
  • FastqSim: has a bug where it outputs spurious A's and C's after 6 kbp
  • SimLord: outputs only CCS reads
  • pbsim: does not compile
  • readsim: no control over quality values; we tried to assemble 60x simulated E. coli reads and obtained very fragmented assemblies with Canu 1.4

We were able to use:

  • LongISLND: works, but is quite heavyweight; it requires installing SMRT Analysis (!) to generate model files. Model files aren't provided (this wasn't clear in the documentation), and it took over an hour to generate them myself. Helpful automated scripts were provided, though.
  • BBMap's RandomReads: seems to work and is easy to install (thanks to Brian Bushnell)
  • NPBSS (MATLAB/Octave): seems to work, but may not support multi-line FASTA.
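
If a simulator only accepts single-line FASTA (which may be the case for NPBSS, as noted above), one workaround is to unwrap the reference before simulating. Here is a minimal sketch, assuming plain-text FASTA input; the function name `unwrap_fasta` is hypothetical and not part of any tool in this list:

```python
def unwrap_fasta(lines):
    """Yield FASTA lines with each record's sequence joined onto one line."""
    seq_parts = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if it had sequence.
            if seq_parts:
                yield "".join(seq_parts)
                seq_parts = []
            yield line
        else:
            seq_parts.append(line.strip())
    # Flush the last record.
    if seq_parts:
        yield "".join(seq_parts)


# Small in-memory demonstration (made-up sequences).
wrapped = [">chr1\n", "ACGT\n", "TTGA\n", ">chr2\n", "GG\n"]
for out_line in unwrap_fasta(wrapped):
    print(out_line)
```

For a real reference, the same function can be fed an open file handle and the output written to a new file.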

Do you know of any other PacBio CLR read simulators?

Edit: added BBMap's RandomReads and NPBSS.

modified 4 months ago • written 20 months ago by pmarijon

I think you have a good list. In my case I go for pbsim; maybe you should post the error here so someone can help you.

modified 20 months ago • written 20 months ago by Medhat

As you can see in this compile log, it's a linking problem; I think the build system forgets some files.

written 20 months ago by pmarijon

Did you read this or try the suggestions?

Some systems require unusual options for compilation or linking that the 'configure' script does not know about. Run './configure --help' for details on some of the pertinent environment variables.

You can give configure initial values for configuration parameters by setting variables in the command line or in the environment. Here is an example:

./configure CC=c99 CFLAGS=-g LIBS=-lposix

modified 20 months ago • written 20 months ago by Medhat

This issue explains why the build system is broken and gives an alternative way to build pbsim. Thanks.

written 20 months ago by pmarijon

http://www.nature.com/nrg/journal/v17/n8/full/nrg.2016.57.html

If you cannot access the paper, use sci-hub or gen-lib.

written 20 months ago by stolarek.ir

Thanks,

I read this publication. I didn't test EAGLE, but it has problems with Boost versions above 1.56.

written 20 months ago by pmarijon

Not sure what the intended use case is, but do you want to simulate reads from a specific genome? Otherwise, plenty of original PacBio data is available now. PacBio makes several datasets available here.

written 20 months ago by genomax

20 months ago, Brian Bushnell (Walnut Creek, USA) wrote:

BBMap's RandomReads tool has a PacBio mode. BBMap is already compiled, so just unzip it and it will run if you have Java installed. Usage:

randomreads.sh ref=reference.fa out=reads.fq.gz reads=10000 minlength=500 maxlength=15000 pacbio=t pbmin=0.13 pbmax=0.17

That will generate 10000 reads with lengths ranging from 500 bp to 15 kbp and average error rates from 13% to 17%, following PacBio's typical pattern of relative sub, del, and ins frequencies and lengths.
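
To choose a value for reads= that reaches a target depth (for example the 60x E. coli set mentioned in the question), a rough estimate is reads = coverage × genome size / mean read length. The sketch below assumes read lengths are spread evenly between minlength and maxlength, which may not match RandomReads' actual length distribution, and an E. coli genome of about 4.6 Mbp; the helper name `reads_for_coverage` is hypothetical:

```python
def reads_for_coverage(genome_size, coverage, min_len, max_len):
    """Estimate the read count needed for a target mean coverage.

    Assumes read lengths are uniform between min_len and max_len, so the
    mean length is (min_len + max_len) / 2. This is an assumption, not a
    documented property of any particular simulator.
    """
    total_bases_x2 = 2 * coverage * genome_size
    length_sum = min_len + max_len
    # Ceiling division done in integer arithmetic.
    return -(-total_bases_x2 // length_sum)


# Assumed inputs: ~4.6 Mbp genome, 60x depth, reads from 500 bp to 15 kbp.
n_reads = reads_for_coverage(4_600_000, 60, 500, 15_000)
print(n_reads)
```

Under the same assumptions, the 10000 reads in the command above would amount to roughly 17x on a 4.6 Mbp genome, so reads= would need to be raised for a 60x data set.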

written 20 months ago by Brian Bushnell

This worked well for me and was easy to install, thanks.

written 19 months ago by Rayan Chikhi