Question: What's a good PacBio CLR read simulator?
8 months ago, pmarijon (30) wrote:

What's a good PacBio CLR read simulator?

We could not use:

  • SiLiCO (source): does not generate quality values
  • FastqSim (source): has a bug where it outputs spurious A's and C's after 6 kbp
  • SimLoRD (source): outputs only CCS reads
  • pbsim (source): does not compile
  • readsim (source): no control over quality values; we tried to assemble 60x simulated E. coli reads and obtained very fragmented assemblies with Canu 1.4

We were able to use:

  • LongISLND (source): it works, but it is quite heavy-weight. It requires installing SMRTAnalysis (!) to generate model files. Model files aren't provided (this wasn't clear in the documentation), and it took over an hour to generate them myself. Helpful automated scripts were provided, though.

Do you know of any other PacBio CLR read simulators?


I think you have a good list. In my case I go for pbsim; maybe you should post the error here so someone can help you.

written 8 months ago by Medhat (6.9k)

As you can see in this compile log, it's a linking problem. I think the build system forgets some files.

written 8 months ago by pmarijon (30)

Did you read this or try the suggestions?

Some systems require unusual options for compilation or linking that the configure script does not know about. Run ./configure --help for details on some of the pertinent environment variables.

You can give configure initial values for configuration parameters by setting variables in the command line or in the environment. Here is an example:

./configure CC=c99 CFLAGS=-g LIBS=-lposix

written 8 months ago by Medhat (6.9k)

This issue explains why the build system is broken and gives an alternative way to build pbsim. Thanks.

written 8 months ago by pmarijon (30)

http://www.nature.com/nrg/journal/v17/n8/full/nrg.2016.57.html

If you cannot access the paper, use sci-hub or gen-lib.

written 8 months ago by stolarek.ir (530)

Thanks,

I read this publication. I didn't test EAGLE, but it has trouble with Boost versions newer than 1.56.

written 8 months ago by pmarijon (30)

Not sure what the intended use case is, but do you want to simulate reads from a specific genome? Otherwise, enough original PacBio data is available now. PacBio makes several sets available here.

written 8 months ago by genomax (33k)

8 months ago, Brian Bushnell (14k), Walnut Creek, USA, wrote:

BBMap's RandomReads tool has a PacBio mode. BBMap is already compiled, so just unzip it and it will run if you have Java installed. Usage:

randomreads.sh ref=reference.fa out=reads.fq.gz reads=10000 minlength=500 maxlength=15000 pacbio=t pbmin=0.13 pbmax=0.17

That will generate 10000 reads with length ranging from 500bp to 15kbp and average error rate from 13% to 17%, following PacBio's typical pattern of relative sub, del, and ins frequencies and lengths.
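
If you need a given depth rather than a fixed number of reads (the question mentions 60x E. coli), the read count can be estimated first. This is only a rough sketch: the helper name reads_for_coverage is made up for illustration, and it assumes read lengths are spread evenly between minlength and maxlength, which I have not checked against what RandomReads actually does. The genome size used is that of E. coli K-12 MG1655 (4,641,652 bp).

```shell
#!/bin/sh
# Estimate how many simulated reads are needed for a target coverage.
# Assumption (not taken from the BBMap documentation): read lengths are
# spread evenly between the minimum and maximum, so the mean length is
# simply their midpoint.
reads_for_coverage() {
    genome_size=$1   # reference length in bp
    coverage=$2      # desired average depth, e.g. 60
    min_len=$3       # shortest read length in bp
    max_len=$4       # longest read length in bp
    mean_len=$(( (min_len + max_len) / 2 ))
    total_bases=$(( genome_size * coverage ))
    # Integer ceiling division, so the target depth is not undershot.
    echo $(( (total_bases + mean_len - 1) / mean_len ))
}

# 60x of a 4,641,652 bp genome with the 500-15000 bp range used above.
reads_for_coverage 4641652 60 500 15000
```

The printed number could then be passed as reads= in the command above. Under the same assumption, reads=10000 would give only about 17x on that genome, so it is worth checking the real length distribution of the output before relying on the estimate.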


This worked well for me and was easy to install, thanks.

written 6 months ago by Rayan Chikhi (1.2k)