Question: Metagenome Read Simulators
dlawre14 (United States) wrote, 15 months ago:

I've been researching metagenome read simulators and cannot find a clear consensus on which one to use. The two main ones I've come across are MetaSim and BEAR. Does anyone have experience with these or others, or any advice on a good metagenome simulator?

Brian Bushnell (Walnut Creek, USA) wrote, 15 months ago:

BBMap's simulator, randomreads.sh, has a metagenome mode; just add the flag "metagenome". E.g.

cat bug1.fa bug2.fa bug3.fa > bugs.fa
randomreads.sh ref=bugs.fa out=reads.fq reads=10m len=150 paired metagenome

BBMap also includes another tool, "mutate.sh", which creates strains from a reference by introducing slight differences. This can be useful when simulating metagenomes.
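
To illustrate the idea of deriving a strain from a reference, here is a minimal Python sketch that introduces random substitutions at a given rate. It is not how mutate.sh is implemented; the function name `make_strain` and the `sub_rate` parameter are hypothetical and exist only for this illustration.

```python
import random


def make_strain(seq, sub_rate=0.01, seed=None):
    """Return a copy of seq with random substitutions.

    Hypothetical helper for illustration only; this is not mutate.sh.
    """
    rng = random.Random(seed)
    bases = "ACGT"
    out = []
    for base in seq:
        if base in bases and rng.random() < sub_rate:
            # Pick a different base so every hit is a real substitution.
            out.append(rng.choice([b for b in bases if b != base]))
        else:
            # Leave unmutated bases and ambiguous symbols (e.g. N) unchanged.
            out.append(base)
    return "".join(out)
```

Calling `make_strain(reference, sub_rate=0.01)` several times with different seeds would give a handful of closely related "strains" that could be concatenated with the original before simulating reads.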


As of BBMap v. 36.59, randomreads.sh can simulate metagenomes.

coverage=X will automatically set "reads" to a level that will give X average coverage (decimal point is allowed).
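
As a rough illustration of the arithmetic behind a coverage target: average coverage is total sequenced bases divided by reference length, so the number of reads is roughly coverage × reference length / read length. The function name below is hypothetical. How randomreads.sh counts paired reads is not stated here, so treat the result as an estimate of individual reads.

```python
import math


def reads_for_coverage(coverage, ref_length, read_length):
    """Estimate how many individual reads give the target average coverage.

    Hypothetical helper; an approximation, not randomreads.sh's own logic.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    if coverage < 0 or ref_length < 0:
        raise ValueError("coverage and ref_length must be non-negative")
    # coverage = (reads * read_length) / ref_length, solved for reads.
    return math.ceil(coverage * ref_length / read_length)
```

For example, `reads_for_coverage(30, 50_000_000, 150)` gives 10000000 reads for 30x coverage of 50 Mbp of concatenated references with 150 bp reads.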

metagenome will assign each scaffold a random exponential variable, which decides the probability that a read be generated from that scaffold. So, if you concatenate together 20 bacterial genomes, you can run randomreads and get a metagenomic-like distribution. It could also be used for RNA-seq when using a transcriptome reference.
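
The sampling idea can be sketched in Python as follows. This is an illustration under stated assumptions, not randomreads.sh's code: each scaffold draws a weight from an exponential distribution, and the chance of a read coming from a scaffold is assumed here to be proportional to weight × length, so that the weight behaves like relative coverage. The function names and the rate of 1.0 are hypothetical.

```python
import random


def scaffold_read_probs(lengths, seed=None):
    """Map scaffold name -> probability that a read originates from it.

    lengths: dict of scaffold name -> length in bases.
    Assumption: probability is proportional to (exponential weight * length).
    """
    rng = random.Random(seed)
    # One exponential random variable per scaffold (rate 1.0 is arbitrary).
    weights = {name: rng.expovariate(1.0) for name in lengths}
    mass = {name: weights[name] * lengths[name] for name in lengths}
    total = sum(mass.values())
    return {name: m / total for name, m in mass.items()}


def assign_reads(probs, n_reads, seed=None):
    """Draw a source scaffold for each read and count reads per scaffold."""
    rng = random.Random(seed)
    names = list(probs)
    picks = rng.choices(names, weights=[probs[n] for n in names], k=n_reads)
    counts = {name: 0 for name in names}
    for name in picks:
        counts[name] += 1
    return counts
```

With 20 concatenated genomes, a few scaffolds typically end up with most of the reads and many with very few, which is what gives the output its metagenome-like skew.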

The coverage is decided on a per-reference-sequence level, so if a bacterial assembly has more than one contig, you may want to glue them together first with fuse.sh before concatenating them with the other references.
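
What "gluing" contigs means can be shown with a short Python sketch: the contigs of one assembly are joined into a single sequence, separated by runs of N that mark the junctions. This mimics the idea only; the function name and the default pad length of 300 are assumptions for illustration, not a description of fuse.sh internals.

```python
def fuse_contigs(contigs, pad=300):
    """Join contig sequences into one sequence with N-padding between them.

    Hypothetical helper for illustration; contigs is a list of strings.
    """
    if pad < 0:
        raise ValueError("pad must be non-negative")
    return ("N" * pad).join(contigs)
```

After fusing, each genome is a single reference sequence, so a per-sequence abundance applies to the whole genome rather than to each contig separately.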

written 15 months ago by genomax

OMG, BBmap can do anything!

written 15 months ago by shenwei356

I hadn't thought of BBMap... I swear that thing does everything now. Thank you!

written 15 months ago by dlawre14

@Brian Bushnell: I haven't used BBMap before. Could you please explain how to use it to make a simulated metagenome from 10 whole bacterial genomes, with a different abundance for each genome? Thank you so much.

written 9 months ago by chetana40

You would do something like this:

cat bacteria1.fa bacteria2.fa > all.fa   # list every genome file before the ">"
randomreads.sh ref=all.fa out=reads.fq len=150 paired reads=10000000 metagenome

Then it will generate reads with different coverage for each sequence in the reference.

written 9 months ago by Brian Bushnell

Thanks for the reply, Brian. Is there a way to find out the coverage of each bacterium in my metagenome? Thank you.

modified 9 months ago • written 9 months ago by chetana40

Is there any way to run this so that it's species/isolate aware? The model would be more representative of real communities if it assigned probabilities to species/isolates rather than to individual sequences.

written 6 months ago by Andrew Tritt