Question

randomreads.sh: adding abundances for a metagenome-like distribution

2.5 years ago

eozcan • 0

Hi,

I have 9 genomes and would like to produce a metagenome-like distribution using randomreads.sh. I concatenated the genome FASTA files into one reference file, then ran the command below.

../bbmap/randomreads.sh ref=simplified_catgenome.fasta out1=20M.read1.fastq out2=20M.read2.fastq length=125 paired=t metagenome=t genome=9 reads=20000000

However, I would like to know if there is any way I can define the abundances for each genome based on qPCR results and then produce the reads accordingly. Would there be a way to produce reads from each genome separately with the absolute abundances?

Thank you!

randomreads.sh metagenome bbmap • 1.3k views

ADD COMMENT • link updated 2.5 years ago by GenoMax 141k • written 2.5 years ago by eozcan • 0

Can you not generate the reads independently and mix them as needed? You can then use shuffle.sh to mix the reads randomly giving you a representative metagenome.

ADD REPLY • link 2.5 years ago by GenoMax 141k

I can generate the reads independently. But does shuffle.sh have an option for specifying the abundances? I didn't see any!

ADD REPLY • link 2.5 years ago by eozcan • 0

I was thinking that you would add known amounts of reads together based on your needs and then simply shuffle them so they represent a mixed metagenome.
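
One way to arrive at those "known amounts" is to scale the measured abundances to a total read budget. The following is only a minimal sketch: the file names abundances.tsv and targets.tsv, the three genome names, and the abundance values are hypothetical, and it assumes the qPCR-derived values are meant as proportions of reads. If they are cell or copy counts, they would probably need weighting by genome length first.

```shell
#!/bin/sh
set -eu

total_pairs=20000000   # total read pairs wanted in the mock metagenome

# Hypothetical input: FASTA name <tab> abundance (any unit; only the ratios matter).
printf 'genome_1.fasta\t500\ngenome_2.fasta\t300\ngenome_3.fasta\t200\n' > abundances.tsv

# Scale each abundance to a whole number of read pairs (rounded to nearest).
awk -F'\t' -v total="$total_pairs" '
    { name[NR] = $1; ab[NR] = $2; sum += $2 }
    END {
        for (i = 1; i <= NR; i++)
            printf "%s\t%d\n", name[i], int(total * ab[i] / sum + 0.5)
    }' abundances.tsv > targets.tsv

cat targets.tsv
```

Because each target is rounded independently, the targets may not add up to exactly total_pairs for arbitrary abundance values.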

ADD REPLY • link 2.5 years ago by GenoMax 141k

That's my question! How do I add known amounts of reads? I will produce reads with randomreads.sh from each genome, then shuffle them, right? But at which step exactly do I add the known amount of reads for each genome?

ADD REPLY • link 2.5 years ago by eozcan • 0

score 0 · Answer 1 · 2021-10-12

2.5 years ago

GenoMax 141k

After you generate a certain number of reads for each genome using randomreads.sh, you can then use reformat.sh

Sampling parameters:


samplerate=1            Randomly output only this fraction of reads; 1 means sampling is disabled.
sampleseed=-1           Set to a positive number to use that prng seed for sampling (allowing deterministic sampling).
samplereadstarget=0     (srt) Exact number of OUTPUT reads (or pairs) desired.
samplebasestarget=0     (sbt) Exact number of OUTPUT bases desired.

to select the desired number of reads from each genome (e.g. 1 M from genome_1, 1.2 M from genome_2, etc.). Then cat the sampled files together and run shuffle.sh on the combined file.
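
Put together, the steps above might look roughly like the sketch below. The genome prefixes, file names, the run and simulate_and_sample helpers, and the 2x oversampling factor are all hypothetical choices for illustration. The in2=/out2= arguments to reformat.sh and shuffle.sh are my assumption about how paired files are passed, so check each tool's help output before relying on them. The script only prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
set -eu

# Dry run by default: commands are printed for review. Set RUN=1 to execute them.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Simulate reads for one genome, then keep exactly $2 pairs.
# $1 = genome prefix (expects $1.fasta), $2 = read pairs to keep
simulate_and_sample() {
    g=$1; n=$2
    # Generate twice the target so there is something to sample from.
    run randomreads.sh ref="$g.fasta" out1="$g.raw.1.fq" out2="$g.raw.2.fq" \
        length=125 paired=t metagenome=f reads=$((n * 2))
    run reformat.sh in="$g.raw.1.fq" in2="$g.raw.2.fq" \
        out="$g.sub.1.fq" out2="$g.sub.2.fq" samplereadstarget="$n" sampleseed=1
}

# Hypothetical targets, written as prefix:pairs.
for spec in genome_1:1000000 genome_2:1200000; do
    simulate_and_sample "${spec%%:*}" "${spec##*:}"
done

# Concatenate R1 and R2 in the same genome order so mates stay in step,
# then shuffle the pairs.
run sh -c 'cat genome_*.sub.1.fq > mixed.1.fq'
run sh -c 'cat genome_*.sub.2.fq > mixed.2.fq'
run shuffle.sh in=mixed.1.fq in2=mixed.2.fq out=metagenome.1.fq out2=metagenome.2.fq
```

Since randomreads.sh already takes reads= directly, generating exactly the target number per genome and skipping the reformat.sh step should give an equivalent mix; the subsampling route is mainly useful when the simulated reads already exist.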

ADD COMMENT • link 2.5 years ago by GenoMax 141k

Ah thank you! That is what I was looking for!

ADD REPLY • link 2.5 years ago by eozcan • 0

If you generate the reads individually, remember to turn the metagenome option off (metagenome=f).

ADD REPLY • link 2.5 years ago by GenoMax 141k