Simulated overlapping paired end reads for 16S
6.8 years ago
David ▴ 230

Hi, I have a FASTA file with 16S sequences from 20 different species. I would like to generate overlapping paired-end reads.

I was trying to use the following command from the BBMap package:

randomreads.sh ref=20_species_16S.fasta out=read.R1.fq.gz out2=read.R2.fq.gz paired reads=100000 length=250 mininsert=400 maxinsert=500

Would that randomly assign the same number of reads to each species in the reference file? With 20 species, I would expect at least 5000 reads per species. Is that correct?

bbmap • 1.3k views

Tagging: Brian Bushnell


I think so, but if you want to be absolutely sure, you could generate the reads independently for each of the 20 species, run reformat.sh with samplereadstarget=5000 on each set, and then merge the resulting files. You would also want to set maxinsert=450 to ensure overlap: with length=250, the two mates of a pair overlap only when the insert is shorter than 500 bp.
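
The per-species approach could be sketched roughly as follows. This is only a dry run: it prints the commands instead of executing them. It assumes the reference has already been split into one FASTA per species in a hypothetical species/ directory. The raw.* and sub.* file names, the helper names min_overlap and species_commands, and the 2x oversampling before subsampling are illustrative choices, not anything BBTools requires.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print per-species simulation and subsampling commands.

READ_LEN=250     # read length from the question
MIN_INSERT=400   # mininsert from the question
MAX_INSERT=450   # cap suggested in the answer so that mates overlap
TARGET=5000      # pairs to keep per species

# Smallest overlap between the two mates: 2 * read length - largest insert.
min_overlap() {
  echo $(( 2 * $1 - $2 ))
}

# Print the two commands for one single-species FASTA (hypothetical layout).
species_commands() {
  local fasta=$1
  local name
  name=$(basename "$fasta" .fasta)
  # Simulate more pairs than needed (assumed 2x), then subsample to TARGET.
  echo "randomreads.sh ref=$fasta out=raw.$name.R1.fq.gz out2=raw.$name.R2.fq.gz paired reads=$(( TARGET * 2 )) length=$READ_LEN mininsert=$MIN_INSERT maxinsert=$MAX_INSERT"
  echo "reformat.sh in=raw.$name.R1.fq.gz in2=raw.$name.R2.fq.gz out=sub.$name.R1.fq.gz out2=sub.$name.R2.fq.gz samplereadstarget=$TARGET"
}

# Printed as a comment so the whole output can still be piped to bash.
echo "# minimum mate overlap: $(min_overlap "$READ_LEN" "$MAX_INSERT") bp"

for fasta in species/*.fasta; do
  [ -e "$fasta" ] || continue   # skip when the glob matches nothing
  species_commands "$fasta"
done
```

Piping the printed commands to bash would run them. Afterwards, something like cat sub.*.R1.fq.gz > reads.R1.fq.gz (and the same for R2) merges the per-species files, since concatenated gzip files are still valid gzip and the glob order is the same for both mates.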
