Question: Generate Uniform, Perfect Short Reads From Genome
5.9 years ago, Raygozak (1.3k), State College, PA, Penn State, wrote:

Hi, I would like to generate short reads from a FASTA genome file that cover the whole genome and contain no mutations, indels, or other artifacts.

I tried wgsim with the following command line, but it still generated reads with some indels:

wgsim -e 0 -d 800 -N 700000 -1 270 -2 270 -r 0 -R 0 -X 0 alignments/SS52_FINAL.fasta ss52_1.fastq ss52_2.fastq



Why don't you try MetaSim? It's designed for metagenomics but can also generate reads from a single genome. You can choose options that introduce no errors and produce reads of uniform length.

Written 5.9 years ago by matija.sosic (80)

I suggest you report this to the developer, since it may be a bug. From what I can see, you correctly set to zero all the parameters that could introduce SNPs or indels. I don't know MetaSim, suggested by @matija.sosic, but it might be worth a try. Also, is it possible that some reads that could be mapped perfectly to one position are instead mapped to a different position with an indel? It shouldn't happen, but it also depends on the aligner...

Written 5.9 years ago by Fabio Marroni (2.4k)
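
The aligner question raised above can be settled without running an aligner at all: a truly perfect read must occur verbatim in the reference, on either the forward or the reverse strand. Below is a minimal sketch of that check in plain Python. The helper names (`read_fasta`, `reverse_complement`, `is_perfect`) are hypothetical, not part of wgsim or any other tool, and the substring search is only practical for small genomes.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def read_fasta(lines):
    """Parse an iterable of FASTA lines into {sequence_name: sequence}."""
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the first word of the header
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {k: "".join(v) for k, v in chunks.items()}


def is_perfect(read, genome):
    """True if the read, or its reverse complement, is an exact substring
    of any reference sequence (no mismatch, no indel)."""
    read = read.upper()
    rc = reverse_complement(read)
    return any(read in seq or rc in seq for seq in genome.values())


# Toy reference standing in for a real FASTA file (assumption for illustration).
genome = read_fasta([">chr1", "ACGTACGTTTGACC", "GGATCCA"])
exact_forward = is_perfect("TTGACCGG", genome)
exact_reverse = is_perfect("CCGGTCAA", genome)
with_deletion = is_perfect("TTGACGG", genome)  # one C removed
```

For a real file, passing `open("genome.fasta")` to `read_fasta` should work the same way, since it only iterates over lines. Any simulated read for which `is_perfect` returns `False` was altered by the simulator, independent of how an aligner would place it.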

I verified that it is a bug in wgsim and not an aligner artifact.

Written 5.9 years ago by Raygozak (1.3k)
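
Since the goal in the question is error-free reads with uniform coverage, one alternative to a simulator is to cut the reads directly from the sequence at a fixed step. The following is a rough sketch, not something proposed in the thread. The function names and the `step` parameter are hypothetical; the defaults of 270 and 800 mirror `-1 270 -2 270` and `-d 800` from the command above, on the assumption that `-d` is the outer distance between the two read ends.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def tile_pairs(name, seq, read_len=270, frag_len=800, step=50):
    """Yield (read_id, read1, read2) for fragments starting every `step` bases.

    read1 is the first `read_len` bases of the fragment; read2 is the reverse
    complement of its last `read_len` bases. Nothing is mutated.
    """
    seq = seq.upper()
    last = len(seq) - frag_len
    if last < 0:
        return  # sequence is shorter than one fragment
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        starts.append(last)  # extra fragment so the final bases are covered
    for start in starts:
        frag = seq[start:start + frag_len]
        yield f"{name}_{start}", frag[:read_len], reverse_complement(frag[-read_len:])


def fastq_record(read_id, bases, qual_char="I"):
    """Format one FASTQ record with a constant, arbitrary quality character."""
    return f"@{read_id}\n{bases}\n+\n{qual_char * len(bases)}\n"


# Toy example with tiny lengths so the result is easy to inspect.
pairs = list(tile_pairs("chr1", "ACGTACGTTTGACCGGATCCA",
                        read_len=4, frag_len=10, step=5))
```

Writing `fastq_record(rid + "/1", r1)` and `fastq_record(rid + "/2", r2)` to two files would give paired FASTQ output. Coverage here is deterministic rather than randomly sampled, and positions in the middle of each fragment are only covered if `step` is small relative to `read_len`.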