We are trying to detect the presence of microbial species of interest in patients' sequencing samples. To evaluate the sensitivity and specificity of our method, I am trying to simulate a test data set.
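To make the evaluation goal concrete, here is a rough sketch of how I would score one simulated sample. It assumes I know which species were put into the simulation (`truth`), which species the method reported (`detected`), and the full panel of species the method is able to report (`candidates`). The function name and these three inputs are my own placeholders, not part of any of the tools mentioned here.

```python
def sensitivity_specificity(truth, detected, candidates):
    """Score one simulated sample at the species level.

    truth      -- species actually present in the simulated sample
    detected   -- species the method reported
    candidates -- every species the method could report (assumed to
                  include everything in `truth`)
    """
    truth, detected, candidates = set(truth), set(detected), set(candidates)
    tp = len(truth & detected)                   # present and reported
    fn = len(truth - detected)                   # present but missed
    fp = len((detected - truth) & candidates)    # reported but absent
    tn = len(candidates - truth - detected)      # absent and not reported
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity
```

Averaging these two numbers over many simulated samples with different abundance profiles is what I have in mind for the final evaluation.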
I have done some searching and narrowed it down to three tools: GemSIM, Grinder, and MetaSim.
We have reference 16S/ITS rRNA sequence data as well as some clinical samples.
Ideally, the simulated data set should contain the specified species at various abundances. It should also have a MiSeq-like error rate, which I have not been able to find documented anywhere.
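For the abundance part, this is a minimal sketch of what I mean by "various abundances": draw a log-uniform weight for each specified species and normalise so the percentages sum to 100. The species IDs, the `min_frac` lower bound, and the two-column tab-separated output are all my own assumptions; whether a given simulator accepts this exact layout would need to be checked against its documentation.

```python
import math
import random


def make_abundance_profile(species_ids, seed=0, min_frac=0.001):
    """Return {species_id: relative abundance in percent}.

    Weights are drawn log-uniformly between `min_frac` and 1 so that
    rare and dominant species both appear, then scaled to sum to 100.
    """
    rng = random.Random(seed)  # fixed seed keeps the profile reproducible
    weights = [10 ** rng.uniform(math.log10(min_frac), 0.0) for _ in species_ids]
    total = sum(weights)
    return {sid: 100.0 * w / total for sid, w in zip(species_ids, weights)}


def format_profile(profile):
    """Render the profile as 'id<TAB>percent' lines (hypothetical layout)."""
    return "\n".join(f"{sid}\t{pct:.4f}" for sid, pct in profile.items())
```

Changing the seed gives a different community, so a batch of test samples can be produced by looping over seeds.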
So far I have tested Grinder. Grinder can simulate PCR amplicon sequences when the user provides amplicon sequences in FASTA format, but I do not know where to obtain the amplicon sequences.
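One option I am considering is cutting the amplicons out of our reference sequences myself, given the primer pair. Below is a rough in silico PCR sketch under several assumptions: sequences use the DNA alphabet, primers must match exactly apart from IUPAC degenerate bases, and only the first forward-primer hit is used. The function names are my own, and the primers in any call would be placeholders rather than our real ones.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, IUPAC codes included."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Compile a primer (possibly degenerate) into a regex."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def extract_amplicon(template, fwd_primer, rev_primer):
    """Return the region from the forward primer through the reverse
    primer site (primers included), or None if either primer is absent.

    Both primers are given 5'->3'; the reverse primer is searched for
    as its reverse complement on the template strand.
    """
    template = template.upper()
    fwd = primer_regex(fwd_primer).search(template)
    if fwd is None:
        return None
    rev_site = primer_regex(reverse_complement(rev_primer))
    rev = rev_site.search(template, fwd.end())
    if rev is None:
        return None
    return template[fwd.start():rev.end()]
```

Running this over every reference record and writing the non-`None` results back out as FASTA should give a file I can hand to a simulator as amplicon input. It does not tolerate primer mismatches, so references that real PCR would still amplify could be dropped.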
I am also running GemSIM, which seems extremely time-consuming when generating a large data set (160,000 reads, paired-end).
Does anyone here have experience with a similar project? What software have you used? It is really hard to find similar discussions online, so I thought it was time to start my own.
Thank you in advance for any input.