Question: simulation of DNA-Seq paired-end data
geek_y9.4k wrote:

I would like to simulate DNA-seq paired-end data with common sequencing errors and SNPs (diploid organism).

But I do not want to run the simulation on an entire FASTA file. I have already generated the fragments (300-500 bp) using a certain protocol. Now I want to generate paired-end data from this set of fragments, i.e. read each fragment from both ends and include an error profile and SNPs, so that I can validate the SNP caller I'm interested in.

I would like to know if there is an easier way to do this. Otherwise I will need to spend a lot of time writing it from scratch.

dna-seq simulations • 1.6k views
ADD COMMENTlink modified 4.1 years ago by thackl2.6k • written 4.1 years ago by geek_y9.4k
Dan D6.7k wrote:

ART should be able to do what you need. When you said this:

But I do not want to run the simulation on an entire FASTA file. I have already generated the fragments (300-500 bp) using a certain protocol.

Perhaps I'm misunderstanding, but can't you just make a separate FASTA file from these fragments, with each fragment as its own contig? ART can also do amplicon-sequencing simulation, so maybe that's a feature you could take advantage of.

 

ADD COMMENTlink written 4.1 years ago by Dan D6.7k

Yeah. I have a FASTA file with the fragments as short contigs. I thought that any simulator would again randomly fragment my contigs to generate fragments. I just want to skip that step.

I will check the amplicon module. Thanks.

ADD REPLYlink modified 4.1 years ago • written 4.1 years ago by geek_y9.4k

Hi, unfortunately ART does not simulate SNPs. It's just a sequence data simulator.
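One possible workaround is to put the SNPs into the fragments yourself before handing them to a read simulator. Below is a minimal Python sketch of that idea, not a feature of ART. It assumes you know each fragment's 0-based start position on the genome and have your own list of SNPs; the function name `apply_snps`, the `snps` dictionary layout, and the convention that heterozygous SNPs sit on "haplotype 1" are all hypothetical choices made for illustration.

```python
import random

def apply_snps(frag_seq, frag_start, snps, rng):
    """Apply known SNPs to one fragment drawn from a diploid genome.

    snps maps a 0-based genome position to (alt_base, zygosity),
    where zygosity is "het" or "hom" (an assumed layout for this sketch).
    """
    # Each fragment originates from one of the two haplotypes.
    # Heterozygous SNPs are placed on haplotype 1 only (arbitrary convention).
    haplotype = rng.randrange(2)
    bases = list(frag_seq)
    for pos, (alt, zygosity) in snps.items():
        offset = pos - frag_start
        inside = 0 <= offset < len(bases)
        if inside and (zygosity == "hom" or haplotype == 1):
            bases[offset] = alt
    return "".join(bases)

# Usage: one homozygous and one heterozygous SNP on a toy fragment.
rng = random.Random(0)
snps = {105: ("T", "hom"), 110: ("G", "het")}
print(apply_snps("A" * 20, 100, snps, rng))
```

Because the SNP list is supplied by you, it doubles as the truth set when you later evaluate the SNP caller.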

ADD REPLYlink modified 4.1 years ago • written 4.1 years ago by geek_y9.4k

@Dan D I am not able to simulate 300 bp Illumina paired-end reads using ART.

ADD REPLYlink written 2.3 years ago by dhwani.dholakia0

Look into randomreads.sh from the BBMap suite. It should be flexible and can simulate SNPs, errors, etc.

ADD REPLYlink written 2.3 years ago by genomax65k
thackl2.6k wrote:

Have a look at simNGS. It comes with two binaries: one creates PE fragments (which you already have) and the other generates reads from fragments (instead of from a reference genome).
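To make the "reads from fragments" step concrete, here is a minimal Python sketch of what it involves: read 1 is taken from the start of the fragment and read 2 is the reverse complement of its end. This is not how simNGS works internally. The uniform substitution error rate is a simplifying assumption (real simulators use position-dependent quality profiles), the function names are hypothetical, and fragments are assumed to be uppercase.

```python
import random

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def add_errors(read, error_rate, rng):
    """Substitute each A/C/G/T base with probability error_rate (uniform model)."""
    out = []
    for base in read:
        if base in "ACGT" and rng.random() < error_rate:
            base = rng.choice([b for b in "ACGT" if b != base])
        out.append(base)
    return "".join(out)

def paired_reads(fragment, read_len, error_rate, rng):
    """Read the fragment from both ends without fragmenting it again."""
    r1 = fragment[:read_len]                        # forward read from the 5' end
    r2 = reverse_complement(fragment[-read_len:])   # reverse read from the other end
    return add_errors(r1, error_rate, rng), add_errors(r2, error_rate, rng)

# Usage: a toy 400 bp fragment, 100 bp reads, 1% substitution errors.
rng = random.Random(1)
fragment = "ACGTTGCA" * 50
r1, r2 = paired_reads(fragment, read_len=100, error_rate=0.01, rng=rng)
print(r1)
print(r2)
```

Each input fragment yields exactly one read pair here, so the fragment length distribution you already generated is preserved.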

ADD COMMENTlink written 4.1 years ago by thackl2.6k