To Simulate Heterozygous Variations In Genome
12.3 years ago
Pascal ★ 1.5k

A somewhat technical question: I want to simulate a heterozygous indel starting from a piece of the reference genome (a FASTA file), and I came up with the following workflow. Could you comment on it? Is it a good idea, or is there a better approach?

  1. copy the reference file (let's call it ref.fasta) to ref_insert.fasta,

  2. edit ref_insert.fasta and insert a sequence of, for instance, 10 bp,

  3. concatenate both FASTA files (ref.fasta and ref_insert.fasta) into a new FASTA file (ref_diploid.fasta),

  4. simulate short reads from ref_diploid.fasta with a simulator (my favorite one is wgsim), with the mutation generator disabled,

  5. align the reads and then detect variants.
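Steps 1 to 3 could be sketched roughly as follows in plain Python. This is only a minimal illustration under a few assumptions: the function names (`read_fasta`, `make_diploid`, `write_fasta`), the `_hapA`/`_hapB` suffixes, the toy sequence, and the insertion position are all made up for the example, and the in-memory strings stand in for ref.fasta and ref_diploid.fasta.

```python
import io

def read_fasta(handle):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the record name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def make_diploid(records, target, pos, inserted):
    """Return both haplotypes: A is unchanged, B carries the insertion.

    `pos` is a 0-based offset in the record named `target`; the inserted
    bases are placed immediately before that offset.
    """
    diploid = [(name + "_hapA", seq) for name, seq in records]
    for name, seq in records:
        alt = seq[:pos] + inserted + seq[pos:] if name == target else seq
        diploid.append((name + "_hapB", alt))
    return diploid

def write_fasta(records, handle, width=60):
    """Write (name, sequence) tuples as FASTA with fixed-width lines."""
    for name, seq in records:
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")

# Toy reference (hypothetical); in practice this would be read from ref.fasta.
ref_text = ">chr1\nACGTACGTAC\nGGGTTTAAAC\n"
records = read_fasta(io.StringIO(ref_text))

# Insert 10 bp at offset 10 of chr1 on one haplotype only.
diploid = make_diploid(records, "chr1", 10, "TTTTTTTTTT")

out = io.StringIO()  # stands in for ref_diploid.fasta
write_fasta(diploid, out)
diploid_text = out.getvalue()
```

Giving the two copies distinct names is a choice made for this sketch, so that the haplotypes can be told apart in the concatenated file. For step 4, wgsim exposes the mutation rate through its `-r` option, so something like `-r 0` should switch the mutation generator off; please check this against the usage message of your wgsim version.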

variant genome • 3.2k views
12.3 years ago

You can start this way, but I would leave the mutation generation on (your sample will never be exactly like the reference genome). Also, you can insert several indels of different sizes and/or at random locations, so that you get an idea of how likely the variant caller is to pick them up.
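One possible way to place several indels of different sizes at random locations is sketched below. It is an illustration only: the function `simulate_indels`, its parameters, the 50/50 split between insertions and deletions, and the toy reference are assumptions made for the example. Candidate positions are spaced so that events cannot overlap, and events are applied from right to left so that the recorded coordinates stay valid with respect to the reference.

```python
import random

def simulate_indels(seq, n_indels, max_size, seed=0):
    """Apply random insertions/deletions to `seq` on one haplotype.

    Returns the mutated sequence and a truth list of
    (0-based reference position, "INS" or "DEL", bases).
    """
    rng = random.Random(seed)
    # Candidate sites are max_size + 1 apart, so a deletion of up to
    # max_size bases never reaches the next site.
    candidates = range(1, len(seq) - max_size, max_size + 1)
    positions = sorted(rng.sample(candidates, n_indels), reverse=True)
    truth = []
    for pos in positions:  # right to left: left-hand coordinates never shift
        size = rng.randint(1, max_size)
        if rng.random() < 0.5:
            inserted = "".join(rng.choice("ACGT") for _ in range(size))
            seq = seq[:pos] + inserted + seq[pos:]
            truth.append((pos, "INS", inserted))
        else:
            deleted = seq[pos:pos + size]
            seq = seq[:pos] + seq[pos + size:]
            truth.append((pos, "DEL", deleted))
    return seq, sorted(truth)

# Toy 200 bp reference (hypothetical).
ref = "ACGT" * 50
alt, truth = simulate_indels(ref, n_indels=5, max_size=10, seed=1)
```

The truth list can then be compared with the calls from the variant detection step to estimate how often an indel of a given size is recovered. Writing `ref` and `alt` as two haplotypes into one FASTA file makes every simulated event heterozygous.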


Of course, you're right. Good point, thank you Stefano.

12.3 years ago

At Omixon we developed tools for simulating reads with various mutations. The input is a multi-FASTA reference file and a VCF file. It can simulate Illumina and Ion Torrent reads with a specified read length and, for paired data, a specified distance. We will offer it next month as a free online service, but if you do not want to wait, we can give you a command-line version for personal use. You can fill in the contact form on www.omixon.com.
