Question: Wgsim Mutations In Output After Setting Everything To 0
darxsys (Croatia) wrote, 6.0 years ago:

I was just wondering, is there any useful information on wgsim? A tutorial? Anything? I have been stuck with it for the last 2 weeks and I'm really not sure how to use it. I need it for a project of mine. For example, I downloaded a genome from NCBI and I call wgsim like this:

./wgsim -e 0 -s 0 -N 1000 -1 30 -2 30 -r 0 -R 0 -X 0 -A 0 test_genome_one_row.fa read1.fa read2.fa

With this, I would expect all reads to be identical to parts of the genome, since I set all the error parameters to 0. But somehow I get reads with mutations (or something else, because they don't occur in the original genome). What is going on here? Can somebody please explain wgsim's arguments and how I can really control its behaviour? Thanks!

Istvan Albert (University Park, USA) wrote, 6.0 years ago:

When you simulate with wgsim, the read name contains the genomic coordinates that were used to produce the read. Check those to find the read's origin.
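To make the read-name check concrete, here is a minimal Python sketch. It assumes the name layout `contig_start_end_e1:s1:i1_e2:s2:i2_pairid/mate` (1-based fragment coordinates, then error, substitution, and indel counts for each mate). That layout is my reading of wgsim's output and should be confirmed against your own files; the names `WgsimName`, `parse_wgsim_name`, and `is_error_free` are made up for this illustration.

```python
from typing import NamedTuple, Tuple


class WgsimName(NamedTuple):
    contig: str
    start: int                      # fragment start (assumed 1-based)
    end: int                        # fragment end (assumed 1-based)
    counts1: Tuple[int, int, int]   # (errors, substitutions, indels), first mate
    counts2: Tuple[int, int, int]   # same three counts, second mate
    pair_id: str                    # pair counter, kept as the raw string
    mate: int                       # 1 or 2


def parse_wgsim_name(name: str) -> WgsimName:
    """Parse a read name in the ASSUMED wgsim layout described above."""
    # Drop the FASTQ '@' prefix and anything after the first whitespace.
    name = name.lstrip("@").split()[0]
    body, mate = name.rsplit("/", 1)
    # Split from the right, because the contig name itself may contain '_'.
    contig, start, end, c1, c2, pair_id = body.rsplit("_", 5)
    counts1 = tuple(int(x) for x in c1.split(":"))
    counts2 = tuple(int(x) for x in c2.split(":"))
    return WgsimName(contig, int(start), int(end), counts1, counts2, pair_id, int(mate))


def is_error_free(parsed: WgsimName) -> bool:
    """True if the name reports zero simulated changes for both mates."""
    return sum(parsed.counts1) == 0 and sum(parsed.counts2) == 0
```

If the counts in the names are all zero but reads still fail a comparison against the genome, the comparison itself (strand, coordinates) is the next thing to inspect.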

Note that the program is distributed with samtools but can also be obtained separately from its own repository:

https://github.com/lh3/wgsim

The latter contains more features and is more up to date.


darxsys replied, 6.0 years ago:

Yes, I'm checking that with a Python script which, for one genome, tells me that 450 of 1000 reads have mutations (they don't occur in the original string). And yes, I obtained it from the repo. I'm just confused. Thanks for the help!

Istvan Albert replied, 6.0 years ago:

You can easily verify reads by mapping them against the same genome. The alignment will tell you where and how each read aligns, and that usually matches the name exactly.

darxsys replied, 6.0 years ago:

I know. I just wanted to save time analysing alignments. I mean, if I tell wgsim that I want all errors to be zero, then a simple Python script can just verify whether each read is a part of the input string, which is what I'm doing right now. This program is basically of no use to me if I can't make it behave the way I need it to.
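For anyone trying the same substring check, here is a hedged sketch of one way to write it. It rests on an assumption that is not confirmed in this thread: that in paired-end output one mate of each pair comes from the reverse strand, so an error-free read may only match the genome after reverse-complementing. If that holds, a forward-only test would flag a large share of perfectly good reads. The function names are illustrative only.

```python
from typing import Optional, Tuple

# Complement table for plain nucleotide codes; other IUPAC codes are left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_read(read: str, genome: str) -> Optional[Tuple[str, int]]:
    """Locate a read as an exact substring on either strand.

    Returns (strand, 0-based position on the forward genome string),
    or None if the read matches neither strand exactly.
    """
    read = read.upper()
    genome = genome.upper()
    pos = genome.find(read)
    if pos != -1:
        return ("+", pos)
    # ASSUMPTION: the read may have been taken from the reverse strand.
    pos = genome.find(reverse_complement(read))
    if pos != -1:
        return ("-", pos)
    return None


def read_in_genome(read: str, genome: str) -> bool:
    """True if the read occurs exactly in the genome on either strand."""
    return find_read(read, genome) is not None
```

Reads that still fail this two-strand test are the ones worth mapping with an aligner, since they contain something other than a strand flip.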