Question: Help understanding wgsim_eval.pl output
Daniel Standage (Davis, California, USA) wrote, 4.5 years ago:

I used wgsim to simulate some Illumina reads, and then I mapped the reads back to the original sequence using several different aligners. The wgsim code distribution includes a wgsim_eval.pl script for, I presume, evaluating alignment accuracy. I'm getting output like this...

06x            0 / 149888              149888  0.000e+00
05x            0 / 29                  149917  0.000e+00
04x            0 / 83                  150000  0.000e+00
03x            0 / 0                   150000  0.000e+00
02x            0 / 0                   150000  0.000e+00
01x            0 / 0                   150000  0.000e+00
00x            0 / 0                   150000  0.000e+00

and this.

04x            0 / 138592              138592  0.000e+00
03x            0 / 529                 139121  0.000e+00
02x           18 / 6503                145624  1.236e-04
01x           11 / 1580                147204  1.970e-04
00x          207 / 1202                148406  1.590e-03

As far as I can tell the wgsim documentation doesn't describe the output format. Can anyone explain this output to me?

Tags: bwa, bam, wgsim, alignment
Devon Ryan (Freiburg, Germany) wrote, 4.5 years ago:

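The output format is kind of strange. The first column is the MAPQ bin: MAPQ divided by 10 and rounded down, so "06x" covers MAPQ 60-69 and "00x" covers MAPQ 0-9. The second is the number of incorrect alignments in that bin, so a 0 there is good news, which is why the top bins in your output show 0. This is followed by a "/" and then the total number of alignments in that bin. The fourth value is a cumulative sum of the total number of alignments, running from the highest bin down. The final column is the ratio of the cumulative incorrect alignments over the cumulative total alignments, i.e. the cumulative error rate. In your second table, for example, the 02x row gives 18 / 145624 = 1.236e-04.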
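The column arithmetic can be checked against the second table in the question. The following is a minimal Python sketch, not part of wgsim: the names `parse_report` and `recompute` are made up for illustration, and it assumes each row has exactly the six whitespace-separated fields shown in the question. It re-derives the fourth and fifth columns from the second and third.

```python
# Rows copied verbatim from the second table in the question.
REPORT = """\
04x            0 / 138592              138592  0.000e+00
03x            0 / 529                 139121  0.000e+00
02x           18 / 6503                145624  1.236e-04
01x           11 / 1580                147204  1.970e-04
00x          207 / 1202                148406  1.590e-03
"""


def parse_report(text):
    """Split each row into (label, count, total, cumulative, ratio)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Assumed layout: label, count, "/", total, cumulative total, ratio.
        label, count, _slash, total, cumulative, ratio = line.split()
        rows.append((label, int(count), int(total), int(cumulative), ratio))
    return rows


def recompute(rows):
    """Rebuild the cumulative total and the cumulative ratio, top row first."""
    running_count = 0
    running_total = 0
    rebuilt = []
    for label, count, total, _cumulative, _ratio in rows:
        running_count += count
        running_total += total
        rebuilt.append((label, running_total, "%.3e" % (running_count / running_total)))
    return rebuilt


rows = parse_report(REPORT)
for (label, cum_total, ratio), row in zip(recompute(rows), rows):
    # Compare the recomputed values with what the script printed.
    print(label, cum_total == row[3], ratio == row[4])
```

Every row should report that the recomputed values match the printed ones, which confirms that the last column is a running ratio accumulated from the highest MAPQ bin downward and not a per-bin figure.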

If alignments seem to be misjudged, you may need to tweak the -g setting, unless the aligner is really just not handling the reads well. You can also manually check where wgsim_eval.pl expects a read to align by looking at the read name. The name begins with chromosome_startForward_startReverse, where startReverse is the start position to use if the alignment should have bit 0x10 (reverse strand) set.
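For the manual check, the read name can be split programmatically. This is a rough Python sketch under stated assumptions: the function `expected_position`, the regular expression, and the example read name are all illustrative and not taken from wgsim itself. It only picks out which of the two coordinates in the name applies to a given SAM flag, following the description above.

```python
import re

# Assumed pattern: chromosome_startForward_startReverse_ at the start of the
# read name. The greedy first group lets chromosome names contain underscores.
NAME_PATTERN = re.compile(r"(\S+)_(\d+)_(\d+)_")

REVERSE_STRAND = 0x10  # SAM flag bit for a reverse-strand alignment


def expected_position(read_name, flag):
    """Return (chromosome, expected coordinate) for a simulated read."""
    match = NAME_PATTERN.match(read_name)
    if match is None:
        raise ValueError("read name does not look like a wgsim name: %r" % read_name)
    chromosome = match.group(1)
    start_forward = int(match.group(2))
    start_reverse = int(match.group(3))
    # Use startReverse when bit 0x10 is set, startForward otherwise.
    if flag & REVERSE_STRAND:
        return chromosome, start_reverse
    return chromosome, start_forward


# Hypothetical read name, made up for this example.
name = "chr1_1234_1700_0:0:0_1:0:0_a/1"
print(expected_position(name, 0))
print(expected_position(name, 16))
```

Comparing the returned coordinate with the alignment in the SAM record shows whether a read landed where it was simulated from, which helps when deciding whether a -g adjustment is worth trying.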


Hi Devon,

Thanks for your patient explanation. I'm sorry to say I'm still a little confused by wgsim_eval.pl. I have the following questions:

1) When and how should I use the -p and -g options?
2) How should their output be interpreted?
3) Which column should I focus on if I want the mapping accuracy?

04x            0 / 138592              138592  0.000e+00
03x            0 / 529                 139121  0.000e+00
02x           18 / 6503                145624  1.236e-04
01x           11 / 1580                147204  1.970e-04
00x          207 / 1202                148406  1.590e-03

(Is the final column the "ratio of the cumulative correct alignments over the cumulative total alignments", as you said?)

4) How can I calculate the mapping accuracy?

Looking forward to your reply.
He

Comment written 3.3 years ago by allrev