How To Calculate --Score-Min In Bowtie2
10.5 years ago
Medhat 9.7k

I am trying to use Bowtie2 to map reads to a reference genome, using global (end-to-end) alignment. I need to set --score-min, which is not a single number but a function that I have to specify. I do not know how to construct this function. My read length is 50 bp.

Info about the equation is here.

Any ideas?

mapping bowtie • 6.9k views
10.5 years ago

The equation is generally pretty straightforward. The argument passed to that option has the form Type,intercept,slope, where Type is L (linear), S (square root), G (natural log), or C (constant). Here is the C code that I use to calculate this value in Bison, where rlen is the length of a read and the various parts of the --score-min argument have been parsed and put in a struct called config:

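#include <math.h>   /* sqrt, log */
#include <stdint.h> /* int32_t */

/* config is defined elsewhere and holds the parsed --score-min fields. */
int scoreMin(int32_t rlen) {
    //Return different values, depending on the --score-min function type.
    //The floating-point result is truncated toward zero by the int return.
    if(config.scoremin_type == 'L') { //linear in read length
        return (config.scoremin_intercept + config.scoremin_coef * rlen);
    } else if(config.scoremin_type == 'S') { //square root of read length
        return (config.scoremin_intercept + config.scoremin_coef * sqrt((float) rlen));
    } else if(config.scoremin_type == 'G') { //natural log of read length
        return (config.scoremin_intercept + config.scoremin_coef * log((float) rlen));
    } else { //'C', constant: independent of read length
        return (config.scoremin_intercept + config.scoremin_coef);
    }
}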

This matches how bowtie2 does the calculation. For paired-end reads, the values for the two mates are added together. So, for the default end-to-end alignment with the default --score-min of L,-0.6,-0.6, a 50 bp read would have a minimum score of -0.6 + (-0.6 * 50) = -30.6, which becomes -30 upon casting to int. Of course, what that actually means in terms of edit distance depends on the mismatch, gap extension, and other penalties, so changing those changes how many edits the same threshold allows. Think about what you really want to achieve before changing this.
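To make the arithmetic above concrete, here is a minimal sketch covering only the linear (L) case. The function names are hypothetical helpers written for this illustration; they are not part of bowtie2 or Bison. The mismatch budget assumes that every mismatch costs a flat penalty of 6, which I believe is the maximum mismatch penalty under bowtie2's default --mp setting; check this against your version, and note that gaps and low-quality bases are ignored here.

```c
#include <assert.h>

/* Hypothetical helper: minimum score for a linear --score-min function
 * "L,intercept,coef". The cast truncates toward zero, as described above. */
static int linear_score_min(double intercept, double coef, int rlen) {
    assert(rlen > 0);
    return (int)(intercept + coef * rlen);
}

/* Hypothetical helper: for paired-end reads, the per-mate thresholds
 * are added together. */
static int paired_score_min(double intercept, double coef, int rlen1, int rlen2) {
    return linear_score_min(intercept, coef, rlen1)
         + linear_score_min(intercept, coef, rlen2);
}

/* Hypothetical helper: rough number of mismatches an end-to-end threshold
 * tolerates, ASSUMING a flat per-mismatch penalty and no gaps. */
static int max_mismatches(int min_score, int mismatch_penalty) {
    assert(min_score <= 0 && mismatch_penalty > 0);
    return (-min_score) / mismatch_penalty;
}
```

With L,-0.6,-0.6 and a 50 bp read, linear_score_min gives -30, a 2 x 50 bp pair gives -60, and max_mismatches(-30, 6) gives 5. Lowering the coefficient to -1 would move the single-read threshold to -50, which is the kind of trade-off to weigh before changing the default.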

As an aside, changing --score-min will change the reported MAPQ for the reads, regardless of whether the mapping is actually changed or not (no, that's not documented!).


First, thanks for answering, but what I am asking is when to use L, S, or G, and when to change the values of B and A. I understand how the equation is calculated, but I do not understand how to choose the parameters: when to choose L over G and vice versa, and why -0.6 is used in the global form rather than, say, -1. That is what I am asking.


If you really want to play around with those in a non-arbitrary way, then you need to have simulation data for whatever you're doing and then see how read length affects actual (as opposed to calculated) MAPQ. You can then pick an appropriate equation. That's a fair bit of work, of course, and not something I would really recommend.


+1, this is somewhat helpful now.
