Blog: Fast kmer reverse complement code using bit tricks
gravatar for Rayan Chikhi
3.0 years ago by
Rayan Chikhi1.2k
France, Lille, CNRS
Rayan Chikhi1.2k wrote:

Today, the main developer of the GATB library (edrezen) emailed us the following bit of code. 

u_int64_t revcomp64_v2 (const u_int64_t& x, size_t sizeKmer)
    u_int64_t res = x;

    res = ((res>> 2 & 0x3333333333333333) | (res & 0x3333333333333333) <<  2);
    res = ((res>> 4 & 0x0F0F0F0F0F0F0F0F) | (res & 0x0F0F0F0F0F0F0F0F) <<  4);
    res = ((res>> 8 & 0x00FF00FF00FF00FF) | (res & 0x00FF00FF00FF00FF) <<  8);
    res = ((res>>16 & 0x0000FFFF0000FFFF) | (res & 0x0000FFFF0000FFFF) << 16);
    res = ((res>>32 & 0x00000000FFFFFFFF) | (res & 0x00000000FFFFFFFF) << 32);
    res = res ^ 0xAAAAAAAAAAAAAAAA;

    return (res >> (2*( 32 - sizeKmer))) ;

It computes the reverse complement of a k-mer stored in 64 bits, very efficiently. I don't recall seeing bit tricks applied to revcomp like that :)

Technical details: each nucleotide is encoded in two bits (A=00b; C=01b; G=11b; T=10b). Why this encoding? because it's fast to convert from ascii: (ascii_nucleotide>>1)&3. (G. Rizk found that.)

This code will be included in the next version of the GATB-Core library.

ADD COMMENTlink modified 3.0 years ago by Rob2.0k • written 3.0 years ago by Rayan Chikhi1.2k

that is pretty cool! 

ADD REPLYlink written 3.0 years ago by Istvan Albert ♦♦ 73k
gravatar for Rob
3.0 years ago by
United States
Rob2.0k wrote:

Cool!  Jellyfish uses a similar approach.  Specifically, check out mer_dna.hpp, where Guillaume has implemented a fairly generic version (allowing for different integer widths) of this idea!

ADD COMMENTlink written 3.0 years ago by Rob2.0k

oh indeed, almost identical! good find

ADD REPLYlink written 3.0 years ago by Rayan Chikhi1.2k

Yes, some pretty good code in that file ! And +1 for the static polymorphism of the mer_base class.

ADD REPLYlink modified 3.0 years ago • written 3.0 years ago by edrezen680
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 777 users visited in the last hour