How to generate DNA k-mers of a given length in C?
9.4 years ago
User 4000 ▴ 50

How can I generate all DNA k-mers of a given length? For example, if k=2, there should be 4^2=16 k-mers. I need to know how to do this in C.

c programming • 4.3k views

What is the input here? When I run this code, the only output is the usage message ("Usage main k-size"). How do I get the actual output? For example, if k=2, there should be 4^2=16 k-mers.


Don't add answers unless you're answering the top level question. Use Add Comment or Add Reply instead as appropriate. When you say "this code", we have no idea what you're talking about. Reply to the appropriate answer with your comment.


When I run the code below, the output is just the usage message ("Usage main k-size"). What is the input here? I want the output for k=1, 2 and 3.

#include <stdio.h>
#include <stdlib.h>

#define SET_BASE(a) do { dna[pos]=(a); recursive(dna,pos+1,k); } while(0)

static void recursive(
    char* dna,
    const int pos,
    const int k)
    {
    if(pos==k)
        {
        fputs(dna,stdout);
        fputc('\n',stdout);
        return;
        }
    SET_BASE('A');
    SET_BASE('T');
    SET_BASE('G');
    SET_BASE('C');
    }

int main(int argc,char** argv)
    {
    char* dna;
    int k;
    if(argc!=2)
        {
        fprintf(stderr,"Usage %s k-size\n",argv[0]);
        return EXIT_FAILURE;
        }
    k=atoi(argv[1]);
    if(k<=0)
        {
        fprintf(stderr,"Bad k-size:%s\n",argv[1]);
        return EXIT_FAILURE;
        }
    dna=(char*)malloc(sizeof(char)*(k+1));
    if(dna==0)
        {
        fprintf(stderr,"Out of memory k=%s\n",argv[1]);
        return EXIT_FAILURE;
        }
    dna[k]=0;
    recursive(dna,0,k);
    free(dna);
    return EXIT_SUCCESS;
    }
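In the code above, k is read from the first command-line argument (`argv[1]`); starting the program without an argument takes the `argc!=2` branch and prints only the usage line. As a rough sketch of slightly stricter argument handling than `atoi`, one could parse the argument with `strtol`. The helper name `parse_k` is hypothetical and not part of the original program.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption, not from the thread): parse a k-mer
 * size from a command-line string. Returns the value on success, or -1 if
 * the text is not a positive integer that fits in an int. */
int parse_k(const char *text)
{
    char *end;
    long v;

    assert(text != NULL);
    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || *end != '\0')   /* nothing parsed, or trailing garbage */
        return -1;
    if (errno == ERANGE || v <= 0 || v > INT_MAX)
        return -1;
    return (int)v;
}
```

In `main`, `k=atoi(argv[1]);` could then become `k=parse_k(argv[1]);` with the existing `k<=0` check left in place. Assuming the program is compiled to an executable called `kmers` (a made-up name), running `./kmers 1`, `./kmers 2` and `./kmers 3` should print 4, 16 and 64 lines respectively.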

Hi,

You were supposed to add a comment, not an answer. Please stop adding answers or your account will be temporarily suspended.

9.4 years ago
lh3 32k

For k<=31:

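#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int i, k;
    unsigned long long x, y;
    if (argc == 1) {
        fprintf(stderr, "Usage: %s <k>\n", argv[0]);
        return 1;
    }
    k = atoi(argv[1]);
    /* 1ULL<<(2*k) is only defined for 2*k < 64, hence the k<=31 limit */
    if (k < 1 || k > 31) {
        fprintf(stderr, "k must be between 1 and 31\n");
        return 1;
    }
    /* each x in [0, 4^k) encodes one k-mer, 2 bits per base */
    for (x = 0; x < 1ULL<<(2*k); ++x) {
        /* decode from the low 2 bits upward, so the leftmost base changes fastest */
        for (i = 0, y = x; i < k; ++i, y >>= 2)
            putchar("ACGT"[y&3]);
        putchar('\n');
    }
    return 0;
}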
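The loop above treats every integer x in [0, 4^k) as one k-mer packed at two bits per base. A minimal sketch of that decoding as a standalone function follows. The names `decode_kmer` and `kmer_count` are hypothetical, and this variant fills the buffer from the last position backwards, so consecutive codes come out in lexicographic order (A < C < G < T) rather than with the leftmost base changing fastest as in the answer.

```c
#include <assert.h>

/* Hypothetical helper (not from the answer above): decode the 2-bit packed
 * k-mer `code` into `out`, which must have room for k+1 chars.
 * Valid for 1 <= k <= 31, the same limit the answer states. */
void decode_kmer(unsigned long long code, int k, char *out)
{
    int i;

    assert(k >= 1 && k <= 31);
    /* The lowest 2 bits hold the last base, so fill right to left. */
    for (i = k - 1; i >= 0; --i, code >>= 2)
        out[i] = "ACGT"[code & 3];
    out[k] = '\0';
}

/* Number of distinct k-mers: 4^k, computed as 2^(2k). */
unsigned long long kmer_count(int k)
{
    assert(k >= 1 && k <= 31);
    return 1ULL << (2 * k);
}
```

Looping `code` from 0 to `kmer_count(k) - 1` and printing each decoded buffer would list every k-mer once. Keeping the packed integer around can also be handy as an array index, for instance when counting k-mer occurrences.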

Heng, why do I feel completely stupid each time you answer on biostar :-) ?


Writing such things is part of my routine work. If you were working on alignment etc. on a daily basis, you would certainly know these tiny tricks as well. You know a lot that I do not know well. :)


I hope you'll receive the 2012 Franklin Award. You have my vote :-)


Thank you very much, Pierre. I am not sure if I have contributed enough, but I really hope the biosciences will be open.


Thank you, Pierre. :-)

9.4 years ago

Sounds like homework, but I can't resist coding in C ;-)

Recursion is the way.

#include <stdio.h>
#include <stdlib.h>

#define SET_BASE(a) do { dna[pos]=(a); recursive(dna,pos+1,k); } while(0)

static void recursive(
    char* dna,
    const int pos,
    const int k)
    {
    if(pos==k)
        {
        fputs(dna,stdout);
        fputc('\n',stdout);
        return;
        }
    SET_BASE('A');
    SET_BASE('T');
    SET_BASE('G');
    SET_BASE('C');
    }

int main(int argc,char** argv)
    {
    char* dna;
    int k;
    if(argc!=2)
        {
        fprintf(stderr,"Usage %s k-size\n",argv[0]);
        return EXIT_FAILURE;
        }
    k=atoi(argv[1]);
    if(k<=0)
        {
        fprintf(stderr,"Bad k-size:%s\n",argv[1]);
        return EXIT_FAILURE;
        }
    dna=(char*)malloc(sizeof(char)*(k+1));
    if(dna==0)
        {
        fprintf(stderr,"Out of memory k=%s\n",argv[1]);
        return EXIT_FAILURE;
        }
    dna[k]=0;
    recursive(dna,0,k);
    free(dna);
    return EXIT_SUCCESS;
    }
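The recursion above fixes one position at a time and recurses over the remaining ones, so the last position changes fastest. The same enumeration can be sketched without recursion as an odometer over base indices. The helpers `first_kmer` and `next_kmer` are hypothetical names, and the A, T, G, C order is copied from the `SET_BASE` calls in the answer.

```c
#include <assert.h>

static const char BASES[] = "ATGC";   /* same order as the SET_BASE calls */

/* Hypothetical helper: set idx[] to all zeros and dna[] to the first
 * k-mer ("AA...A"). dna must have room for k+1 chars. */
void first_kmer(int *idx, char *dna, int k)
{
    int i;

    assert(k >= 1);
    for (i = 0; i < k; ++i) {
        idx[i] = 0;
        dna[i] = BASES[0];
    }
    dna[k] = '\0';
}

/* Hypothetical helper: advance the k digits in idx[] (each 0..3) to the
 * next combination, rightmost digit fastest, and mirror them into dna[].
 * Returns 1 if a new k-mer was produced, 0 once the counter has wrapped. */
int next_kmer(int *idx, char *dna, int k)
{
    int pos = k - 1;

    while (pos >= 0) {
        if (++idx[pos] < 4) {
            dna[pos] = BASES[idx[pos]];
            return 1;
        }
        idx[pos] = 0;             /* carry into the position to the left */
        dna[pos] = BASES[0];
        --pos;
    }
    return 0;                     /* wrapped: all 4^k k-mers were visited */
}
```

A caller would run `first_kmer` once, print `dna`, and then keep printing while `next_kmer(idx, dna, k)` returns 1. Since no 64-bit counter is involved, this form is not tied to the k<=31 limit of the bit-packing approach, although the output size still grows as 4^k.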