How To Generate DNA K-Mers Of A Given Length In C?
10.6 years ago
User 4000 ▴ 50

How can I generate all DNA k-mers of a given length? For example, if k=2, there should be 4^2=16 k-mers. I need to know how to do this in C.

c programming
Can I ask what the input is here? When I run this code, the only output is the usage message ("Usage main k-size"). How do I get the actual output? For example, if k=2, there should be 4^2=16 k-mers.

When I run the code below, it only prints the usage message ("Usage main k-size"), so what is the input here? I want the output for k=1, 2, and 3.

10.6 years ago
lh3 33k

For k<=31:

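#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int i, k;
    unsigned long long x, y;
    if (argc == 1) {
        fprintf(stderr, "Usage: %s <k>\n", argv[0]);
        return 1;
    }
    k = atoi(argv[1]);
    /* 2 bits per base, so 2*k must stay below 64 for the shift to be valid */
    if (k < 1 || k > 31) {
        fprintf(stderr, "k must be between 1 and 31\n");
        return 1;
    }
    /* each x in [0, 4^k) encodes one k-mer */
    for (x = 0; x < 1ULL<<(2*k); ++x) {
        /* decode x two bits at a time, lowest bits first */
        for (i = 0, y = x; i < k; ++i, y >>= 2)
            putchar("ACGT"[y&3]);
        putchar('\n');
    }
    return 0;
}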

Writing such things is part of my routine work. If you were working on alignment etc. on a daily basis, you would certainly know these tiny tricks as well. You know a lot that I do not know well. :)

Heng, why do I feel completely stupid each time you answer on biostar :-) ?

I hope you'll receive the 2012 Franklin Award. You have my vote :-)

Thank you very much, Pierre. I am not sure if I have contributed enough, but I really hope the biosciences will be open.

Thank you, Pierre. :-)

10.6 years ago

Sounds like homework, but I can't resist coding in C ;-)

Recursion is the way.

#include <stdio.h>
#include <stdlib.h>

/* put base a at position pos, then fill the remaining positions */
#define SET_BASE(a) do { dna[pos]=(a); recursive(dna,pos+1,k); } while(0)

static void recursive(
    char* dna,
    const int pos,
    const int k)
{
    /* all k positions are filled: print the k-mer */
    if(pos==k)
    {
        fputs(dna,stdout);
        fputc('\n',stdout);
        return;
    }
    SET_BASE('A');
    SET_BASE('T');
    SET_BASE('G');
    SET_BASE('C');
}

int main(int argc,char** argv)
{
    char* dna;
    int k;
    /* k is read from the single command-line argument */
    if(argc!=2)
    {
        fprintf(stderr,"Usage: %s k-size\n",argv[0]);
        return EXIT_FAILURE;
    }
    k=atoi(argv[1]);
    if(k<=0)
    {
        fprintf(stderr,"k-size must be a positive integer\n");
        return EXIT_FAILURE;
    }
    /* k bases plus the terminating NUL */
    dna=(char*)malloc(sizeof(char)*(k+1));
    if(dna==NULL)
    {
        fprintf(stderr,"Out of memory k=%s\n",argv[1]);
        return EXIT_FAILURE;
    }
    dna[k]=0;
    recursive(dna,0,k);
    free(dna);
    return EXIT_SUCCESS;
}