Modelling missing sequence-affinity data
6.2 years ago

Hi, I have generated experimental data in which each record is a DNA k-mer string of length 12 with an associated 'affinity' and tag count. Most k-mers are not represented in the affinity lists. Does anyone have suggestions for methods to model the affinities of the missing k-mers?

example data

Kmer ObservedCount Probability ExpectedCount Affinity SE
AGTGTAACGTGTC NA NA NA NA NA NA
CGTGTAACGTGTC 130 3.6320387884448403e-6 991.7716666891613 0.032857923466624035 0.0040755238163352366
GGTGTAACGTGTC NA NA NA NA NA NA

R sequence • 852 views

Why not programmatically generate every possible k-mer of size 12 and test each? I have Java code that does this for any size. Granted, that would be a huge number of k-mers.
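For a sense of scale, here is a minimal Python sketch of the enumeration idea. It is not the Java code mentioned above, and the helper names `all_kmers` and `missing_kmers` are hypothetical. It assumes a four-letter DNA alphabet (A, C, G, T) and generates k-mers lazily, so the full list never has to sit in memory. For k = 12 that is 4^12 = 16,777,216 sequences.

```python
from itertools import islice, product

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, no ambiguity codes


def all_kmers(k, alphabet=ALPHABET):
    """Lazily yield every k-mer over the alphabet in lexicographic order."""
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)


def missing_kmers(k, observed):
    """Lazily yield the k-mers of length k that are absent from `observed`."""
    observed = set(observed)
    return (kmer for kmer in all_kmers(k) if kmer not in observed)


# Small demonstration with k = 3 so the output stays readable.
print(list(islice(all_kmers(3), 5)))  # ['AAA', 'AAC', 'AAG', 'AAT', 'ACA']
print(4 ** 12)                        # 16777216 possible 12-mers
```

Passing the set of k-mers that do have a measured affinity to `missing_kmers` would list the ones still needing a modelled value. How to model those values is a separate question that this sketch does not address.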

