I'm going to be doing some non-linear regression (with a huge and messy residual function), and I am thinking of using PDL::Fit::LM (I had some trouble getting Levmar to install).
The explanatory variables for my fit are DNA sequences, which I'm feeding into a position-specific weight matrix. What's the easiest way to put a DNA sequence into a piddle? Since the function I'm working with is a big mess, performance is a consideration.
Since my weight matrix is constrained so that the weights at any given position sum to zero, my current plan is to represent each nucleotide as a vector of three elements:
A -> [1,0,0],
C -> [0,1,0],
G -> [0,0,1],
T -> [-1,-1,-1].
This way I can take a subsequence of my encoded sequence, multiply it element-wise with my weight matrix, and sum the result to get the score.
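
To make the plan concrete, here is a rough sketch of what I have in mind. It assumes the encoded sequence and the weight matrix are both stored with dims (3, N), so one row per position. The names `encode_dna` and `score_window` and the weights are made up for illustration, and I'm not claiming this is the fastest approach.

```perl
use strict;
use warnings;
use PDL;

# One row per nucleotide code, in the order A, C, G, T; dims are (3, 4).
my $lookup = pdl([ [1, 0, 0], [0, 1, 0], [0, 0, 1], [-1, -1, -1] ]);

# Hypothetical helper: turn a DNA string into a piddle with dims (3, length).
sub encode_dna {
    my ($seq) = @_;
    die "unexpected character in sequence\n" unless $seq =~ /\A[ACGTacgt]+\z/;
    (my $digits = uc $seq) =~ tr/ACGT/0123/;   # A=0, C=1, G=2, T=3
    my $idx = pdl([ split //, $digits ]);      # row index for each base
    return $lookup->dice_axis(1, $idx);        # pick the matching rows
}

# Hypothetical helper: score the window starting at position $start.
# $weights has dims (3, window length), one row of weights per position.
sub score_window {
    my ($encoded, $weights, $start) = @_;
    my $end = $start + $weights->dim(1) - 1;
    return sum($encoded->slice(":,$start:$end") * $weights);
}

my $x     = encode_dna('ACGT');
my $w     = pdl([ [1, 2, 3], [4, 5, 6] ]);     # made-up weights, 2 positions
my $score = score_window($x, $w, 1);           # scores the window "CG"
print "score = $score\n";
```

The lookup table keeps the per-base work inside PDL instead of building nested Perl arrays, and the whole sequence is encoded once, so each window is just a slice.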