I want to run Kernel Ridge Regression on a set of kernels I have computed, but I do not know how to do this in R. I found the constructKRRLearner function from the CVST package, but the manual is not clear at all, especially for a complete beginner in Machine Learning like me. The function needs an x and a y, but I have no idea what to pass there, as I only have a data frame holding the pairwise kernel, computed as the Kronecker product of the drug and protein kernels.
How can I do a Kernel Ridge Regression task in R?
Ideally I also want to visualize my data points and then illustrate the regression line on the plot! For instance like this:
MORE INFO ON MY DATASET
I have a drug-target interaction (DTI) data set. It comprises 100 drug compounds (rows) and 100 protein kinase targets (columns). There are some NaNs (missing values) in this data set. The values reflect how tightly a compound binds to a target.
I have drugs' SMILES and CHEMBL IDs.
I have the proteins' (targets') sequences and UNIPROT IDs.
For drugs [100 drugs]: I converted the drug SMILES to an SDFset, and then I computed the fingerprints for each drug using OpenBabel. Based on these fingerprints I computed Tanimoto kernels for all possible pairs of drugs (using the "fpSim" function), e.g. Drug 1 with Drug 2, 3, 4, ... 100, then Drug 2 with Drug 1, 3, 4, ... 100, and so on until Drug 99 with Drug 100. I named this BASE_DRUG_KERNELS
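To make the drug step concrete, here is a minimal base R sketch of a Tanimoto kernel on binary fingerprints. It does not use fpSim; the fingerprint matrix `fp` and the helper `tanimoto_kernel` are made-up names for illustration, and the bits are toy data, not real fingerprints.

```r
# Toy binary fingerprints: 4 hypothetical drugs x 6 bits (made-up data).
fp <- rbind(
  drug1 = c(1, 1, 0, 0, 1, 0),
  drug2 = c(1, 0, 0, 1, 1, 0),
  drug3 = c(0, 1, 1, 0, 0, 1),
  drug4 = c(1, 1, 0, 0, 1, 1)
)

# Tanimoto similarity for every pair of rows:
# (bits set in both) / (bits set in at least one).
tanimoto_kernel <- function(fp) {
  shared <- fp %*% t(fp)                    # bits set in both fingerprints
  n_on   <- rowSums(fp)                     # bits set in each fingerprint
  either <- outer(n_on, n_on, "+") - shared # bits set in at least one
  shared / either                           # NaN if both rows are all zero
}

K_drug <- tanimoto_kernel(fp)
print(round(K_drug, 2))
```

The result is a symmetric drug-by-drug matrix with ones on the diagonal, which is the shape BASE_DRUG_KERNELS should have (100 x 100 in my case).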
For proteins: I had the protein sequences, so I computed Smith-Waterman scores for all pairs of proteins; e.g. Protein 1 with Protein 2, 3, ... 100, then Protein 2 with Protein 1, 3, 4, ... 100, and so on until Protein 99 with Protein 100. I named this BASE_PROTEIN_KERNELS
Then I computed the Kronecker product of BASE_DRUG_KERNELS and BASE_PROTEIN_KERNELS, which gave me a 10,000 x 10,000 matrix (100,000,000 elements), with one row and one column per drug-protein pair. I named this matrix KRONECKER_PRODUCTS
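A small sketch of this step with base R's `kronecker()`, using toy 3 x 3 and 2 x 2 kernels instead of my real ones (all values below are made up). It also shows which (drug, protein) pair each row of the product corresponds to, assuming the drug kernel is the first argument, and how the affinity matrix would have to be flattened to match that order.

```r
# Toy kernels: 3 hypothetical drugs and 2 hypothetical proteins.
K_drug <- matrix(c(1.0, 0.5, 0.2,
                   0.5, 1.0, 0.3,
                   0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)
K_prot <- matrix(c(1.0, 0.4,
                   0.4, 1.0), nrow = 2, byrow = TRUE)

# Pairwise kernel: one row and one column per (drug, protein) pair.
K_pair <- kronecker(K_drug, K_prot)
print(dim(K_pair))  # 6 x 6

# Row ordering of K_pair: drug varies slowest, protein fastest.
pairs <- expand.grid(protein = 1:2, drug = 1:3)[, c("drug", "protein")]

# Spot check: similarity between (drug 1, protein 2) and (drug 3, protein 1)
# equals K_drug[1, 3] * K_prot[2, 1].
r <- which(pairs$drug == 1 & pairs$protein == 2)
s <- which(pairs$drug == 3 & pairs$protein == 1)
print(all.equal(K_pair[r, s], K_drug[1, 3] * K_prot[2, 1]))

# The response vector must follow the same pair ordering. For an affinity
# matrix Y with drugs in rows and proteins in columns (made-up values):
Y <- matrix(c(7.1, 5.3,
              6.0, 8.2,
              4.9, 6.6), nrow = 3, byrow = TRUE)
y <- as.vector(t(Y))  # drug 1 x all proteins, then drug 2, and so on
```

At full size, a dense 10,000 x 10,000 matrix of doubles takes about 800 MB of memory, so the real product is considerably heavier to work with than this toy one.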
I wish to run Kernel Ridge Regression on the matrix KRONECKER_PRODUCTS.
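To clarify what I mean by running Kernel Ridge Regression on a precomputed kernel, here is my understanding as a base R sketch; it does not use CVST. It solves (K + lambda * I) alpha = y on the training pairs and predicts with the test-by-train block of the kernel. Everything in it is an assumption for illustration: the data are synthetic, an RBF kernel stands in for KRONECKER_PRODUCTS, `krr_fit` and `krr_predict` are hypothetical helper names, and lambda = 0.1 is an arbitrary choice that would normally be tuned by cross-validation.

```r
set.seed(42)

# Synthetic 1-D data, only so that there is something to fit.
n <- 60
x <- sort(runif(n, 0, 6))
y <- sin(x) + rnorm(n, sd = 0.1)

# Stand-in for a precomputed n x n kernel matrix (RBF on x).
K <- exp(-as.matrix(dist(x))^2 / 2)

# Pretend a few responses are missing, like the NaNs in my affinity matrix.
y[c(5, 17)] <- NA
obs   <- !is.na(y)
train <- obs & (seq_len(n) %% 4 != 0)  # observed points used for fitting
test  <- obs & !train                  # observed points held out

# Fit: solve (K_train + lambda * I) alpha = y_train.
krr_fit <- function(K_train, y_train, lambda) {
  solve(K_train + lambda * diag(nrow(K_train)), y_train)
}

# Predict: kernel between new points (rows) and training points (columns).
krr_predict <- function(K_new_train, alpha) {
  as.vector(K_new_train %*% alpha)
}

alpha <- krr_fit(K[train, train], y[train], lambda = 0.1)
pred  <- krr_predict(K[test, train, drop = FALSE], alpha)

rmse <- sqrt(mean((pred - y[test])^2))
print(rmse)
```

With the real data, `K` would be KRONECKER_PRODUCTS and `y` the flattened affinity matrix in the same pair order, with the NaN pairs dropped from training. A scatter plot such as `plot(y[test], pred)` would then show predicted against measured values.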