Indexed on: 07 Jan '16. Published on: 07 Jan '16. Published in: Statistics - Machine Learning.
Kronecker product kernels provide the standard approach in the kernel methods literature for learning from pair-input data, where both the data points and the prediction tasks have their own feature representations. These methods allow simultaneous generalization to new tasks and to data unobserved in the training set, a setting known as zero-shot or zero-data learning. Such settings arise in numerous applications, including drug-target interaction prediction, collaborative filtering, and information retrieval. Efficient training algorithms based on the so-called vec trick, which exploits the special structure of the Kronecker product, are known for the case where the output matrix for the training set is fully observed, i.e., the correct output is available for every data point-task combination. In this work we generalize these results, proposing an efficient algorithm for sampled Kronecker product multiplication, in which only a subset of the full Kronecker product is computed. This allows us to derive a general framework for training Kronecker kernel methods; as specific examples, we implement Kronecker ridge regression and support vector machine algorithms. Experimental results demonstrate that the proposed approach yields accurate models while providing order-of-magnitude improvements in training and prediction time.
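The vec trick mentioned above rests on the identity (B ⊗ A) vec(X) = vec(A X Bᵀ), which lets one multiply by a Kronecker product without ever materializing it; in the sampled setting, only the observed (data point, task) entries are evaluated. The following NumPy sketch illustrates both ideas under assumed small shapes. It is only an illustration of the underlying identities, not the paper's optimized sampled algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3  # illustrative sizes: m data points, n tasks
A = rng.standard_normal((m, m))  # e.g. a data-point kernel factor
B = rng.standard_normal((n, n))  # e.g. a task kernel factor
X = rng.standard_normal((m, n))  # coefficients arranged as a matrix

# Naive approach: form the full Kronecker product explicitly,
# which costs O(m^2 n^2) memory and time.
naive = np.kron(B, A) @ X.flatten(order="F")  # vec(X) = column stacking

# Vec trick: (B kron A) vec(X) = vec(A X B^T),
# computed with two ordinary matrix products.
fast = (A @ X @ B.T).flatten(order="F")

assert np.allclose(naive, fast)

# Sampled variant: only a subset of (data point, task) pairs is observed.
# Entry (i, j) of A X B^T equals A[i, :] @ X @ B[j, :], so the sampled
# entries can be computed directly, without the full product.
rows = np.array([0, 2, 3])
cols = np.array([0, 1, 2])
sampled = np.einsum("kp,pq,kq->k", A[rows], X, B[cols])

assert np.allclose(sampled, (A @ X @ B.T)[rows, cols])
```

Note that the per-entry sampled computation shown here is still O(mn) per observed pair; the point of the paper's algorithm is to do better than this baseline while retaining exactness.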