# Downloads

## Shark-Library

The Shark machine learning library is a modular C++ library for the design and optimization of adaptive systems. The library provides methods for regression, classification, and density estimation, including various kinds of neural networks and kernel methods, as well as general algorithms for nonlinear optimization, in particular single- and multi-objective evolutionary algorithms and gradient-based methods.

The most recent version of Shark can be downloaded from here.

### Video lecture

Tobias Glasmachers' video lecture on Shark, given at the 2008 NIPS workshop on open source software.

### Paper describing the library

*Journal of Machine Learning Research*

## Maximum Likelihood Model Selection for SVMs

Adapting the hyperparameters of support vector machines (SVMs) is a challenging model selection problem, especially when flexible kernels are to be adapted and data are scarce. We present a coherent framework for regularized model selection of 1-norm soft margin SVMs for binary classification. We propose gradient ascent on a likelihood function of the hyperparameters. The likelihood function is based on logistic regression for robustly estimating the class-conditional probabilities and can be computed efficiently. Overfitting is an important issue in SVM model selection; in our framework it can be addressed by incorporating suitable prior distributions over the hyperparameters.

### Source code

Download example code for Shark version 2.

### Paper describing the algorithm

*IEEE Transactions on Pattern Analysis and Machine Intelligence*

## Training of Large-scale SVMs

Support vector machines are trained by solving constrained quadratic optimization problems. This is usually done with an iterative decomposition algorithm operating on a small working set of variables in every iteration. The training time strongly depends on the selection of these variables. We propose the maximum-gain working set selection algorithm for large-scale quadratic programming. It is based on the idea of greedily maximizing the progress in each single iteration. The algorithm takes second-order information from cached kernel matrix entries into account. We proved convergence to an optimal solution for a variant termed hybrid maximum-gain working set selection. This method has been compared empirically to the prominent most violating pair selection and the latest algorithm using second-order information. For large training sets our new selection scheme is significantly faster.

### Source code

The hybrid maximum-gain working set selection algorithm implemented in LIBSVM 2.71. It is also implemented in the Shark machine learning library.

### Supplementary material

This program checks all 2,834,352 cases considered for the proof of Lemma 8. However, in the meantime Tobias Glasmachers found a much more elegant proof:

*16th European Symposium on Artificial Neural Networks (ESANN 2008)*. Evere, Belgium: d-side publications, 2008

### Paper describing the algorithm

*Journal of Machine Learning Research*

## Online and Active Learning for SVMs

Iterative learning algorithms that approximate the solution of support vector machines (SVMs) have two potential advantages. First, they allow for online and active learning. Second, for large datasets computing the exact SVM solution may be too time consuming and an efficient approximation can be preferable. The powerful LASVM algorithm proposed by A. Bordes, S. Ertekin, J. Weston, and L. Bottou iteratively approaches the exact SVM solution using sequential minimal optimization (SMO). It allows for efficient online and active learning. This algorithm can be considerably improved in speed and accuracy by replacing the working set selection in the SMO steps. We incorporated a second-order working set selection strategy, which greedily aims at maximizing the progress in each single step.

### Supplementary material

The official LASVM website can be found here. The modified LASVM source code implementing our working set selection strategy can be downloaded here.

### References

*Journal of Machine Learning Research*

*Neural Computation*

## Gradient-based Optimization of Kernel-Target Alignment for Sequence Kernels Applied to Bacterial Gene Start Detection: Sequence Data

Extracting protein-encoding sequences from nucleotide sequences is an important task in bioinformatics. It requires detecting the locations at which coding regions start. These locations are called translation initiation sites (TIS).

The TIS2007 data is a reliable data set designed to evaluate machine learning algorithms for automatic TIS detection. It is based on E. coli genes from the EcoGene database. Only entries with a biochemically verified N-terminus were considered. The neighboring nucleotides were looked up in the GenBank file U00096.gbk. From the 732 positive examples, associated negative examples were created. For the negative examples, sequences centered around a potential start codon were considered and accepted if the codon is in-frame with one of the real start sites used as a positive case, its distance from a real TIS is less than 80 nucleotides, and no in-frame stop codon occurs in between. This selection generates a difficult benchmark because the negative TISs in the data set are both in-frame with and in the neighborhood of the real TIS. Finally, a set of 1248 negative examples was obtained. The length of each sequence is 50 nucleotides, with 32 located upstream of the potential start codon and 18 downstream including the start codon.

To minimize sampling effects, 50 different partitionings of the data into training and test sets were generated. Each training set contains 400 sequences plus the associated negatives; the corresponding test set contains 332 sequences plus the associated negatives. Each line in a data file starts with the label, 1 or -1 for positive and negative examples, respectively, followed by the nucleotide sequence as an ASCII string.

### Source code

Gradient-based optimization of the kernel-target alignment is available as a standard SVM model selection method in the Shark library. For an implementation of various oligo kernel functions for the Shark library click here.

### Supplementary material

All 50 partitionings of the data into training and test sets.

### References

*IEEE/ACM Transactions on Computational Biology and Bioinformatics*

*International Journal of Neural Systems*

## The Elitist Covariance Matrix Adaptation Evolution Strategy

The covariance matrix adaptation evolution strategy (CMA-ES) is one of the most powerful evolutionary algorithms for real-valued optimization. We propose an elitist version of this algorithm. For step size adaptation, the algorithm uses an improved 1/5-th success rule, which replaces the cumulative path length control of the standard CMA-ES.

We developed an incremental Cholesky update of the covariance matrix, replacing the computationally demanding and numerically involved decomposition of the covariance matrix. This rank-one update can replace the decomposition only when the update does not use an evolution path, and it reduces the computational effort by a factor of $n$, where $n$ is the problem dimension. The resulting $(1+1)$-Cholesky-CMA-ES is an elegant algorithm and perhaps the simplest evolution strategy with covariance matrix and step size adaptation.

### Source code

There is an example program for the $(1+1)$-CMA-ES in the Shark library.

### Papers describing the algorithm

*Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2006)*, pp. 453-460, ACM Press

*Machine Learning*

## Covariance Matrix Adaptation for Multi-objective Optimization

The covariance matrix adaptation evolution strategy (CMA-ES) is one of the most powerful evolutionary algorithms for real-valued single-objective optimization. We developed a variant of the CMA-ES for multi-objective optimization.

In the new multi-objective CMA-ES (MO-CMA-ES), a population of individuals that adapt their search strategies as in the elitist CMA-ES is maintained. These individuals are subject to multi-objective selection. The selection is based on non-dominated sorting, using either the crowding distance or the contributing hypervolume as the second sorting criterion. The MO-CMA-ES inherits important invariance properties from the original CMA-ES, in particular invariance under rotation of the search space.

### Source code

The MO-CMA-ES is part of the Shark library. There is a tutorial and an example program that can serve as starting points for using the MO-CMA-ES.

### Papers describing the algorithm

*Evolutionary Computation*

*Machine Learning*, 2009

## Rprop with Improved Weight-Backtracking

The Rprop (resilient backpropagation) algorithm proposed by Riedmiller and Braun is one of the best performing first-order learning methods. We introduce slight modifications of the algorithm that further improve its robustness and learning speed. We suggest using weight-backtracking for highly non-linear optimization problems. Weight-backtracking means partially retracting “unfavorable” previous update steps. Whether a parameter change was “unfavorable” or not is decided by a heuristic. We propose an improved weight-backtracking heuristic that considers both the evolution of the partial derivatives and the value of the objective function.

### Source code

The Rprop algorithms are implemented in the Shark library, which comes with several examples using Rprop for optimization.

### References

*Neurocomputing*

*Second International Symposium on Neural Computation (NC 2000)*, pp. 115-121, ICSC Academic Press, 2000

*International Journal of Computer Standards and Interfaces*

*Proceedings of the International Conference on Neural Networks*, pp. 586-591, IEEE Press, 1993