Shark is a fast, modular, feature-rich open-source C++ machine learning library. It provides methods for linear and nonlinear optimization, kernel-based learning algorithms, neural networks, and various other machine learning techniques (see the feature list below). It serves as a powerful toolbox for real-world applications as well as research. Shark depends on Boost and CMake. It is compatible with Windows, Solaris, Mac OS X, and Linux. Shark is licensed under the permissive GNU Lesser General Public License.

For an overview of the previous major release of Shark (2.0), see:

Christian Igel, Verena Heidrich-Meisner, and Tobias Glasmachers. Shark. Journal of Machine Learning Research 9, pp. 993-996, 2008.

Where to start

To begin, follow the installation instructions in the “Getting started” section. After installation, a guide to the different documentation pages is available.

Why Shark?

Speed and flexibility

Shark provides an excellent trade-off between flexibility and ease of use on the one hand, and computational efficiency on the other.

One for all

Shark offers numerous algorithms from various machine learning and computational intelligence domains in a way that allows them to be easily combined and extended.

Unique features

Shark comes with many powerful algorithms that are, to the best of our knowledge, not implemented in any other library, for example in the domains of model selection and training of binary and multi-class SVMs, or evolutionary single- and multi-objective optimization.

Selected features

Shark currently supports:

  • Supervised learning
    • Linear discriminant analysis (LDA), Fisher–LDA
    • Linear regression
    • Support vector machines (SVMs) for one-class, binary and true multi-category classification as well as regression, including fast variants for linear kernels
    • Feed-forward and recurrent multi-layer artificial neural networks
    • Radial basis function networks
    • Regularization networks as well as Gaussian processes for regression
    • Iterative nearest neighbor classification and regression
    • Decision trees and random forests
  • Unsupervised learning
    • Principal component analysis
    • Restricted Boltzmann machines (including many state-of-the-art learning algorithms)
    • Hierarchical clustering
    • Data structures for efficient distance-based clustering
  • Evolutionary algorithms
    • Single-objective optimization (e.g., CMA-ES)
    • Multi-objective optimization (in particular, highly efficient algorithms for computing as well as approximating the contributing hypervolume)
  • Basic linear algebra and optimization algorithms