Information Geometry PhD Course

PhD Course

Information Geometry in Learning and Optimization

Background

Principles of Information Geometry have been successfully applied in all major areas of machine learning, including supervised, unsupervised, and reinforcement learning, as well as in stochastic optimization. Information Geometry comes into play when we consider parametrized probabilistic models (e.g., in the context of stochastic behavioral policies, search distributions, stochastic neural networks, ...) and their adaptation. Technically speaking, in Information Geometry the space of probability distributions that can be represented by a parametrized probabilistic model is described as a manifold, on which the Fisher information metric defines a Riemannian structure. Through the geometry of the Riemannian manifold of distributions, optimization and statistics can be done directly on the space of distributions.
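For reference, the Fisher information metric mentioned above can be written out as follows. The notation is an assumption made for this sketch and does not come from the course material: p_\theta is the model density, \theta = (\theta_1, ..., \theta_n) are its real-valued parameters, and g_{ij} are the components of the metric in those coordinates.

```latex
g_{ij}(\theta)
  = \mathbb{E}_{x \sim p_\theta}\!\left[
      \frac{\partial \log p_\theta(x)}{\partial \theta_i}\,
      \frac{\partial \log p_\theta(x)}{\partial \theta_j}
    \right]
```

The matrix with entries g_{ij}(\theta) is the Fisher information matrix of the model at \theta.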

Information geometry was founded and pioneered by Shun'ichi Amari in the 1980s, with statistical learning as one of the first applications. Due to the nonlinear nature of the space of distributions, the steepest ascent direction for adapting a probability distribution parametrized by a set of real-valued parameters (e.g., the mean and the covariance of a Gaussian distribution) is not the ordinary gradient in Euclidean space, but the so-called natural gradient, defined with respect to the Riemannian structure of the space of distributions. The natural gradient is natural in the sense that it renders the adaptation invariant under reparametrization and changes of representation, and it is closely linked to the Kullback-Leibler divergence often used for quantifying the similarity of distributions.
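As a minimal sketch of these two points, consider a one-dimensional Gaussian parametrized by its mean and standard deviation, for which the Fisher information matrix has the closed form diag(1/sigma^2, 2/sigma^2). The function names, the example gradient, and the step values below are hypothetical and chosen only for illustration. The natural gradient is obtained by solving a linear system with the Fisher matrix, and the Kullback-Leibler divergence between two nearby Gaussians is compared with the quadratic form that the Fisher matrix defines.

```python
import numpy as np

def gaussian_fisher(sigma):
    """Fisher information matrix of N(mu, sigma^2) in the (mu, sigma) parametrization."""
    return np.array([[1.0 / sigma**2, 0.0],
                     [0.0, 2.0 / sigma**2]])

def natural_gradient(grad, sigma):
    """Natural gradient F^{-1} grad, computed by solving F x = grad."""
    return np.linalg.solve(gaussian_fisher(sigma), grad)

def kl_gaussian(mu0, sigma0, mu1, sigma1):
    """Closed-form KL( N(mu0, sigma0^2) || N(mu1, sigma1^2) )."""
    return (np.log(sigma1 / sigma0)
            + (sigma0**2 + (mu0 - mu1)**2) / (2.0 * sigma1**2)
            - 0.5)

mu, sigma = 0.0, 2.0

# Hypothetical Euclidean gradient of some objective w.r.t. (mu, sigma).
grad = np.array([1.0, 0.5])
nat_grad = natural_gradient(grad, sigma)

# For a small parameter step d, KL(p_theta || p_{theta + d}) is approximately
# 0.5 * d^T F d, which is the link between the Fisher metric and the KL divergence.
step = np.array([1e-3, -2e-3])
kl_exact = kl_gaussian(mu, sigma, mu + step[0], sigma + step[1])
kl_quadratic = 0.5 * step @ gaussian_fisher(sigma) @ step

print(nat_grad, kl_exact, kl_quadratic)
```

Compared with the Euclidean gradient, the natural gradient here rescales each component by the local variance, so the same gradient produces larger parameter steps for broad distributions than for narrow ones.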

The natural gradient for adapting probabilistic models has been successfully used in all major areas of machine learning, from supervised learning of neural networks through independent component analysis to reinforcement learning. In this PhD course there will, in particular, be lectures on supervised learning, reinforcement learning and stochastic optimization. Reinforcement learning refers to machine learning algorithms that improve their behavior based on interaction with the environment, whereas stochastic optimization refers to stochastic solutions to complex optimization problems for which we do not have an analytical description. In both stochastic optimization and reinforcement learning, (intermediate) solutions are best described by probability distributions. In reinforcement learning, we consider distributions over potential actions to be taken in a certain situation. In stochastic optimization, we consider the search distribution describing which candidate solution to probe next. Thus, both the learning and the optimization process are best described by an iterative update of probability distributions.
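The iterative update of a search distribution can be sketched as follows. This is a deliberately simplified illustration, not a specific algorithm taught in the course: the search distribution is an isotropic Gaussian with a fixed standard deviation, only its mean is adapted, and the objective `sphere`, the function name `natural_gradient_search`, and all parameter values are assumptions made for the example.

```python
import numpy as np

def sphere(x):
    """Toy objective to minimize; stands in for a black-box function."""
    return np.sum(x**2, axis=-1)

def natural_gradient_search(f, mean, sigma=0.5, n_samples=200,
                            lr=1.0, n_iters=50, seed=0):
    """Adapt the mean of the search distribution N(mean, sigma^2 I) to minimize E[f]."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    for _ in range(n_iters):
        # Probe candidate solutions drawn from the current search distribution.
        samples = mean + sigma * rng.standard_normal((n_samples, mean.size))
        values = f(samples)
        # Subtracting the average value is a baseline that reduces variance.
        advantages = values - values.mean()
        # The Euclidean gradient of E[f] w.r.t. the mean is E[f(x) (x - mean)] / sigma^2,
        # and the Fisher matrix for the mean is I / sigma^2, so the natural
        # gradient is E[f(x) (x - mean)], estimated here by Monte Carlo.
        nat_grad = (advantages[:, None] * (samples - mean)).mean(axis=0)
        mean = mean - lr * nat_grad  # descent step, since f is minimized
    return mean

start = np.array([3.0, -2.0])
final_mean = natural_gradient_search(sphere, start)
print(final_mean, sphere(final_mean))
```

Keeping sigma fixed makes the sketch short. Practical methods also adapt the spread of the search distribution, which this example does not attempt.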

Confirmed Speakers

Shun'ichi Amari, RIKEN Brain Science Institute
Nihat Ay, Max Planck Institute for Mathematics in the Sciences and Universität Leipzig
Nikolaus Hansen, Université Paris-Sud and Inria Saclay – Île-de-France
Jan Peters, Technische Universität Darmstadt and Max-Planck Institute for Intelligent Systems
Luigi Malagò, Shinshu University, Nagano
Aasa Feragen, University of Copenhagen
Francois Lauze, University of Copenhagen
Stefan Sommer, University of Copenhagen

Scientific content

The course will consist of five days of lectures and exercises. In addition, students will be expected to read a predefined set of scientific articles on information geometry prior to the course, and to write a report on information geometry and its potential use in their own research field after the course. The course is organized into three modules:

A crash course on Riemannian geometry and numerical tools for applications of Riemannian geometry
Introduction to Information Geometry and its role in Machine Learning and Stochastic Optimization
Applications of Information Geometry

Learning goals

After participating in this course, the participant should

Understand basic differential geometric concepts (manifolds, Riemannian metrics, geodesics, manifold statistics) well enough to apply them in their own research;
Be able to implement basic numerical tools for differential geometric computations;
Have a strong knowledge of information geometry and its role in machine learning and stochastic optimization;
Be able to apply information-theoretic approaches to machine learning and stochastic optimization in their own research;
Have a basic knowledge of existing applications of information geometry.

Organizers

Christian Igel, University of Copenhagen
Aasa Feragen, University of Copenhagen

The Image Group – University of Copenhagen
