Machine Learning and Data Mining Research at DIKU
The amount and complexity of available data are steadily increasing. To make use of this wealth of information, we need computing systems that turn data into knowledge. Machine learning is about developing the software that automatically analyses data to make predictions, categorisations, and recommendations. Machine learning algorithms are already an integral part of today's computing systems, for example in search engines, recommender systems, and biometric applications, and have reached superhuman performance in some domains. DIKU's research pushes these boundaries and aims at more robust, more efficient, and more widely applicable machine learning techniques.
State-of-the-art machine learning
![[hydroacoustic signal classification]](images/blubbern.png)
We apply machine learning algorithms for hydroacoustic signal classification to support the verification of the Comprehensive Nuclear-Test-Ban Treaty.
Machine learning is a branch of computer science and applied statistics covering software that improves its performance at a given task based on sample data or experience. The machine learning research at DIKU, the Department of Computer Science at the University of Copenhagen, is concerned with the design and analysis of adaptive systems for pattern recognition and behaviour generation.
Our fields of expertise are
- classification, regression, and density estimation techniques for data mining and modelling, pattern recognition, and time series prediction; and
- computational intelligence methods for non-linear optimisation including vector optimisation and multi-criteria decision making.
![[image segmentation]](images/sliceNew.png)
![[image segmentation]](images/sSN.png)
![[image registration]](images/registration.png)
Medical image analysis is a major application area [taken from Prasoon et al., 2012 (top), and Winter et al., 2008 (bottom)].
Successful real-world applications include the design of biometric and medical image processing systems, chemical processes and plants, advanced driver assistance systems, robot controllers, time series predictors for physical processes, systems for sports analytics, acoustic signal classification systems, automatic quality control for production lines, and sequence analysis in bioinformatics.
To build efficient and autonomous machine learning systems we draw inspiration from optimisation and computing theory as well as biological information processing. We analyse our algorithms theoretically and critically evaluate them on real-world problems. Increasing the robustness and improving scalability of self-adaptive, learning computer systems are cross-cutting issues in our work. The following sections highlight some of our research activities.
Efficient autonomous machine learning
We strive for computer systems that can deal autonomously and flexibly with our needs. They must work in scenarios that have not been fully specified and must be able to cope with unpredicted situations. Incomplete descriptions of application scenarios are inevitable because we need algorithms for domains where the designer's knowledge is not perfect, the solutions to particular problems are simply unknown, and/or the sheer complexity and variability of the task and the environment preclude a sufficiently accurate domain description. Although such systems are in general too complex to be designed manually, large amounts of data describing the task and the environment are often available or can be obtained automatically. To take proper advantage of this available information, we need to develop systems that self-adapt and automatically improve based on sample data – systems that learn.
Machine learning algorithms are already an integral part of today's computing systems, for example in internet search engines, recommender systems, and biometric applications. Highly specialised technical solutions exist that have reached superhuman performance in restricted task domains. Despite these successes, there are fundamental challenges that must be met if we are to develop more general learning systems.
First, present adaptive systems often lack autonomy and robustness. For example, they usually require a human expert to select the training examples, the learning method and its parameters, and an appropriate representation or structure for the learning system. This dependence on expert supervision is holding back the widespread deployment of adaptive software systems. We therefore work on algorithms that can handle large multimodal data sets, that actively select training patterns, and that autonomously build appropriate internal representations based on data from different sources. These representations should foster learning, generalisation, and communication.
Second, current adaptive systems succumb to scalability problems. On the one hand, the ever-growing amounts of data require highly efficient large-scale learning algorithms. On the other hand, learning and generalising from very few examples is also challenging. This scenario often occurs in man-machine interaction, for example in software personalisation or when generalisation from few database queries is required. We address the scaling problems by using task-specific architectures that incorporate both new concepts inspired by natural adaptive systems and recent methods from algorithmic engineering and mathematical programming.
Selected methods
We address all major learning paradigms: unsupervised, supervised, and reinforcement learning. These are closely connected. For instance, unsupervised learning can be used to find appropriate representations for supervised learning, and reliable supervised learning techniques are a prerequisite for successful reinforcement learning. Over the years, we have used, analysed, and refined a broad spectrum of machine learning techniques. Currently, our methodological research focuses on the following methods.
Supervised learning
![[multi-class support vector machine]](images/mcsvm.png)
Schema of multi-class support vector machine classification [taken from Dogan et al., 2011].
Support vector machines (SVMs) and other kernel-based algorithms are state-of-the-art in pattern recognition. They perform well in many applications, especially in classification tasks. The kernel trick allows for an easy handling of non-standard data (e.g., biological sequences, multimodal data) and permits a better mathematical analysis of the adaptive system because of the convenient structure of the hypothesis space. Developing and analysing kernel-based methods, in particular increasing autonomy and improving scalability of SVMs, is currently one of the most active branches of our research.
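As a minimal illustration of how a kernel can handle non-standard data such as biological sequences, the sketch below computes a k-mer spectrum kernel between two strings. It is a generic textbook construction chosen for illustration, not a kernel taken from the work described here. The function names and the default `k = 3` are assumptions.

```python
from collections import Counter
import math


def kmer_counts(seq: str, k: int) -> Counter:
    """Count all contiguous substrings of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of a and b.

    The feature space (one dimension per possible k-mer) is never
    built explicitly; only the k-mers that actually occur are touched.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(count * cb[kmer] for kmer, count in ca.items() if kmer in cb)


def normalised_spectrum_kernel(a: str, b: str, k: int = 3) -> float:
    """Cosine-normalised kernel value in [0, 1]; 0.0 if a sequence is shorter than k."""
    denom = math.sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom > 0 else 0.0
```

A matrix of such values over a set of sequences could then be passed to any kernel method that accepts a precomputed kernel matrix.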
Reinforcement learning
![[CMA-ES]](images/CMA_cartoon.gif)
Covariance matrix adaptation evolution strategy (CMA-ES).
The feedback in today's most challenging applications for adaptive systems is sparse, unspecific, and/or delayed, for instance in autonomous robotics or in man-machine interaction. Supervised learning cannot be used directly in such a case, but the task can be cast into a reinforcement learning (RL) problem. Reinforcement learning is learning from the consequences of interactions with an environment without being explicitly taught. Because the performance of standard RL techniques is falling short of expectations, we are developing new RL algorithms relying on gradient-based and evolutionary direct policy search.
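To make the idea of evolutionary direct policy search concrete, here is a deliberately small sketch under stated assumptions. The one-dimensional toy environment, the linear policy `a = w * s`, and all function names are hypothetical and chosen only for illustration. The search never sees a "correct" action. It only compares the summed return of whole episodes and keeps policy parameters that do at least as well as the current ones.

```python
import random


def episode_return(w: float, start: float = 1.0, steps: int = 10) -> float:
    """Run one episode with the linear policy a = w * s and return the total reward.

    Toy dynamics (assumed for illustration): s' = s + a, reward = -(s')**2.
    """
    s, total = start, 0.0
    for _ in range(steps):
        a = w * s          # the policy maps the state directly to an action
        s = s + a          # environment transition
        total += -(s * s)  # being far from the origin is penalised
    return total


def evolutionary_policy_search(iterations: int = 200, sigma: float = 0.1, seed: int = 0):
    """Hill-climb in policy-parameter space using only episode returns."""
    rng = random.Random(seed)
    w = 0.0
    best = episode_return(w)
    for _ in range(iterations):
        candidate = w + sigma * rng.gauss(0.0, 1.0)  # mutate the policy parameter
        ret = episode_return(candidate)              # evaluate by interaction
        if ret >= best:                              # keep it if it is no worse
            w, best = candidate, ret
    return w, best
```

In this toy problem the best policy is `w = -1`, which moves the state to the origin in one step. The search should approach it without ever being told so.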
![[Direct policy search for adaptation in intelligent driver assistance systems.]](images/das.png)
Direct policy search for adaptation in intelligent driver assistance systems [taken from Pellecchia et al., 2005].
Unsupervised learning
![[RBM]](images/RBM.png)
Markov random field for re-representing data.
We employ probabilistic generative models to learn and to describe probability distributions. Our research focuses on Markov random fields, in which the conditional independence structure between random variables is described by an undirected graph. We are particularly interested in models that allow for learning hierarchical representations of data in an unsupervised manner.
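For orientation, the standard textbook form of such a model can be written as follows. The notation is introduced here for illustration and is not taken from a specific publication. A Markov random field over variables $\mathbf{x}$ with undirected graph $G$ is commonly written as a normalised product of non-negative potentials $\psi_C$ over the cliques $\mathcal{C}(G)$. The restricted Boltzmann machine, with visible units $\mathbf{v}$, hidden units $\mathbf{h}$, weights $W$, and biases $\mathbf{b}$, $\mathbf{c}$, is one well-known special case.

```latex
% General Markov random field: product of clique potentials
p(\mathbf{x}) = \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \psi_C(\mathbf{x}_C),
\qquad
Z = \sum_{\mathbf{x}} \prod_{C \in \mathcal{C}(G)} \psi_C(\mathbf{x}_C)

% Restricted Boltzmann machine: bipartite graph between v and h
E(\mathbf{v}, \mathbf{h}) = -\mathbf{b}^{\top}\mathbf{v} - \mathbf{c}^{\top}\mathbf{h} - \mathbf{v}^{\top} W \mathbf{h},
\qquad
p(\mathbf{v}, \mathbf{h}) = \frac{1}{Z} \exp\bigl(-E(\mathbf{v}, \mathbf{h})\bigr)

% No hidden-hidden edges, so the hidden units are conditionally independent given v
p(\mathbf{h} \mid \mathbf{v}) = \prod_{j} p(h_j \mid \mathbf{v})
```

The last line shows how the graph encodes conditional independence: the bipartite structure has no edges between hidden units, so they decouple once the visible units are observed.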
Non-linear optimisation
![[MO-CMA-ES]](images/ContributedHypervolumeANew.png)
Contributing hypervolume of candidate solutions in multi-objective optimisation [Suttorp et al., 2006].
Learning is closely linked to optimisation. Thus, we are also working on general gradient-based and direct search and optimisation algorithms. This includes randomised methods, especially evolutionary algorithms (EAs), which are inspired by neo-Darwinian evolution theory. Efficient evolutionary optimisation can be achieved by an automatic adjustment of the search strategy. We are developing EAs with this ability, especially real-valued EAs that learn the metric underlying the problem at hand (e.g., dependencies between variables). Currently, we are working on variable-metric EAs for RL and for efficient vector (multi-objective) optimisation. The latter will become increasingly relevant for industrial and scientific applications in the future, because many problems are inherently multi-objective.
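The simplest form of automatic adjustment of the search strategy is step-size adaptation. The sketch below is a (1+1) evolution strategy with a 1/5-success-rule step-size update, a classic textbook scheme. It is much simpler than the variable-metric EAs mentioned above, because it adapts only a single global step size and learns no dependencies between variables. The test function `sphere`, the update factors, and all parameter defaults are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Simple convex test function with its minimum 0 at the origin."""
    return sum(v * v for v in x)


def one_plus_one_es(f, x0, sigma0=1.0, iterations=2000, seed=0):
    """(1+1)-ES minimising f, with the step size sigma adapted online.

    On success sigma grows by a factor of 1.5; on failure it shrinks by
    1.5 ** -0.25. The two updates cancel on average when about one in
    five mutations succeeds (the 1/5 success rule).
    """
    rng = random.Random(seed)
    x, fx, sigma = list(x0), f(x0), sigma0
    for _ in range(iterations):
        y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]  # isotropic mutation
        fy = f(y)
        if fy <= fx:             # offspring is at least as good: accept it
            x, fx = y, fy
            sigma *= 1.5         # steps were successful, try larger ones
        else:
            sigma *= 1.5 ** -0.25  # steps were too bold, shrink slightly
    return x, fx, sigma
```

Calling `one_plus_one_es(sphere, [3.0] * 5)` should drive the function value close to zero while the step size shrinks along with the distance to the optimum.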
Selected Publications
Please click here for a full list.
Contact
Christian Igel, Professor mso, Dr. habil.
University of Copenhagen
Universitetsparken 5
2100 København Ø
Email: igel@diku.dk
Office: HCØ - Building E, Office 4.0.2
Phone: (+45) 21849673
