# Machine Learning

## List of Papers (Total 1,507)

#### Majority vote ensembles of conformal predictors

We study majority vote ensembles of $$\varepsilon$$-valid conformal predictors (CP). We show that the prediction set $$\varGamma ^\eta$$ produced as the majority vote among the prediction sets $$\varGamma ^\varepsilon _i$$ of k independent $$\varepsilon$$-valid CPs is also valid, for some significance level $$\eta$$; we provide a method to compute $$\varepsilon$$ to achieve...
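As a rough illustration of the voting step described in this abstract, the sketch below combines several prediction sets by keeping every label that appears in more than half of them. The strict-majority threshold, the function name `majority_vote_set`, and the toy label sets are assumptions made for illustration; the truncated abstract does not give the paper's exact rule, and the relationship between $$\varepsilon$$ and $$\eta$$ is not modelled here.

```python
from collections import Counter


def majority_vote_set(prediction_sets):
    """Return the labels contained in more than half of the prediction sets.

    Assumption: a label needs a strict majority (count > k / 2) to be kept.
    """
    k = len(prediction_sets)
    # Count, for each label, how many of the k sets contain it.
    counts = Counter(label for s in prediction_sets for label in set(s))
    return {label for label, count in counts.items() if count > k / 2}


# Hypothetical prediction sets from k = 5 conformal predictors.
example_sets = [{"a", "b"}, {"a"}, {"a", "c"}, {"b", "c"}, {"b"}]
print(sorted(majority_vote_set(example_sets)))
```

Here labels `a` and `b` each appear in three of the five sets and are kept, while `c` appears in only two and is dropped.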

#### Discovering a taste for the unusual: exceptional models for preference mining

Exceptional preferences mining (EPM) is a crossover between two subfields of data mining: local pattern mining and preference learning. EPM can be seen as a local pattern mining task that finds subsets of observations where some preference relations between labels significantly deviate from the norm. It is a variant of subgroup discovery, with rankings of labels as the target...

#### Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation

Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and values of hyper-parameters (called a configuration) for producing the final predictive model, and (b) estimating the predictive performance of the final model. However, the cross-validated performance of the best...

#### Ultra-Strong Machine Learning: comprehensibility of programs learned with ILP

During the 1980s Michie defined Machine Learning in terms of two orthogonal axes of performance: predictive accuracy and comprehensibility of generated hypotheses. Since predictive accuracy was readily measurable and comprehensibility not so, later definitions in the 1990s, such as Mitchell’s, tended to use a one-dimensional approach to Machine Learning based solely on predictive...

#### Meta-Interpretive Learning from noisy images

Statistical machine learning is widely used in image classification. However, most techniques (1) require many images to achieve high accuracy and (2) do not provide support for reasoning below the level of classification, and so are unable to support secondary reasoning, such as the existence and position of light sources and other objects outside the image. This paper describes...

#### A scalable preference model for autonomous decision-making

Emerging domains such as smart electric grids require decisions to be made autonomously, based on the observed behaviors of large numbers of connected consumers. Existing approaches either lack the flexibility to capture nuanced, individualized preference profiles, or scale poorly with the size of the dataset. We propose a preference model that combines flexible Bayesian...

#### Learning efficient logic programs

When machine learning programs from data, we ideally want to learn efficient rather than inefficient programs. However, existing inductive logic programming (ILP) techniques cannot distinguish between the efficiencies of programs, such as permutation sort $$O(n!)$$ and merge sort $$O(n \log n)$$. To address this limitation, we introduce Metaopt, an ILP system which iteratively learns...

#### Metalearning and Algorithm Selection: progress, state of the art and introduction to the 2018 Special Issue

This article serves as an introduction to the Special Issue on Metalearning and Algorithm Selection. The introduction is divided into two parts. In the first section, we give an overview of how the field of metalearning has evolved in the last 1–2 decades and mention how some of the papers in this special issue fit in. In the second section, we discuss the contents of this...

#### Meta-QSAR: a large-scale application of meta-learning to drug design and discovery

We investigate the learning of quantitative structure activity relationships (QSARs) as a case-study of meta-learning. This application area is of the highest societal importance, as it is a key step in the development of new medicines. The standard QSAR learning problem is: given a target (usually a protein) and a set of chemical compounds (small molecules) with associated...

#### The online performance estimation framework: heterogeneous ensemble learning for data streams

Ensembles of classifiers are among the best performing classifiers available in many data mining applications, including the mining of data streams. Rather than training one classifier, multiple classifiers are trained, and their predictions are combined according to a given voting schedule. An important prerequisite for ensembles to be successful is that the individual models...

#### Emotion in reinforcement learning agents and robots: a survey

This article provides the first survey of computational models of emotion in reinforcement learning (RL) agents. The survey focuses on agent/robot emotions, and mostly ignores human user emotions. Emotions are recognized as functional in decision-making by influencing motivation and action selection. Therefore, computational emotion models are usually grounded in the agent’s...

#### Simple strategies for semi-supervised feature selection

What is the simplest thing you can do to solve a problem? In the context of semi-supervised feature selection, we tackle exactly this—how much we can gain from two simple classifier-independent strategies. If we have some binary labelled data and some unlabelled, we could assume the unlabelled data are all positives, or assume them all negatives. These minimalist, seemingly naive...

#### Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction

Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate...

#### A constrained $$\ell_1$$ minimization approach for estimating multiple sparse Gaussian or nonparanormal graphical models

Identifying context-specific entity networks from aggregated data is an important task, arising often in bioinformatics and neuroimaging applications. Computationally, this task can be formulated as jointly estimating multiple different, but related, sparse undirected graphical models (UGM) from aggregated samples across several contexts. Previous joint-UGM studies have mostly...

#### Projected estimators for robust semi-supervised classification

For semi-supervised techniques to be applied safely in practice we at least want methods to outperform their supervised counterparts. We study this question for classification using the well-known quadratic surrogate loss function. Unlike other approaches to semi-supervised learning, the procedure proposed in this work does not rely on assumptions that are not intrinsic to the...

#### The mechanism of additive composition

Additive composition (Foltz et al. in Discourse Process 15:285–307, 1998; Landauer and Dumais in Psychol Rev 104(2):211, 1997; Mitchell and Lapata in Cognit Sci 34(8):1388–1429, 2010) is a widely used method for computing meanings of phrases, which takes the average of vector representations of the constituent words. In this article, we prove an upper bound for the bias of...
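The averaging operation this abstract describes can be sketched in a few lines. The function name `additive_composition` and the two-dimensional toy word vectors are hypothetical, chosen only to show the arithmetic; the sketch does not touch the bias bound that the paper goes on to prove.

```python
def additive_composition(word_vectors):
    """Compose a phrase vector as the element-wise average of its word vectors."""
    n = len(word_vectors)
    dim = len(word_vectors[0])
    # Average each coordinate across the constituent word vectors.
    return [sum(vec[i] for vec in word_vectors) / n for i in range(dim)]


# Hypothetical 2-dimensional vectors for the two words of a phrase.
phrase_vector = additive_composition([[1.0, 0.0], [0.0, 1.0]])
print(phrase_vector)
```

With these toy inputs each coordinate of the phrase vector is the mean of the corresponding word coordinates, giving `[0.5, 0.5]`.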