Machine Learning

Machine Learning is an international forum for research on computational approaches to learning. The journal publishes articles reporting substantive results ...

List of Papers (Total 2,153)

Navigating explanatory multiverse through counterfactual path geometry

Apr 2025 | Sokol, Kacper, Small, Edward, Xuan, Yueqing

Counterfactual explanations are the de facto standard when tasked with interpreting decisions of (opaque) predictive models. Their generation is often subject to technical and domain-specific constraints that aim to maximise their real-life utility. In addition to considering desiderata pertaining to the counterfactual instance itself, guaranteeing existence of a viable path...

One transformer for all time series: representing and training with time-dependent heterogeneous tabular data

Apr 2025 | Luetto, Simone, Garuti, Fabrizio, Sangineto, Enver, et al.

There is a recent growing interest in applying Deep Learning techniques to tabular data in order to replicate the success of other Artificial Intelligence areas in this structured domain. Particularly interesting is the case in which tabular data have a time dependence, such as, for instance, financial transactions. However, the heterogeneity of the tabular values, in which...

Drop-in efficient self-attention approximation method

Apr 2025 | François, Damien, Saillot, Mathis, Klein, Jacques, et al.

Transformers have achieved state-of-the-art performance in most common tasks to which they have been applied. Those achievements are attributed to the Self-Attention mechanism at their core. Self-Attention is understood to map the relationship between tokens of any given sequence. This exhaustive mapping incurs massive costs in memory and inference time, as Self-Attention scales...

Cost-sensitive classification with cost uncertainty: do we need surrogate losses?

Apr 2025 | Komisarenko, Viacheslav, Kull, Meelis

In many binary classification applications, the costs of false positives and negatives are imbalanced. Furthermore, there is often uncertainty about the exact costs of these errors. A natural measure-of-interest to be minimised in such scenarios is the expected misclassification cost. We identify many situations where this measure has analytic gradients, and thus it can be used...

Temporal ensemble of multiple patterns’ instances for continuous prediction of events

Mar 2025 | Itzhak, Nevo, Jaroszewicz, Szymon, Moskovitch, Robert

In real-life data of various domains, such as traffic, meteorology, or healthcare data, events may have varying durations. Moreover, heterogeneous multivariate temporal data may consist of varying samplings, including regular sampling in different frequencies or irregular, as well as events data of different types, having fixed or varying duration. We propose to uniformly...

An unsupervised adversarial domain adaptation based on variational auto-encoder

Mar 2025 | Hassan Pour Zonoozi, Mahta, Seydi, Vahid, Deypir, Mahmood

Collecting a large amount of labeled data in machine learning is always challenging. Often, even with sufficient data, domain differences can cause a shift or bias in data distribution, affecting model performance during testing. Domain adaptation methods, especially adversarial techniques, are effective solutions for these challenges. The goal is to learn a classifier for an...

Enhanced route planning with calibrated uncertainty set

Mar 2025 | Tang, Lingxuan, Luo, Rui, Zhou, Zhixin, et al.

This paper investigates the application of probabilistic prediction methodologies in route planning within a road network context. Specifically, we introduce the Conformalized Quantile Regression for Graph Autoencoders (CQR-GAE), which leverages the conformal prediction technique to offer a coverage guarantee, thus improving the reliability and robustness of our predictions. By...

Correction to: Nettop: A lightweight-network of orthogonal-plane features for image recognition

Mar 2025 | Nguyen, Thanh Tuan, Nguyen, Thanh Phuong

Online dimensionality reduction through stacked generalization of spectral methods with deep networks

Mar 2025 | Alvarado-Pérez, Juan Carlos, Garcia, Miguel Angel, Puig, Domènec

Analyzing large volumes of high-dimensional data poses significant challenges. Dimensionality reduction aims to reveal the most prominent properties of data by embedding them into a low-dimensional representation. Spectral dimensionality reduction methods using kernel matrices have been proven to yield optimal results. Online versions of those methods are desirable to...

DatRel: a noise-tolerant data relocation approach for effective synthetic data generation in imbalanced classifiers

Mar 2025 | Sağlam, Fatih

Most machine learning algorithms tend to bias towards the majority class when a dataset exhibits a skewed distribution in the class variable. This is called the class imbalance problem and is frequently encountered in real-life applications. One of the most prevalent methods for addressing class imbalance is data resampling, which generates or removes samples to balance the...

Adaptive optimization for prediction with missing data

Mar 2025 | Bertsimas, Dimitris, Delarue, Arthur, Pauphilet, Jean

When training predictive models on data with missing entries, the most widely used and versatile approach is a pipeline technique where we first impute missing entries and then compute predictions. In this paper, we view prediction with missing data as a two-stage adaptive optimization problem and propose a new class of models, adaptive linear regression models, where the...

Neural RELAGGS

Mar 2025 | Pensel, Lukas, Kramer, Stefan

Multi-relational databases are the basis of most consolidated data collections in science and industry today. Most learning and mining algorithms, however, require data to be represented in a propositional form. While there is a variety of specialized machine learning algorithms that can operate directly on multi-relational data sets, propositionalization algorithms transform...

Correction to: Conformal load prediction with transductive graph autoencoders

Mar 2025 | Luo, Rui, Colombo, Nicolo

An end-to-end explainability framework for spatio-temporal predictive modeling

Mar 2025 | Altieri, Massimiliano, Ceci, Michelangelo, Corizzo, Roberto

The rising adoption of AI models in real-world applications characterized by sensor data creates an urgent need for inference explanation mechanisms to support domain experts in making informed decisions. Explainable AI (XAI) opens up a new opportunity to extend black-box deep learning models with such inference explanation capabilities. However, existing XAI approaches for...

Likelihood-ratio-based confidence intervals for neural networks

Mar 2025 | Sluijterman, Laurens, Cator, Eric, Heskes, Tom

This paper introduces a first implementation of a novel likelihood-ratio-based approach for constructing confidence intervals for neural networks. Our method, called DeepLR, offers several qualitative advantages: most notably, the ability to construct asymmetric intervals that expand in regions with a limited amount of data, and the inherent incorporation of factors such as the...

Generalized median of means principle for Bayesian inference

Mar 2025 | Minsker, Stanislav, Yao, Shunan

The topic of robustness is experiencing a resurgence of interest in the statistical and machine learning communities. In particular, robust algorithms making use of the so-called median of means estimator were shown to satisfy strong performance guarantees for many problems, including estimation of the mean, covariance structure as well as linear regression. In this work, we...

A survey on self-supervised methods for visual representation learning

Mar 2025 | Uelwer, Tobias, Robine, Jan, Wagner, Stefan Sylvius, et al.

Learning meaningful representations is at the heart of many tasks in the field of modern machine learning. Recently, a lot of methods were introduced that allow learning of image representations without supervision. These representations can then be used in downstream tasks like classification or object detection. The quality of these representations is close to supervised...

Pairwise learning to rank by neural networks revisited: reconstruction, theoretical analysis and practical performance

Mar 2025 | Köppel, Marius, Segner, Alexander, Wagener, Martin, et al.

We reevaluate the pairwise learning to rank approach based on neural nets, called RankNet, and present a theoretical analysis of its architecture. We show mathematically that the model can, under certain conditions, learn reflexive, antisymmetric, and transitive relations, enabling simplified training and improved performance. Experimental results on the LETOR MSLR-WEB10K, MQ2007...

Intramodal consistency in triplet-based cross-modal learning for image retrieval

Feb 2025 | Mallea, Mario, Ñanculef, Ricardo, Araya, Mauricio

Cross-modal retrieval requires building a common latent space that captures and correlates information from different data modalities, usually images and texts. Cross-modal training based on the triplet loss with hard negative mining is a state-of-the-art technique to address this problem. This paper shows that such an approach is not always effective in handling intra-modal...

Deep Errors-in-Variables using a diffusion model

Feb 2025 | Faller, Josua, Martin, Jörg, Elster, Clemens

Errors-in-Variables is the statistical concept used to explicitly model input variable errors caused, for example, by noise. While it has long been known in statistics that not accounting for such errors can produce a substantial bias, the vast majority of deep learning models have thus far neglected Errors-in-Variables approaches. Reasons for this include a significant increase...

TCR: topologically consistent reweighting for XGBoost in regression tasks

Feb 2025 | Zühlke, Monty-Maximilian, Kudenko, Daniel

Gradient boosted tree ensembles (GBTEs) such as XGBoost continue to outperform other machine learning models on tabular data. However, the plethora of adjustable hyperparameters can exacerbate optimisation, especially in regression tasks with no intuitive performance measures such as accuracy and confidence. Automated machine learning frameworks alleviate the hyperparameter...

On the usefulness of the fit-on-test view on evaluating calibration of classifiers

Feb 2025 | Kängsepp, Markus, Valk, Kaspar, Kull, Meelis

Calibrated uncertainty estimates are essential for classifiers used in safety-critical applications. If a classifier is uncalibrated, then there is a unique way to calibrate its uncertainty using the idealistic true calibration map corresponding to this classifier. Although the true calibration map is typically unknown in practice, it can be estimated with many post-hoc...

Testing exchangeability in the batch mode with e-values and Markov alternatives

Feb 2025 | Vovk, Vladimir

The topic of this paper is testing the assumption of exchangeability, which is the standard assumption in mainstream machine learning. The common approaches are online testing by betting (such as conformal testing) and the older batch testing using p-values (as in classical hypothesis testing). The approach of this paper is intermediate in that we are interested in batch testing...

Calibrated explanations for regression

Feb 2025 | Löfström, Tuwe, Löfström, Helena, Johansson, Ulf, et al.

Artificial Intelligence (AI) methods are an integral part of modern decision support systems. The best-performing predictive models used in AI-based decision support systems lack transparency. Explainable Artificial Intelligence (XAI) aims to create AI systems that can explain their rationale to human users. Local explanations in XAI can provide information about the causes of...

HorNets: learning from discrete and continuous signals with routing neural networks

Feb 2025 | Koloski, Boshko, Lavrač, Nada, Škrlj, Blaž

Construction of neural network architectures suitable for learning from both continuous and discrete tabular data is challenging, as contemporary high-dimensional tabular data sets are often characterized by a relatively small set of instances and the request for efficient learning. We propose HorNets (Horn Networks), a neural network architecture with state-of-the-art...
