NU author(s): Dr Toni Karvonen, Professor Chris Oates
This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
Gaussian process regression underpins countless academic and industrial applications of machine learning and statistics, with maximum likelihood estimation routinely used to select appropriate parameters for the covariance kernel. However, it remains an open problem to establish the circumstances in which maximum likelihood estimation is well-posed, that is, when the predictions of the regression model are insensitive to small perturbations of the data. This article identifies scenarios where the maximum likelihood estimator fails to be well-posed, in that the predictive distributions are not Lipschitz in the data with respect to the Hellinger distance. These failure cases occur in the noiseless data setting, for any Gaussian process with a stationary covariance function whose lengthscale parameter is estimated using maximum likelihood. Although the failure of maximum likelihood estimation is part of Gaussian process folklore, these rigorous theoretical results appear to be the first of their kind. The implication of these negative results is that well-posedness may need to be assessed post-hoc, on a case-by-case basis, when maximum likelihood estimation is used to train a Gaussian process model.
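For readers unfamiliar with the setting, the following is a minimal illustrative sketch of maximum likelihood estimation of a lengthscale from noiseless data. It is not taken from the article. The Gaussian kernel, the grid search, the small `jitter` term added for numerical stability, the toy data, and all function names are assumptions chosen for illustration only. The sketch shows how the estimate is computed and how one might compare estimates under a small perturbation of the data; it does not reproduce the article's theoretical results.

```python
import numpy as np


def gaussian_kernel(x, y, lengthscale):
    """Stationary Gaussian (squared exponential) kernel on 1-D inputs (assumed choice)."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def neg_log_likelihood(x, y, lengthscale, jitter=1e-10):
    """Negative log marginal likelihood of noiseless data under a zero-mean GP.

    The jitter is a numerical safeguard only (an assumption of this sketch),
    not an observation-noise model.
    """
    n = len(x)
    K = gaussian_kernel(x, x, lengthscale) + jitter * np.eye(n)
    L = np.linalg.cholesky(K)            # K = L L^T
    alpha = np.linalg.solve(L, y)        # alpha = L^{-1} y, so alpha @ alpha = y^T K^{-1} y
    half_log_det = np.sum(np.log(np.diag(L)))  # equals 0.5 * log det K
    return 0.5 * alpha @ alpha + half_log_det + 0.5 * n * np.log(2.0 * np.pi)


def ml_lengthscale(x, y, grid):
    """Pick the lengthscale on a grid that minimises the negative log-likelihood."""
    values = np.array([neg_log_likelihood(x, y, ell) for ell in grid])
    return grid[np.argmin(values)], values


# Toy noiseless data (hypothetical): five evaluations of a smooth function.
x = np.linspace(0.0, 1.0, 5)
y = np.sin(2.0 * np.pi * x)
grid = np.linspace(0.05, 1.0, 40)

ell_hat, nll = ml_lengthscale(x, y, grid)

# Re-estimate after a small perturbation of the data to inspect sensitivity.
y_perturbed = y + 1e-3 * np.cos(3.0 * x)
ell_hat_perturbed, _ = ml_lengthscale(x, y_perturbed, grid)

print("ML lengthscale (original data): ", ell_hat)
print("ML lengthscale (perturbed data):", ell_hat_perturbed)
```

A grid search is used here only because it is easy to read; in practice a gradient-based optimiser is a common alternative. Comparing the two estimates, and the resulting predictive distributions, is one informal way to carry out the kind of case-by-case, post-hoc check that the abstract says may be needed.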
Author(s): Karvonen T, Oates CJ
Publication type: Article
Publication status: Published
Journal: Journal of Machine Learning Research
Year: 2023
Volume: 24
Issue: 120
Pages: 1-47
Online publication date: 01/02/2023
Acceptance date: 01/02/2023
Date deposited: 28/06/2024
ISSN (electronic): 1533-7928
Publisher: Journal of Machine Learning Research (Online)
URL: http://jmlr.org/papers/v24/22-1153.html