Conformal prediction

Conformal prediction (CP) is a machine learning framework for uncertainty quantification that produces statistically valid prediction regions (prediction intervals) for any underlying point predictor (whether statistical, machine learning, or deep learning), assuming only exchangeability of the data. CP works by computing nonconformity scores on previously labeled data and using these to create prediction sets for a new (unlabeled) test data point. A transductive version of CP was first proposed in 1998 by Gammerman, Vovk, and Vapnik,^[1] and several variants of conformal prediction have since been developed with different computational complexities, formal guarantees, and practical applications.^[2]
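The step from nonconformity scores to a prediction set can be illustrated with a minimal Python sketch. It assumes the split (inductive) variant rather than the original transductive one, uses one minus the predicted class probability as the nonconformity score, and takes a user-chosen allowed error rate `epsilon`. The function names `conformal_threshold` and `prediction_set`, as well as all numbers, are hypothetical and chosen only for illustration.

```python
import numpy as np

def conformal_threshold(calibration_scores, epsilon):
    """Score threshold that a new exchangeable score exceeds with
    probability at most epsilon (split conformal, assumed variant)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    scores = np.sort(np.asarray(calibration_scores, dtype=float))
    n = len(scores)
    # Rank of the order statistic with the finite-sample (n + 1) correction.
    k = int(np.ceil((n + 1) * (1 - epsilon)))
    # Too few calibration points for this epsilon: every label is kept.
    return np.inf if k > n else scores[k - 1]

def prediction_set(class_probs, threshold):
    """Keep every label whose nonconformity score 1 - p is within the threshold."""
    return [label for label, p in class_probs.items() if 1.0 - p <= threshold]

# Made-up probabilities that some model assigned to the TRUE class
# of nine previously labeled (calibration) examples.
cal_true_class_probs = [0.9, 0.4, 0.7, 0.3, 0.8, 0.6, 0.2, 0.95, 0.5]
cal_scores = [1.0 - p for p in cal_true_class_probs]

# Made-up model output for one new, unlabeled test point.
test_probs = {"cat": 0.55, "dog": 0.40, "bird": 0.05}

set_20 = prediction_set(test_probs, conformal_threshold(cal_scores, 0.2))
set_40 = prediction_set(test_probs, conformal_threshold(cal_scores, 0.4))
print(set_20, set_40)
```

Allowing more errors (a larger `epsilon`) lowers the threshold, so the second set can only be the same size as the first or smaller.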

Conformal prediction requires a user-specified significance level at which the algorithm should produce its predictions. This significance level bounds the frequency of errors that the algorithm is allowed to make. For example, a significance level of 0.1 means that at most 10% of the algorithm's predictions may be erroneous, that is, fail to contain the true value. To meet this requirement, the output is a set prediction instead of the point prediction produced by standard supervised machine learning models. For classification tasks, this means that a prediction is not a single class, for example 'cat', but a set such as {'cat', 'dog'}. Depending on how good the underlying model is (how well it can discern between cats, dogs and other animals) and on the specified significance level, these sets can be smaller or larger. For regression tasks, the output is a prediction interval: a smaller significance level (fewer allowed errors) produces wider, less specific intervals, and conversely a larger significance level (more allowed errors) produces tighter intervals.^[3]^[4]^[5]^[6]
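The trade-off between significance level and interval width in regression can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: it uses the split (inductive) variant with absolute residuals as nonconformity scores, synthetic data from a made-up linear model, and an ordinary least-squares fit via `numpy.polyfit` as the underlying point predictor. The names `make_data`, `predict`, `half_width`, and `prediction_interval` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data from an assumed model: y = 2x + 1 + Gaussian noise."""
    x = rng.uniform(0.0, 10.0, size=n)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=n)
    return x, y

x_train, y_train = make_data(200)   # used to fit the point predictor
x_cal, y_cal = make_data(199)       # held out for calibration

# Underlying point predictor: least-squares line (highest degree first).
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def predict(x):
    return slope * x + intercept

# Nonconformity scores: absolute residuals on the calibration data, sorted.
residuals = np.sort(np.abs(y_cal - predict(x_cal)))

def half_width(epsilon):
    """Residual threshold for significance level epsilon (0 < epsilon < 1)."""
    n = len(residuals)
    k = int(np.ceil((n + 1) * (1 - epsilon)))
    return np.inf if k > n else residuals[k - 1]

def prediction_interval(x_new, epsilon):
    """Symmetric interval around the point prediction."""
    q = half_width(epsilon)
    centre = predict(x_new)
    return centre - q, centre + q

# Interval width for several significance levels.
widths = {eps: 2 * half_width(eps) for eps in (0.05, 0.1, 0.3)}
for eps, width in widths.items():
    print(f"significance level {eps}: interval width {width:.2f}")
```

Because the threshold is an order statistic of the calibration residuals, the width can never grow as the significance level increases, which matches the behaviour described above.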

^ Gammerman, Alexander; Vovk, Vladimir; Vapnik, Vladimir (1998). "Learning by transduction". Uncertainty in Artificial Intelligence. 14: 148–155.
^ Angelopoulos, Anastasios; Bates, Stephen (2021). "A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification". arXiv:2107.07511 [cs.LG].
^ Vovk, Vladimir; Gammerman, Alexander; Shafer, Glenn (2022). Algorithmic Learning in a Random World. New York: Springer. doi:10.1007/978-3-031-06649-8. ISBN 978-3-031-06648-1. S2CID 118783209.
^ Cite error: The named reference :2 was invoked but never defined (see the help page).
^ Norinder, Ulf; Carlsson, Lars; Boyer, Scott; Eklund, Martin (2014-06-23). "Introducing Conformal Prediction in Predictive Modeling. A Transparent and Flexible Alternative to Applicability Domain Determination". Journal of Chemical Information and Modeling. 54 (6): 1596–1603. doi:10.1021/ci5001168. ISSN 1549-9596. PMID 24797111.
^ Alvarsson, Jonathan; McShane, Staffan Arvidsson; Norinder, Ulf; Spjuth, Ola (2021-01-01). "Predicting With Confidence: Using Conformal Prediction in Drug Discovery". Journal of Pharmaceutical Sciences. 110 (1): 42–49. doi:10.1016/j.xphs.2020.09.055. ISSN 0022-3549. PMID 33075380. S2CID 224809705.
