Chain rule (probability)

In probability theory, the chain rule^[1] (also called the general product rule^[2]^[3]) describes how to calculate the probability of the intersection of, not necessarily independent, events or the joint distribution of random variables respectively, using conditional probabilities. This rule allows one to express a joint probability in terms of only conditional probabilities.^[4] The rule is notably used in the context of discrete stochastic processes and in applications, e.g. the study of Bayesian networks, which describe a probability distribution in terms of conditional probabilities.

^ Schilling, René L. (2021). Measure, Integral, Probability & Processes - Probab(ilistical)ly the Theoretical Minimum. Technische Universität Dresden, Germany. p. 136ff. ISBN 979-8-5991-0488-9.{{cite book}}: CS1 maint: location missing publisher (link)
^ Schum, David A. (1994). The Evidential Foundations of Probabilistic Reasoning. Northwestern University Press. p. 49. ISBN 978-0-8101-1821-8.
^ Klugh, Henry E. (2013). Statistics: The Essentials for Research (3rd ed.). Psychology Press. p. 149. ISBN 978-1-134-92862-0.
^ Virtue, Pat. "10-606: Mathematical Foundations for Machine Learning" (PDF).