Draft:Bayesian model comparison


Bayesian Model Comparison
File:BayesTheorem.png
Bayes' Theorem
Bayes' Theorem, a fundamental concept in Bayesian statistics, is used to update the probability of a hypothesis as more evidence or information becomes available.
Key Concepts
Bayes Factors: A ratio of marginal likelihoods that quantifies the evidence in favor of one model compared to another.
Marginal Likelihood (Evidence): The probability of the observed data given a model, integrated over all possible parameter values.
Posterior Model Probability: The probability that a model is true given the observed data and prior information.
Information Criteria: Computationally convenient criteria, such as BIC, AIC, and DIC, used to approximate Bayes factors or predictive performance.
Predictive Accuracy: How well a model predicts new or unseen data, often assessed through cross-validation or WAIC.
Model Averaging: Combining predictions from multiple models, weighted by their posterior probabilities or predictive performance.
Methods
Separate Estimation: Comparing models based on posterior predictive distributions, Bayes factors, and information criteria.
Comparative Estimation: Assessing the "distance" between posterior distributions using measures like Kullback-Leibler divergence.
Simultaneous Estimation: Exploring the model space using techniques like reversible jump MCMC (RJMCMC) or birth-and-death MCMC (BDMCMC).

Bayesian model comparison is the application of Bayesian statistics to compare how well competing statistical models describe observed data. It is used for diverse tasks such as variable selection in regression, determining the number of components in a mixture model, and choosing among parametric families. The goal of model comparison may be to select a single "best" model, or to improve estimation through model averaging, in which expectations computed under different models are weighted by the models' posterior probabilities.
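
The quantities involved can be written explicitly. In the notation below (introduced here purely for illustration), D denotes the observed data, M_k a candidate model with parameters θ_k, p(M_k) its prior probability, and Δ a quantity of interest:

  p(D \mid M_k) = \int p(D \mid \theta_k, M_k)\, p(\theta_k \mid M_k)\, \mathrm{d}\theta_k      (marginal likelihood, or evidence)

  p(M_k \mid D) = \frac{p(D \mid M_k)\, p(M_k)}{\sum_j p(D \mid M_j)\, p(M_j)}      (posterior model probability)

  B_{12} = \frac{p(D \mid M_1)}{p(D \mid M_2)}      (Bayes factor comparing M_1 and M_2)

  \operatorname{E}[\Delta \mid D] = \sum_k \operatorname{E}[\Delta \mid D, M_k]\, p(M_k \mid D)      (model-averaged expectation)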

Common methods for Bayesian model comparison include:

  • Separate estimation: Comparing models through posterior predictive distributions, Bayes factors, and approximations like BIC and DIC (a computational sketch follows this list).
  • Comparative estimation: Assessing the "distance" between posterior distributions using measures like Kullback-Leibler divergence.
  • Simultaneous estimation: Exploring the model space using techniques like RJMCMC or BDMCMC.
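
As an illustrative sketch only, the following Python snippet (with invented data and helper names, not taken from any particular source) compares an intercept-only model with a simple linear regression via BIC, then converts the BIC difference into an approximate Bayes factor:

import numpy as np

def gaussian_log_likelihood(y, y_hat):
    # Maximised Gaussian log-likelihood, plugging in the MLE of the noise variance.
    n = len(y)
    sigma2 = np.mean((y - y_hat) ** 2)
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)

def bic(log_lik, n_params, n_obs):
    # Bayesian information criterion: k * ln(n) - 2 * ln(L-hat); lower is better.
    return n_params * np.log(n_obs) - 2.0 * log_lik

# Simulated data for illustration: y depends linearly on x plus Gaussian noise.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=n)

# Model 1: intercept only (2 free parameters: intercept and noise variance).
bic1 = bic(gaussian_log_likelihood(y, np.full(n, y.mean())), n_params=2, n_obs=n)

# Model 2: linear regression on x (3 free parameters: intercept, slope, noise variance).
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
bic2 = bic(gaussian_log_likelihood(y, X @ beta), n_params=3, n_obs=n)

# Since BIC approximates -2 * log evidence, half the BIC difference approximates
# the log Bayes factor in favour of the linear model.
approx_log_bayes_factor = 0.5 * (bic1 - bic2)
print(f"BIC, intercept only: {bic1:.1f}")
print(f"BIC, linear model:   {bic2:.1f}")
print(f"Approximate log Bayes factor (linear vs intercept-only): {approx_log_bayes_factor:.1f}")

Because BIC is, up to a constant, an approximation to −2 log p(D | M), half the BIC difference approximates the log Bayes factor between two models; this is the sense in which information criteria act as approximations in the separate-estimation approach. For the comparative approach, one common choice of "distance" between posteriors is the Kullback-Leibler divergence, D_KL(p ‖ q) = ∫ p(θ) log[p(θ)/q(θ)] dθ.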