Multinomial distribution

In probability theory, the multinomial distribution is a generalization of the binomial distribution. For example, it models the probability of the counts for each side of a k-sided die rolled n times. For n independent trials, each of which results in exactly one of k categories, with each category having a fixed probability of occurring, the multinomial distribution gives the probability of any particular combination of counts across the categories.
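The die example can be made concrete with a short Python sketch that evaluates the probability mass function directly. The function name `multinomial_pmf` and the choice of a fair six-sided die are illustrative assumptions, not part of the article.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Probability of observing `counts` in sum(counts) independent trials."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient n! / (x_1! * ... * x_k!), kept as an exact integer.
    coeff = factorial(n)
    for x in counts:
        coeff //= factorial(x)
    # Multiply by p_1^x_1 * ... * p_k^x_k.
    return coeff * prod(p ** x for p, x in zip(probs, counts))

# Assumed example: a fair six-sided die rolled 6 times,
# probability that every face appears exactly once (6! / 6^6).
p_each_once = multinomial_pmf([1, 1, 1, 1, 1, 1], [1 / 6] * 6)
```

Here n is inferred from the counts, so the function only needs the count vector and the probability vector.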

When k is 2 and n is 1, the multinomial distribution is the Bernoulli distribution. When k is 2 and n is greater than 1, it is the binomial distribution. When k is greater than 2 and n is 1, it is the categorical distribution. The term "multinoulli" is sometimes used for the categorical distribution to emphasize this four-way relationship (so n determines the suffix, and k the prefix).
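These special cases can be read off the multinomial probability mass function. The following is a brief worked reduction, using the same symbols as the rest of the article.

```latex
% k = 2: writing x_2 = n - x_1 and p_2 = 1 - p_1 recovers the binomial PMF
\Pr(X_1 = x_1, X_2 = n - x_1)
  = \frac{n!}{x_1!\,(n - x_1)!}\, p_1^{x_1} (1 - p_1)^{n - x_1}
  = \binom{n}{x_1} p_1^{x_1} (1 - p_1)^{n - x_1}

% n = 1: exactly one x_j equals 1 and the rest are 0, giving the categorical PMF
\Pr(X_j = 1,\ X_i = 0 \text{ for } i \neq j)
  = \frac{1!}{1!\,0! \cdots 0!}\, p_j
  = p_j
```

Setting both k = 2 and n = 1 gives the Bernoulli case, with probability p₁ for one outcome and 1 − p₁ for the other.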

The Bernoulli distribution models the outcome of a single Bernoulli trial. In other words, it models whether flipping a (possibly biased) coin once results in a success (obtaining a head) or a failure (obtaining a tail). The binomial distribution generalizes this to the number of heads from performing n independent flips (Bernoulli trials) of the same coin. The multinomial distribution models the outcome of n independent trials, where the outcome of each trial has a categorical distribution, such as rolling a k-sided die n times.
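The description of a multinomial draw as n repeated categorical trials can be sketched by simulation. This is a minimal illustration using only the Python standard library; the helper name `sample_multinomial`, the seed, and the probability vector are assumptions made for the example.

```python
import random
from collections import Counter

def sample_multinomial(n, probs, rng):
    """Draw one multinomial count vector by simulating n categorical trials."""
    k = len(probs)
    # Each trial picks one category index 0..k-1 with the given probabilities.
    outcomes = rng.choices(range(k), weights=probs, k=n)
    tally = Counter(outcomes)
    # Report a count for every category, including those never observed.
    return [tally[i] for i in range(k)]

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
counts = sample_multinomial(10, [0.2, 0.3, 0.5], rng)
```

Whatever the individual counts turn out to be, they always add up to the number of trials, which is the constraint that defines the support of the distribution.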

Let k be a fixed positive integer. Mathematically, we have k possible mutually exclusive outcomes, with corresponding probabilities p₁, ..., p_k, and n independent trials. Since the k outcomes are mutually exclusive and exactly one must occur, we have p_i ≥ 0 for i = 1, ..., k and $\sum _{i=1}^{k}p_{i}=1$. Then, if the random variables X_i denote the number of times outcome i is observed over the n trials, the vector X = (X₁, ..., X_k) follows a multinomial distribution with parameters n and p, where p = (p₁, ..., p_k). While the trials are independent, the counts X_i are dependent because they must sum to n.
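The dependence among the counts can be checked against the variance and covariance expressions listed in the summary table. Because X₁ + ... + X_k always equals n, the variance of the sum must vanish, and a short calculation confirms that the formulas agree with this:

```latex
\operatorname{Var}\Bigl(\sum_{i=1}^{k} X_i\Bigr)
  = \sum_{i=1}^{k} n p_i (1 - p_i) \;-\; \sum_{i \neq j} n p_i p_j
  = n\Bigl(1 - \sum_{i=1}^{k} p_i^{2}\Bigr)
    - n\Bigl(\bigl(\textstyle\sum_{i=1}^{k} p_i\bigr)^{2} - \sum_{i=1}^{k} p_i^{2}\Bigr)
  = 0
```

The last step uses $\sum _{i=1}^{k}p_{i}=1$. The negative covariances exactly cancel the variances, reflecting that a larger count in one category leaves fewer trials for the others.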

Parameters	$n\in \{0,1,2,\ldots \}$ number of trials; $k>0$ number of mutually exclusive events (integer); $p_{1},\ldots ,p_{k}$ event probabilities, where $p_{1}+\dots +p_{k}=1$
Support	$\left\lbrace (x_{1},\dots ,x_{k})\,{\Big \vert }\,\sum _{i=1}^{k}x_{i}=n,x_{i}\geq 0\ (i=1,\dots ,k)\right\rbrace$
PMF	${\frac {n!}{x_{1}!\cdots x_{k}!}}p_{1}^{x_{1}}\cdots p_{k}^{x_{k}}$
Mean	$\operatorname {E} (X_{i})=np_{i}$
Variance	$\operatorname {Var} (X_{i})=np_{i}(1-p_{i})$; $\operatorname {Cov} (X_{i},X_{j})=-np_{i}p_{j}~~(i\neq j)$
Entropy	$-\log(n!)-n\sum _{i=1}^{k}p_{i}\log(p_{i})+\sum _{i=1}^{k}\sum _{x_{i}=0}^{n}{\binom {n}{x_{i}}}p_{i}^{x_{i}}(1-p_{i})^{n-x_{i}}\log(x_{i}!)$
MGF	${\biggl (}\sum _{i=1}^{k}p_{i}e^{t_{i}}{\biggr )}^{n}$
CF	$\left(\sum _{j=1}^{k}p_{j}e^{it_{j}}\right)^{n}$ where $i^{2}=-1$
PGF	${\biggl (}\sum _{i=1}^{k}p_{i}z_{i}{\biggr )}^{n}{\text{ for }}(z_{1},\ldots ,z_{k})\in \mathbb {C} ^{k}$
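For small n and k, the mean, variance, and covariance entries above can be checked by enumerating the whole support and summing exactly. The sketch below uses rational arithmetic from the Python standard library; the helper names `support` and `pmf` and the parameter values n = 4, p = (1/2, 1/3, 1/6) are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def support(n, k):
    """Yield every tuple of k non-negative integers that sums to n."""
    for head in product(range(n + 1), repeat=k - 1):
        rest = n - sum(head)
        if rest >= 0:
            yield head + (rest,)

def pmf(x, p):
    """Exact multinomial probability of count vector x under probabilities p."""
    coeff = factorial(sum(x))
    for xi in x:
        coeff //= factorial(xi)
    value = Fraction(coeff)
    for pi, xi in zip(p, x):
        value *= pi ** xi
    return value

# Assumed example parameters.
n = 4
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

points = list(support(n, len(p)))
weights = [pmf(x, p) for x in points]

# Probabilities over the support should add up to one.
total = sum(weights)
# Moments computed directly from their definitions.
mean = [sum(w * x[i] for x, w in zip(points, weights)) for i in range(len(p))]
var0 = sum(w * (x[0] - mean[0]) ** 2 for x, w in zip(points, weights))
cov01 = sum(w * (x[0] - mean[0]) * (x[1] - mean[1])
            for x, w in zip(points, weights))
```

With these parameters the enumerated values can be compared term by term with np_i, np_i(1 − p_i), and −np_i p_j. Full enumeration grows quickly with n and k, so this approach is only practical as a check on small cases.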