Multiple EM for Motif Elicitation

Multiple Expectation maximizations for Motif Elicitation (MEME) is a tool for discovering motifs in a group of related DNA or protein sequences.^[1]

A motif is a sequence pattern that occurs repeatedly in a group of related protein or DNA sequences and is often associated with some biological function. MEME represents motifs as position-dependent letter-probability matrices which describe the probability of each possible letter at each position in the pattern. Individual MEME motifs do not contain gaps. Patterns with variable-length gaps are split by MEME into two or more separate motifs.

MEME takes as input a group of DNA or protein sequences (the training set) and outputs as many motifs as requested. It uses statistical modeling techniques to automatically choose the best width, number of occurrences, and description for each motif.

MEME is the first of a collection of tools for analyzing motifs called the MEME suite.

^ Bailey T.L., Elkan C. Unsupervised Learning of Multiple Motifs In Biopolymers Using EM. Mach. Learn. 1995;21:51–80.

[Bailey_and_Elkan_1995-1] Bailey T.L., Elkan C. Unsupervised Learning of Multiple Motifs In Biopolymers Using EM. Mach. Learn. 1995;21:51–80.

[1]