AIXI

AIXI ['ai̯k͡siː] is a theoretical mathematical formalism for artificial general intelligence. It combines Solomonoff induction with sequential decision theory. AIXI was first proposed by Marcus Hutter in 2000^[1] and several results regarding AIXI are proved in Hutter's 2005 book Universal Artificial Intelligence.^[2]

AIXI is a reinforcement learning (RL) agent. It maximizes the expected total rewards received from the environment. Intuitively, it simultaneously considers every computable hypothesis (or environment). In each time step, it looks at every possible program and evaluates how many rewards that program generates depending on the next action taken. The promised rewards are then weighted by the subjective belief that this program constitutes the true environment. This belief is computed from the length of the program: longer programs are considered less likely, in line with Occam's razor. AIXI then selects the action that has the highest expected total reward in the weighted sum of all these programs.

^ Marcus Hutter (2000). A Theory of Universal Artificial Intelligence based on Algorithmic Complexity. arXiv:cs.AI/0004001. Bibcode:2000cs........4001H.
^ — (2005). Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Texts in Theoretical Computer Science an EATCS Series. Springer. doi:10.1007/b138233. ISBN 978-3-540-22139-5. S2CID 33352850.

[1] Marcus Hutter (2000). A Theory of Universal Artificial Intelligence based on Algorithmic Complexity. arXiv:cs.AI/0004001. Bibcode:2000cs........4001H.

[uaibook-2] — (2005). Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Texts in Theoretical Computer Science an EATCS Series. Springer. doi:10.1007/b138233. ISBN 978-3-540-22139-5. S2CID 33352850.

[1]

[2]