MuZero

MuZero is a computer program developed by artificial intelligence research company DeepMind to master games without knowing their rules.[1][2][3] Its release in 2019 included benchmarks of its performance in go, chess, shogi, and a standard suite of Atari games. The algorithm uses an approach similar to AlphaZero. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go (setting a new world record), and improved on the state of the art in mastering a suite of 57 Atari games (the Arcade Learning Environment), a visually-complex domain.

MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases. The trained algorithm used the same convolutional and residual architecture as AlphaZero, but with 20 percent fewer computation steps per node in the search tree.[4]

MuZero’s capacity to plan and learn effectively without explicit rules makes it a groundbreaking achievement in reinforcement learning and AI, pushing the boundaries of what is possible in artificial intelligence.

  1. ^ Wiggers, Kyle (20 November 2019). "DeepMind's MuZero teaches itself how to win at Atari, chess, shogi, and Go". VentureBeat. Retrieved 22 July 2020.
  2. ^ Friedel, Frederic. "MuZero figures out chess, rules and all". ChessBase GmbH. Retrieved 22 July 2020.
  3. ^ Rodriguez, Jesus. "DeepMind Unveils MuZero, a New Agent that Mastered Chess, Shogi, Atari and Go Without Knowing the Rules". KDnuggets. Retrieved 22 July 2020.
  4. ^ Schrittwieser, Julian; Antonoglou, Ioannis; Hubert, Thomas; Simonyan, Karen; Sifre, Laurent; Schmitt, Simon; Guez, Arthur; Lockhart, Edward; Hassabis, Demis; Graepel, Thore; Lillicrap, Timothy (2020). "Mastering Atari, Go, chess and shogi by planning with a learned model". Nature. 588 (7839): 604–609. arXiv:1911.08265. Bibcode:2020Natur.588..604S. doi:10.1038/s41586-020-03051-4. PMID 33361790. S2CID 208158225.