Peter Dayan

Royal Society 2018
Alma mater: University of Cambridge (BA)
    University of Edinburgh (PhD)
Known for: Q-learning
Spouse: Li Zhaoping
Awards: Rumelhart Prize (2012)
    The Brain Prize (2017)
Scientific career
Fields: Computational neuroscience
    Reinforcement learning
Institutions: Max Planck Institute for Biological Cybernetics
    University College London
    Massachusetts Institute of Technology
    Uber[1]
    University of Toronto
    Salk Institute
Thesis: Reinforcing connectionism: learning the statistical way (1991)
Doctoral advisor: David Willshaw
Website: www.kyb.tuebingen.mpg.de/person/95844/251691

Peter Dayan FRS is a British neuroscientist and computer scientist who is a director at the Max Planck Institute for Biological Cybernetics in Tübingen, Germany, alongside Ivan De Araujo. He is co-author of Theoretical Neuroscience,[2] an influential textbook on computational neuroscience. He is known for applying Bayesian methods from machine learning and artificial intelligence to understand neural function, and is particularly recognized for relating neurotransmitter levels to prediction errors and Bayesian uncertainties.[3] He is a pioneer of reinforcement learning (RL), where he helped develop the Q-learning algorithm, and has contributed to unsupervised learning, including the wake-sleep algorithm for neural networks and the Helmholtz machine.[4][5][6]

  1. ^ Ghahramani, Zoubin (2017). "Welcoming Peter Dayan to Uber AI Labs". uber.com. Archived from the original on 15 March 2018.
  2. ^ Dayan, Peter; Abbott, Laurence (2014). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. Cambridge: MIT Press. ISBN 9780262541855. OCLC 952504127.
  3. ^ Schultz, W.; Dayan, P.; Montague, P. R. (1997). "A Neural Substrate of Prediction and Reward" (PDF). Science. 275 (5306): 1593–1599. doi:10.1126/science.275.5306.1593. ISSN 0036-8075. PMID 9054347. S2CID 220093382.
  4. ^ Watkins, Christopher J. C. H.; Dayan, Peter (1992). "Q-learning". Machine Learning. 8 (3–4): 279–292. doi:10.1007/BF00992698. hdl:21.11116/0000-0002-D738-D. ISSN 0885-6125.
  5. ^ Dayan, Peter (1992). "The convergence of TD(λ) for general λ". Machine Learning. 8 (3–4): 341–362. doi:10.1023/A:1022632907294. hdl:21.11116/0000-0002-D743-0. ISSN 0885-6125.
  6. ^ Dayan, Peter; Hinton, Geoffrey E.; Neal, Radford M.; Zemel, Richard S. (1995). "The Helmholtz machine". Neural Computation. 7 (5): 889–904. doi:10.1162/neco.1995.7.5.889. hdl:21.11116/0000-0002-D6D3-E. PMID 7584891. S2CID 1890561.