Neocognitron


The neocognitron is a hierarchical, multilayered artificial neural network proposed by Kunihiko Fukushima in 1979.[1][2] It has been used for Japanese handwritten character recognition and other pattern recognition tasks, and served as the inspiration for convolutional neural networks.[3]

Previously in 1969, he published a similar architecture, but with hand-designed kernels inspired by convolutions in mammalian vision.[4] In 1979 he improved it to the neocognitron, which learns all convolutional kernels by unsupervised learning (in his terminology, "self-organized by 'learning without a teacher'").[2]

The neocognitron was inspired by the model proposed by Hubel & Wiesel in 1959. They found two types of cells in the visual primary cortex called simple cell and complex cell, and also proposed a cascading model of these two types of cells for use in pattern recognition tasks.[5][6]

The neocognitron is a natural extension of these cascading models. The neocognitron consists of multiple types of cells, the most important of which are called S-cells and C-cells.[7] The local features are extracted by S-cells, and these features' deformation, such as local shifts, are tolerated by C-cells. Local features in the input are integrated gradually and classified in the higher layers.[8] The idea of local feature integration is found in several other models, such as the Convolutional Neural Network model, the SIFT method, and the HoG method.

There are various kinds of neocognitron.[9] For example, some types of neocognitron can detect multiple patterns in the same input by using backward signals to achieve selective attention.[10]

  1. ^ Fukushima, Kunihiko (October 1979). "位置ずれに影響されないパターン認識機構の神経回路のモデル --- ネオコグニトロン ---" [Neural network model for a mechanism of pattern recognition unaffected by shift in position — Neocognitron —]. Trans. IECE (in Japanese). J62-A (10): 658–665.
  2. ^ a b Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position" (PDF). Biological Cybernetics. 36 (4): 193–202. doi:10.1007/BF00344251. PMID 7370364. S2CID 206775608. Archived (PDF) from the original on 3 June 2014. Retrieved 16 November 2013.
  3. ^ LeCun, Yann; Bengio, Yoshua; Hinton, Geoffrey (2015). "Deep learning" (PDF). Nature. 521 (7553): 436–444. Bibcode:2015Natur.521..436L. doi:10.1038/nature14539. PMID 26017442. S2CID 3074096.
  4. ^ Fukushima, Kunihiko (1969). "Visual Feature Extraction by a Multilayered Network of Analog Threshold Elements". IEEE Transactions on Systems Science and Cybernetics. 5 (4): 322–333. doi:10.1109/TSSC.1969.300225. ISSN 0536-1567.
  5. ^ David H. Hubel and Torsten N. Wiesel (2005). Brain and visual perception: the story of a 25-year collaboration. Oxford University Press US. p. 106. ISBN 978-0-19-517618-6.
  6. ^ Hubel, DH; Wiesel, TN (October 1959). "Receptive fields of single neurones in the cat's striate cortex". J. Physiol. 148 (3): 574–91. doi:10.1113/jphysiol.1959.sp006308. PMC 1363130. PMID 14403679.
  7. ^ Fukushima 1987, p. 83.
  8. ^ Fukushima 1987, p. 84.
  9. ^ Fukushima 2007.
  10. ^ Fukushima 1987, pp. 81, 85.