Capsule neural network

A capsule neural network (CapsNet) is a type of artificial neural network (ANN) designed to better model hierarchical relationships. The approach attempts to mimic biological neural organization more closely.^[1]

The idea is to add structures called "capsules" to a convolutional neural network (CNN) and to reuse the outputs of several of those capsules to form representations for higher-level capsules that are more stable under various perturbations.^[2] Each capsule outputs a vector consisting of the probability of an observation and a pose for that observation. This vector resembles the output produced, for example, when performing classification with localization in CNNs.
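As a rough illustration of such an output vector, the following minimal sketch builds one capsule's output as a presence probability followed by pose parameters. The function names, the random weights, the use of a sigmoid for the probability, and the 2-D pose are all assumptions made for illustration; they are not taken from a specific published architecture.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued logit to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def capsule_output(features, w_prob, w_pose):
    """Return [probability, pose...] for one hypothetical capsule.

    features: 1-D input feature vector (e.g. taken from a CNN layer)
    w_prob:   weights for the presence logit, shape (n_features,)
    w_pose:   weights for the pose, shape (n_features, pose_dim)
    """
    probability = sigmoid(features @ w_prob)  # how likely the entity is present
    pose = features @ w_pose                  # unconstrained pose parameters
    return np.concatenate(([probability], pose))

# Illustrative inputs only: random features and weights with a fixed seed.
rng = np.random.default_rng(0)
features = rng.normal(size=8)
w_prob = rng.normal(size=8)
w_pose = rng.normal(size=(8, 2))  # assumed 2-D pose, e.g. an (x, y) position

vec = capsule_output(features, w_prob, w_pose)
print("probability:", vec[0])
print("pose:", vec[1:])
```

The layout mirrors classification with localization, where a network emits a class score alongside bounding-box coordinates; here the score is the capsule's presence probability and the coordinates are replaced by the pose.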

Among other benefits, CapsNets address the "Picasso problem" in image recognition: images that contain all the right parts, but not in the correct spatial relationship (e.g., a "face" in which the positions of the mouth and one eye are switched). For image recognition, CapsNets exploit the fact that while viewpoint changes have nonlinear effects at the pixel level, they have linear effects at the part/object level.^[3] This can be compared to inverting the rendering of an object composed of multiple parts.^[4]
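The linearity at the part/object level can be sketched with 2-D homogeneous pose matrices. In the hedged example below, a viewpoint change is a single matrix multiplication applied to every part pose, and each part "votes" for the pose of the whole face by undoing its known offset. The helper names (`pose`, `object_votes`, `parts_agree`) and all numeric values are hypothetical, and the agreement check is a simplification for illustration rather than any particular routing algorithm.

```python
import numpy as np

def pose(angle, tx, ty):
    """3x3 homogeneous matrix: 2-D rotation by `angle`, then translation (tx, ty)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

# Fixed part-to-object relationships in the face's own frame (assumed values).
face_to_eye = pose(0.0, -1.0, 1.0)
face_to_mouth = pose(0.0, 0.0, -1.0)

def object_votes(part_poses):
    """Each part predicts the face pose by undoing its known offset."""
    return {
        "eye": part_poses["eye"] @ np.linalg.inv(face_to_eye),
        "mouth": part_poses["mouth"] @ np.linalg.inv(face_to_mouth),
    }

def parts_agree(part_poses, tol=1e-6):
    """True when both parts vote for the same face pose."""
    votes = object_votes(part_poses)
    return bool(np.allclose(votes["eye"], votes["mouth"], atol=tol))

# A correctly assembled face: part poses follow from the face pose.
face = pose(0.3, 2.0, 1.0)
parts = {"eye": face @ face_to_eye, "mouth": face @ face_to_mouth}

# A viewpoint change acts linearly: one matrix multiplies every pose.
viewpoint = pose(1.1, -4.0, 0.5)
moved = {name: viewpoint @ p for name, p in parts.items()}

# "Picasso" arrangement: the eye and mouth have traded places.
swapped = {"eye": parts["mouth"], "mouth": parts["eye"]}

print(parts_agree(parts))    # votes coincide for the original face
print(parts_agree(moved))    # votes still coincide after the viewpoint change
print(parts_agree(swapped))  # votes disagree when parts are misplaced
```

Because the part-to-object matrices never change, the votes stay consistent under any viewpoint, while a rearrangement of the parts breaks that consistency even though every part is still present.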

[1] Reference not defined in the source.
[2] Hinton, Geoffrey E.; Krizhevsky, Alex; Wang, Sida D. (2011-06-14). "Transforming Auto-Encoders". Artificial Neural Networks and Machine Learning – ICANN 2011. Lecture Notes in Computer Science. Vol. 6791. Springer, Berlin, Heidelberg. pp. 44–51. CiteSeerX 10.1.1.220.5099. doi:10.1007/978-3-642-21735-7_6. ISBN 9783642217340. S2CID 6138085.
[3] Srihari, Sargur. "Capsule Nets" (PDF). University at Buffalo. Retrieved 2017-12-07.
[4] Hinton, Geoffrey E.; Ghahramani, Zoubin; Teh, Yee Whye (2000). Solla, S. A.; Leen, T. K.; Müller, K. (eds.). Advances in Neural Information Processing Systems 12 (PDF). MIT Press. pp. 463–469.
