Statistical machine translation

Statistical machine translation (SMT) was a machine translation approach, that superseded the previous, rule-based approach because it required explicit description of each and every linguistic rule, which was costly, and which often did not generalize to other languages. Since 2003, the statistical approach itself has been gradually superseded by the deep learning-based neural machine translation.

The first ideas of statistical machine translation were introduced by Warren Weaver in 1949,^[1] including the ideas of applying Claude Shannon's information theory. Statistical machine translation was re-introduced in the late 1980s and early 1990s by researchers at IBM's Thomas J. Watson Research Center^[2]^[3]^[4]

^ W. Weaver (1955). Translation (1949). In: Machine Translation of Languages, MIT Press, Cambridge, MA.
^ P. Brown; John Cocke; S. Della Pietra; V. Della Pietra; Frederick Jelinek; Robert L. Mercer; P. Roossin (1988). "A statistical approach to language translation". Coling'88. 1. Association for Computational Linguistics: 71–76. Retrieved 22 March 2015.
^ P. Brown; John Cocke; S. Della Pietra; V. Della Pietra; Frederick Jelinek; John D. Lafferty; Robert L. Mercer; P. Roossin (1990). "A statistical approach to machine translation". Computational Linguistics. 16 (2). MIT Press: 79–85. Retrieved 22 March 2015.
^ P. Brown; S. Della Pietra; V. Della Pietra; R. Mercer (1993). "The mathematics of statistical machine translation: parameter estimation". Computational Linguistics. 19 (2). MIT Press: 263–311. Retrieved 22 March 2015.

[1] W. Weaver (1955). Translation (1949). In: Machine Translation of Languages, MIT Press, Cambridge, MA.

[brown88-2] P. Brown; John Cocke; S. Della Pietra; V. Della Pietra; Frederick Jelinek; Robert L. Mercer; P. Roossin (1988). "A statistical approach to language translation". Coling'88. 1. Association for Computational Linguistics: 71–76. Retrieved 22 March 2015.

[brown90-3] P. Brown; John Cocke; S. Della Pietra; V. Della Pietra; Frederick Jelinek; John D. Lafferty; Robert L. Mercer; P. Roossin (1990). "A statistical approach to machine translation". Computational Linguistics. 16 (2). MIT Press: 79–85. Retrieved 22 March 2015.

[brown93-4] P. Brown; S. Della Pietra; V. Della Pietra; R. Mercer (1993). "The mathematics of statistical machine translation: parameter estimation". Computational Linguistics. 19 (2). MIT Press: 263–311. Retrieved 22 March 2015.

[1]

[2]

[3]

[4]