GPT-2

Generative Pre-trained Transformer 2 (GPT-2)
Original author(s): OpenAI
Initial release: 14 February 2019
Repository: https://github.com/openai/gpt-2
Predecessor: GPT-1
Successor: GPT-3
Type: Large language model
License: MIT[1]
Website: openai.com/blog/gpt-2-1-5b-release/

Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. GPT-2 was pre-trained on a dataset of 8 million web pages.[2] It was partially released in February 2019, followed by the full release of the 1.5-billion-parameter model on 5 November 2019.[3][4][5]

GPT-2 was created as a "direct scale-up" of GPT-1,[6] with a ten-fold increase in both its parameter count and the size of its training dataset.[5] It is a general-purpose learner: its ability to perform various tasks was a consequence of its general ability to accurately predict the next item in a sequence.[2][7] This enabled it to translate texts, answer questions about a topic from a text, summarize passages from a larger text,[7] and generate text output on a level sometimes indistinguishable from that of humans, although it could become repetitive or nonsensical when generating long passages.[8] It was superseded by the GPT-3 and GPT-4 models, which, unlike GPT-2, are not open source.
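
As an illustration of how tasks reduce to next-item prediction, GPT-2's released weights can be run through the third-party Hugging Face transformers library (an assumption for this sketch; OpenAI's original release is a TensorFlow codebase). The "TL;DR:" suffix mirrors the prompt the GPT-2 paper used to elicit summaries from a model trained only to continue text:

    # Minimal sketch, assuming the Hugging Face "transformers" and "torch"
    # packages are installed; this is a community port, not OpenAI's code.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # smallest released variant
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # GPT-2 is trained only to predict the next token; a task such as
    # summarization is posed as a text prefix for the model to continue.
    prompt = ("The transformer architecture dispensed with recurrence and "
              "relied entirely on attention mechanisms. TL;DR:")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True,
                             top_k=40, pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))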

Like its predecessor GPT-1 and its successors GPT-3 and GPT-4, GPT-2 has a generative pre-trained transformer architecture: a deep neural network[6] that uses attention in place of older recurrence- and convolution-based architectures.[9][10] Attention mechanisms allow the model to selectively focus on the segments of input text that it predicts to be the most relevant.[11][12] This architecture allows for greatly increased parallelization and outperforms earlier RNN-, CNN-, and LSTM-based models on many benchmarks.[6]
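
The attention computation at the heart of this architecture can be sketched in a few lines of NumPy (a simplified, single-head version with toy inputs; the real model adds learned projection matrices, multiple attention heads, and dozens of stacked layers):

    import numpy as np

    def softmax(x):
        # Shift by the row maximum for numerical stability.
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def causal_attention(Q, K, V):
        # Q, K, V: (seq_len, d_k). Each output position is a weighted
        # average of the value vectors, weighted by how strongly that
        # position's query matches each key (the "selective focus" above).
        seq_len, d_k = Q.shape
        scores = Q @ K.T / np.sqrt(d_k)   # pairwise query-key similarity
        # Causal mask: no position may attend to later positions, which is
        # what allows the model to be trained to predict the next token.
        scores[np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)] = -np.inf
        return softmax(scores) @ V        # shape (seq_len, d_k)

    # Toy example: 3 positions, 4-dimensional queries/keys/values.
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
    print(causal_attention(Q, K, V).shape)   # prints (3, 4)

Because every position's attention scores come out of a single matrix product rather than a sequential loop over tokens, the whole sequence can be processed at once, which is the parallelization advantage over recurrent models noted above.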

  1. ^ "gpt-2". GitHub. Archived from the original on 11 March 2023. Retrieved 13 March 2023.
  2. ^ a b Radford, Alec; Wu, Jeffrey; Child, Rewon; Luan, David; Amodei, Dario; Sutskever, Ilya (2019). "Language Models are Unsupervised Multitask Learners". OpenAI.
  3. ^ Cite error: The named reference verge2 was invoked but never defined (see the help page).
  4. ^ "GPT-2: 1.5B Release". OpenAI. 5 November 2019.
  5. ^ a b "Better Language Models and Their Implications". OpenAI. 14 February 2019.
  6. ^ a b c Radford, Alec; Narasimhan, Karthik; Salimans, Tim; Sutskever, Ilya (11 June 2018). "Improving Language Understanding by Generative Pre-Training". OpenAI.
  7. ^ a b Cite error: The named reference badpaper was invoked but never defined (see the help page).
  8. ^ Hern, Alex (14 February 2019). "New AI fake text generator may be too dangerous to release, say creators". The Guardian.
  9. ^ Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Łukasz; Polosukhin, Illia (2017). "Attention Is All You Need". Advances in Neural Information Processing Systems. 30.
  10. ^ Olah, Chris; Carter, Shan (2016). "Attention and Augmented Recurrent Neural Networks". Distill. 1 (9).
  11. ^ Bahdanau, Dzmitry; Cho, Kyunghyun; Bengio, Yoshua (2014). "Neural Machine Translation by Jointly Learning to Align and Translate". arXiv:1409.0473.
  12. ^ Luong, Minh-Thang; Pham, Hieu; Manning, Christopher D. (2015). "Effective Approaches to Attention-based Neural Machine Translation". arXiv:1508.04025.