Generative Pretrained Transformer

1904labs Language Models

GPT-2 Language Model

Language models are used for a variety of tasks such as text generation, reading comprehension, translation, speech-to-text, information retrieval, and more. This app makes use of the text generation capability of the smallest version of OpenAI’s GPT-2 model.
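
As a concrete illustration of that capability, here is a minimal sketch of sampling text from the 124M model. It uses the Hugging Face transformers library as an assumption; this post does not specify the app's actual serving stack.

```python
# Minimal GPT-2 text generation sketch using Hugging Face transformers.
# Assumption: the app's real stack is not named in this post; this is
# only one common way to run the 124M ("small") model.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # the 124M model
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Once upon a time"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Sample a continuation. temperature and top_k correspond to the
# "Creativity" and "Originality" controls described below.
output = model.generate(
    input_ids,
    max_length=40,   # roughly one sentence, per the guidance in step 3
    do_sample=True,
    temperature=0.7,
    top_k=40,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```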

At 1904labs, we have fine-tuned the 124M GPT-2 model with different text corpuses to produce text in various styles.

Larger GPT-2 models produce more coherent text. We used the Large, 774M parameter model to generate our blog post. The X-Large, 1.5B parameter model can produce even more coherent text, such as this article on talking unicorns.

For performance and speed, we fine-tuned the small, 124M parameter model on several different text corpuses. What is remarkable about the models below is that each learned to generate text in the style of its training corpus. However, because of the smaller model size, the text is less coherent than output from the larger models.
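
As a sketch of what that fine-tuning looks like, the snippet below uses the gpt-2-simple library. This is an assumed choice for illustration; the library we actually used is linked at the end of this page.

```python
# Sketch of fine-tuning the 124M GPT-2 model on a custom text corpus
# with the gpt-2-simple library (an assumed choice for illustration;
# see the link at the end of this page for the library we used).
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="124M")  # fetch the pretrained small model

sess = gpt2.start_tf_sess()
gpt2.finetune(
    sess,
    dataset="corpus.txt",  # hypothetical path to a style corpus
    model_name="124M",
    steps=1000,            # illustrative training-step count
)

# Generate text in the learned style from a short prompt.
gpt2.generate(sess, prefix="Once upon a time", length=40)
```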

DISCLAIMER: These models were trained on text from millions of webpages on the internet and occasionally produce text that may be offensive or biased.

Let’s generate some text! Imagine the model is an author…
1. Select a style for the author:

2. Get the author started with a word or phrase:
Delete the recommended prompt for unconditional generation.
3. Length of story:
Valid range 1-1023.
40 is approximately a single sentence.
4. Creativity:
Recommended range 0.5-1.0.
Higher values are more creative.

Temperature (aka creativity): a float, generally between 0 and 1 (but it can be higher), that controls randomness by scaling the model's output (Boltzmann/softmax) distribution. Lower temperatures produce less random completions; as the temperature approaches zero, the model becomes deterministic and repetitive. Higher temperatures produce more random completions. See the sampling sketch after step 5.

5. Originality:
Recommended range 20-40.
Higher values are more original.

Originality corresponds to the model's top-k sampling parameter: the number of candidate words (tokens) considered at each step. A value of 1 means only the single most likely word is considered, resulting in deterministic completions, while 40 means the 40 most likely words are considered. 0 (the default) is a special value meaning no restriction. 40 is generally a good value; the sketch below shows how it interacts with temperature.
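
To make the Creativity and Originality knobs concrete, here is a small sketch (with hypothetical names, not the app's code) of how temperature and top-k shape sampling from the model's next-token distribution:

```python
# Toy illustration of temperature and top-k sampling over raw logits.
import numpy as np

def sample_next_token(logits, temperature=0.7, top_k=40):
    """Pick one token id from raw logits using temperature + top-k."""
    # Temperature ("creativity"): scale logits before the softmax.
    # As temperature -> 0 this approaches argmax (deterministic);
    # larger values flatten the distribution (more random).
    logits = logits / temperature

    # Top-k ("originality"): keep only the k highest-scoring tokens.
    # top_k = 0 means no restriction, matching the app's default.
    if top_k > 0:
        k = min(top_k, len(logits))
        cutoff = np.sort(logits)[-k]
        logits = np.where(logits < cutoff, -np.inf, logits)

    # Softmax over the surviving candidates, then sample one token.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return np.random.choice(len(logits), p=probs)

# Example with a toy 5-token vocabulary.
logits = np.array([2.0, 1.5, 0.3, -1.0, -2.0])
print(sample_next_token(logits, temperature=0.7, top_k=2))
```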

Text generation may take several minutes if the length (tokens) parameter is set to a high value.


We used this library to perform the fine-tuning, and this project as inspiration for building the webapp.