Dec 26, 2022
ChatGPT is taking over the internet. But do you know how it actually works? It's so clever.
🧵Here's an explanation using simple words:
First, the tech behind ChatGPT isn’t new. It’s based on "GPT-3.5", an upgraded version of GPT-3, which became available to the public many months ago. Yet not much was being built around it until now.
1/6
The first step to create ChatGPT was to adjust GPT-3.5 for conversations. They literally had human AI trainers provide conversations in which they played both sides—the user and an AI assistant.
In other words, they paid people to chit-chat.
2/6
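A rough sketch of how those trainer-written chats could become training examples. The `Turn` class, the `role: text` prompt layout, and the helper name are all made up for illustration; the thread doesn't say how the data was actually formatted.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str  # "user" or "assistant" (hypothetical labels)
    text: str


def to_training_examples(conversation):
    """Build one (prompt, target) pair per assistant turn.

    The prompt is everything said so far; the target is what the
    human trainer wrote for the assistant at that point.
    """
    examples = []
    history = []
    for turn in conversation:
        if turn.role == "assistant":
            prompt = "\n".join(f"{t.role}: {t.text}" for t in history) + "\nassistant:"
            examples.append((prompt, " " + turn.text))
        history.append(turn)
    return examples


# A trainer plays both sides of the conversation.
demo = [
    Turn("user", "What is the capital of France?"),
    Turn("assistant", "Paris."),
    Turn("user", "And of Italy?"),
    Turn("assistant", "Rome."),
]
examples = to_training_examples(demo)
for prompt, target in examples:
    print(repr(prompt), "->", repr(target))
```

Each pair is then an ordinary "predict the next text" example, so the base model can be fine-tuned on it the usual way.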
With a model capable of generating answers similar to humans, they needed a way to tell the AI what was a good/bad answer.
To solve that, they used humans (again) to rank randomly selected answers that ChatGPT was spitting out from best to worst.
3/6
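One common way to use a best-to-worst ranking is to break it into "this one beats that one" pairs. A minimal sketch, assuming that is how the rankings get consumed (the function name and sample answers are invented here):

```python
from itertools import combinations


def ranking_to_pairs(ranked_answers):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs.

    combinations() keeps the input order, so the first item of every
    pair is always the higher-ranked answer.
    """
    return list(combinations(ranked_answers, 2))


pairs = ranking_to_pairs(["answer A", "answer B", "answer C"])
for preferred, rejected in pairs:
    print(f"{preferred!r} beats {rejected!r}")
```

A ranking of n answers gives n*(n-1)/2 comparisons, so a single ranking task produces a lot of training signal.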
The rankings were then used to train a second model they called the "reward" model.
So there are two models:
1. a model that can answer questions like a human.
2. a model that can say how good/bad the answer was.
The last step is brilliant.
4/6
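To make the "reward" model less abstract, here is a toy version: a linear scorer trained so that human-preferred answers get higher scores than rejected ones. Everything here is an assumption for illustration: the two hand-made features, the pairwise logistic loss, and the hyperparameters. The real reward model is a large neural network reading the text itself.

```python
import math


def score(weights, features):
    """Reward model output: a single number, higher means better."""
    return sum(w * x for w, x in zip(weights, features))


def pairwise_loss(weights, preferred, rejected):
    """-log(sigmoid(score(preferred) - score(rejected)))."""
    diff = score(weights, preferred) - score(weights, rejected)
    return math.log(1.0 + math.exp(-diff))


def train_reward_model(pairs, n_features, lr=0.1, epochs=200):
    """Plain gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            diff = score(weights, preferred) - score(weights, rejected)
            # d(loss)/d(diff) = -1 / (1 + exp(diff))
            g = -1.0 / (1.0 + math.exp(diff))
            for i in range(n_features):
                weights[i] -= lr * g * (preferred[i] - rejected[i])
    return weights


# Hypothetical features per answer: [helpfulness, rudeness].
# Each pair is (answer humans preferred, answer humans rejected).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.1], [0.3, 0.6]),
]
weights = train_reward_model(pairs, n_features=2)
print("learned weights:", weights)
```

After training, the scorer rewards the first feature and penalizes the second, which is exactly the preference baked into the human comparisons.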
The last step was to train the model with reinforcement learning, which is similar to dog training: a reward is given for "good" behavior.
So what was the "reward" here? Spoiler: Not a cookie. They used the reward model's score as the reward to train the model.
5/6
The recipe:
1. Have a model generate a human-like answer.
2. Have a model score that answer.
3. Have the model learn from the score and re-adjust the answer until it gets an A+.
4. Repeat a million times until accurate.
*chef kiss*
6/6
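The recipe above can be sketched as a tiny loop. This is a toy: the "model" is just a softmax over three canned answers, the "reward model" is a hard-coded lookup table, and the update rule is plain REINFORCE with a running-average baseline. OpenAI describes using PPO for this step, which is more involved; all names and numbers below are invented for illustration.

```python
import math
import random

ANSWERS = ["helpful answer", "vague answer", "rude answer"]
# Stand-in for the reward model's score (hypothetical values).
REWARD = {"helpful answer": 1.0, "vague answer": 0.3, "rude answer": -1.0}


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(steps=3000, lr=0.1, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(ANSWERS)  # start with no preference
    baseline = 0.0                 # running average of rewards seen
    for _ in range(steps):
        probs = softmax(logits)
        # 1. Have the model generate an answer.
        i = rng.choices(range(len(ANSWERS)), weights=probs)[0]
        # 2. Have a model score that answer.
        reward = REWARD[ANSWERS[i]]
        # 3. Learn from the score: raise the chance of answers that
        #    beat the average, lower it for those that fall short.
        advantage = reward - baseline
        for j in range(len(logits)):
            grad = (1.0 if j == i else 0.0) - probs[j]
            logits[j] += lr * advantage * grad
        baseline += 0.05 * (reward - baseline)
        # 4. Repeat.
    return softmax(logits)


probs = train_policy()
best = ANSWERS[probs.index(max(probs))]
print("most likely answer after training:", best)
```

The point is the loop structure, not the scale: generate, score, nudge, repeat.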
If you want to stay up to date with the latest breakthroughs in AI check out our weekly summary. It's read by 25,000+ ML engineers and researchers.
alphasignal.ai
Fun Facts:
- Every chat costs single-digit cents.
- It will be monetized soon.
- ChatGPT was trained on Azure.
- @sama @elonmusk @ilyasut @gdb @woj_zaremba @johnschulman2 are the founders of @OpenAI.
- There are no main authors behind ChatGPT.
My friend Louis @Whats_AI did a cool video about it, check it out:
youtube.com
