Introducing `llama-405b-to-8b` โ๏ธ
Get the quality of Llama 3.1 405B, at a fraction of the cost and latency.
Give one example of your task, and 405B will teach 8B (~30x cheaper!!) how to do the task perfectly.
And it's open-source: github.com
Get the quality of Llama 3.1 405B, at a fraction of the cost and latency.
Give one example of your task, and 405B will teach 8B (~30x cheaper!!) how to do the task perfectly.
And it's open-source: github.com
This was made in partnership with @OctoAICloud โ particularly Ben Hamm, who adapted my existing prompt optimization tools to take advantage of the new Llama 3.1 models.
This approach was inspired by this tweet that went viral months ago.
I discovered that if you prompt Haiku w/ Opus-generated examples, it can match Opus' quality.
Now, we have even better 'teacher' models than Opus, and cheaper 'student' models than Haiku.
x.com
I discovered that if you prompt Haiku w/ Opus-generated examples, it can match Opus' quality.
Now, we have even better 'teacher' models than Opus, and cheaper 'student' models than Haiku.
x.com
In production, Llama 3.1 405B-level AI quality at a low cost, with near-instant results, is a game changer.
This notebook makes it possible for anyone to implement this quickly.
So how does it work?
This notebook makes it possible for anyone to implement this quickly.
So how does it work?
You give the AI a description of your task, along with one input/output example. That's it.
From there, it will generate seven other great, diverse examples that are similar in structure to your example.
It'll then use those + the task description to generate a system prompt.
From there, it will generate seven other great, diverse examples that are similar in structure to your example.
It'll then use those + the task description to generate a system prompt.
If you're building w/ LLMs, you NEED to try this.
If you'd like to try it or contribute, check out the Github repo, and check out @OctoAICloud if you're looking for scalable/fast/reliable inference for your models!
github.com
github.com
Loading suggestions...