Speculative Sampling: Accelerating Text Generation https://t.co/2IE4qmIfhQ DeepMind achieves 2-2.5x faster token sampling on a 70B parameter mod