Santiago

@svpino

6 tweets · Jan 16, 2023
2022 saw tremendous interest in large-scale models, and 2023 will bring more of the same.
But 99% of us can't train or fine-tune these models!
For example, the company behind Stable Diffusion runs a cluster of 4,000 NVIDIA A100 GPUs, with $50M in operating costs!
A solution:
Colossal-AI is an open-source deep learning system that lets you train and fine-tune large models using a single consumer-grade GPU.
They just released version 0.2.0, which includes automatic parallelism and reduces hardware costs by up to 46 times!
Here are three highlights:
1. Train or fine-tune Stable Diffusion 2.0 with up to a 5.6x reduction in GPU memory and a 46x reduction in hardware costs.
2. Run stand-alone inference on the 175B-parameter BLOOM model with a 4x reduction in GPU memory consumption and hardware costs cut by over 10x.
3. A single line of code searches for the best parallelism strategy, making distributed training easier, with native support for popular model libraries like Hugging Face and Timm.
And you can run everything on a consumer-grade GPU, like a PC with an RTX 2070 or 3050!
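To see why a 4x memory reduction matters for BLOOM, here's a quick back-of-the-envelope sketch. Note this is my own arithmetic, not a figure from the release: it assumes fp16 weights at 2 bytes per parameter and ignores activations and optimizer state, so it's a lower bound for inference memory.

```python
# Rough GPU memory math for the 175B-parameter BLOOM model.
# Assumption: fp16 weights (2 bytes per parameter); activation and
# optimizer memory are ignored, so this is a lower bound for inference.

def fp16_weight_memory_gb(num_params: int) -> float:
    """Gigabytes needed just to hold the model weights in fp16."""
    bytes_per_param = 2  # fp16 = 16 bits = 2 bytes
    return num_params * bytes_per_param / 1024**3

bloom_params = 175 * 10**9
baseline_gb = fp16_weight_memory_gb(bloom_params)
reduced_gb = baseline_gb / 4  # the claimed 4x memory reduction

print(f"fp16 weights alone: {baseline_gb:.0f} GB")
print(f"after 4x reduction: {reduced_gb:.0f} GB")
```

Even after the 4x cut, the weights alone still need roughly 80 GB, which is why stand-alone inference on this scale of model was previously out of reach for most setups.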
Open source code: github.com
They have a Slack channel where you can ask any questions: colossalaiworkspace.slack.com
Here is a blog post covering their new changes: hpc-ai.tech
Thanks to the team behind Colossal-AI for their work and partnership!
Great end-to-end tutorial here.
