reading the GPTQ paper, about post-training quantization for GPTs https://t.co/JGpdhWOCtG it can quantize 175B-parameter models in ~4 GPU hours down to 3 or 4 bits per weight