The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
- https://arxiv.org/abs/2402.17764
- Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei
- [ ] Based on BitNet (Wang2023BitNet).
In BitNet b1.58, every single parameter (weight) of the LLM is ternary, taking a value in {-1, 0, 1}. It matches a full-precision (i.e., FP16 or BF16) Transformer LLM with the same model size and training tokens in terms of both perplexity and end-task performance.
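Below is a minimal sketch of the absmean quantization the paper describes for producing ternary weights: scale the weight matrix by its mean absolute value, then round and clip each entry to {-1, 0, +1}. The function name, `eps` value, and usage snippet are illustrative, not taken from the paper's code.

```python
import torch

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Sketch of absmean quantization: W_q = RoundClip(W / (gamma + eps), -1, 1),
    where gamma is the mean absolute value of W."""
    gamma = w.abs().mean()                      # per-tensor scale (mean of |W|)
    w_scaled = w / (gamma + eps)                # normalize by the scale
    w_ternary = w_scaled.round().clamp_(-1, 1)  # round and clip to {-1, 0, +1}
    return w_ternary, gamma

# Hypothetical usage: quantize a random weight matrix and inspect its values
if __name__ == "__main__":
    w = torch.randn(4, 4)
    w_q, gamma = absmean_ternary_quantize(w)
    print(w_q)            # entries are only -1.0, 0.0, or 1.0
    print(w_q.unique())   # typically tensor([-1., 0., 1.])
```

Because the quantized weights are restricted to {-1, 0, 1}, the matrix multiplications in the model reduce to additions and subtractions, which is the source of the claimed latency, memory, and energy savings.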