ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
A parameter-reduced variant of BERT. It cuts BERT's parameter count (and thus memory footprint) by using
- Factorized embedding parameterization: the WordPiece embedding matrix is split into two smaller matrices, separating the small context-independent embedding space from the large context-dependent hidden space.
- Cross-layer parameter sharing: all transformer layers reuse the same set of weights.
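
The two ideas above can be sketched as follows; this is a minimal toy illustration (NumPy linear layers standing in for real transformer blocks, and the specific sizes V, H, E, L are illustrative BERT-base-like assumptions, not taken from the paper):

```python
import numpy as np

V, H, E, L = 30000, 768, 128, 12  # vocab, hidden dim, embedding dim, layers (assumed sizes)

# (1) Factorized embedding: a single V x H table (V*H params) is replaced
#     by a V x E lookup followed by an E x H projection (V*E + E*H params).
bert_embed_params = V * H
albert_embed_params = V * E + E * H

lookup = (np.random.randn(V, E) * 0.02).astype(np.float32)
projection = (np.random.randn(E, H) * 0.02).astype(np.float32)

def embed(token_ids):
    # Two-step lookup-then-project instead of one big V x H lookup.
    return lookup[token_ids] @ projection

# (2) Cross-layer parameter sharing: ONE set of layer weights applied L times.
#     Real ALBERT shares full transformer-block parameters in the same way;
#     here a toy linear + ReLU layer stands in for a block.
shared_W = (np.random.randn(H, H) * 0.02).astype(np.float32)

def encoder(x, n_layers=L):
    for _ in range(n_layers):
        x = np.maximum(x @ shared_W, 0.0)  # same weights reused every layer
    return x

h = encoder(embed(np.array([1, 5, 42])))
print(h.shape)                                      # (3, 768)
print(bert_embed_params, albert_embed_params)       # 23040000 3938304
```

With these assumed sizes, factorization alone shrinks the embedding parameters by roughly a factor of six, and sharing makes the encoder's size independent of depth L.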