The Expressive Power of Transformers with Chain of Thought
- https://arxiv.org/abs/2310.07923
- William Merrill, Ashish Sabharwal
See also Deng2024from, which claimed that LLMs can be trained to “internalize” chain-of-thought reasoning. Does this paper prove that this kind of internalization is impossible in general?