The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas
- https://arxiv.org/abs/2506.20803
- Chenglei Si, Tatsunori Hashimoto, Diyi Yang
This study asks human experts not only to score the effectiveness of AI- and human-generated research ideas, but also to execute those ideas.
AI-generated research ideas sounded better before execution (high estimated effectiveness scores), but their scores dropped after execution.