The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas

This study asks human experts not only to score the effectiveness of AI- and human-generated ideas, but also to execute those ideas.

AI-generated research ideas sounded better before execution (high estimated effectiveness scores), but their scores dropped after execution.

Usage of AI in Science