MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation

LLMs, Agent, AI Team, Automated machine learning, Auto GPT

A benchmark of 13 tasks that tests how well a language agent can improve ML models.