Utility is in the Eye of the User: A Critique of NLP Leaderboards

https://arxiv.org/abs/2009.13888
Kawin Ethayarajh, Dan Jurafsky

Model evaluation, Human evaluation

See also Ethayarajh2022authenticity
