Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
- https://arxiv.org/pdf/2201.10474.pdf
- Suchin Gururangan, Dallas Card, Sarah K. Dreier, Emily K. Gade, Leroy Z. Wang, Zeyu Wang, Luke Zettlemoyer, Noah A. Smith