Data Contamination Can Cross Language Barriers

Training data leakage and memorization in language models