tf-idf

term frequency - inverse document frequency

wikipedia:tfidf

Tools

NLTK: http://www.nltk.org/api/nltk.html?highlight=tf%20idf#nltk.text.TextCollection.idf
Gensim: http://radimrehurek.com/gensim/models/tfidfmodel.html
scikit-learn: http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
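
As a rough illustration of what the tools listed above compute, here is a minimal pure-Python sketch of one common textbook formulation: tf is the term's share of the document's tokens, and idf is the natural log of (number of documents / number of documents containing the term). The function name tf_idf and the toy documents are made up for this example. NLTK, Gensim and scikit-learn each apply their own weighting and normalization defaults, so their numbers will generally differ from these.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes the plain formulation tf * idf with
    tf = count / len(doc) and idf = ln(N / df); this is an
    illustrative choice, not the default of any particular library.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, already tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
scores = tf_idf(docs)
```

In this toy corpus, "cat" occurs in only one document and so gets a higher weight in the first document than "the", which occurs in two. A term that appears in every document gets weight 0 under this formulation, because ln(N / N) = 0.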