Cosine similarity
A commonly used measure of similarity for continuous embedding vectors. The main justification is that Euclidean distance becomes less informative in high-dimensional spaces (read more here).
Issues
Cosine distance vs. angular distance
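Cosine similarity is often converted into a “distance” (mapping the range $[-1, 1]$ to $[0, 2]$):

$$d_{\cos}(u, v) = 1 - \cos\theta = 1 - \frac{u \cdot v}{\|u\| \|v\|}$$

However, this may not be a good measure of distance.
- Because the derivative of the cosine function is 0 at $\theta = 0$, the cosine distance grows only quadratically for small angles ($1 - \cos\theta \approx \theta^2 / 2$). It therefore does not have a good “resolution” around 0, which is often the most important domain of interest.
- See Michael Trosset’s note (Sec. 1.3 and 2.7)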
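The loss of resolution near 0 can be checked numerically. The following is a minimal sketch, assuming NumPy and 2-D unit vectors built at a known angle; the helper names (`cosine_similarity`, `cosine_distance`, `unit_vector_at`) are illustrative and not taken from any particular library.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def cosine_distance(u, v):
    """1 - cosine similarity, which lies in [0, 2]."""
    return 1.0 - cosine_similarity(u, v)


def unit_vector_at(theta):
    """Hypothetical helper: 2-D unit vector at angle theta (radians)."""
    return np.array([np.cos(theta), np.sin(theta)])


reference = unit_vector_at(0.0)

# Compare the cosine distance with the small-angle approximation theta^2 / 2.
for theta in (0.01, 0.1, 1.0):
    d = cosine_distance(reference, unit_vector_at(theta))
    print(f"angle={theta:<5} cosine_distance={d:.6f} theta^2/2={theta ** 2 / 2:.6f}")
```

For small angles the cosine distance tracks the square of the angle rather than the angle itself, so a tenfold increase in a small angle shows up as roughly a hundredfold increase in cosine distance, and very small angles all map to values close to 0.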
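Alternatively, we can use the angle itself as the distance:

$$d_{\text{ang}}(u, v) = \theta = \arccos\left(\frac{u \cdot v}{\|u\| \|v\|}\right)$$

Google’s Universal Sentence Encoder paper uses the following similarity:

$$\text{sim}(u, v) = 1 - \frac{1}{\pi} \arccos\left(\frac{u \cdot v}{\|u\| \|v\|}\right)$$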
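As a rough illustration, the angle-based distance and an angle-based similarity rescaled to $[0, 1]$ (one minus the angle divided by $\pi$, which is the form the Universal Sentence Encoder paper is understood to use) can be sketched as follows. This assumes NumPy; the function names are illustrative, and the clipping step is an added safeguard against floating-point rounding rather than part of the definition.

```python
import numpy as np


def angular_distance(u, v):
    """Angle between u and v in radians, in [0, pi]."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos_sim = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Rounding can push cos_sim slightly outside [-1, 1], where arccos is undefined.
    cos_sim = np.clip(cos_sim, -1.0, 1.0)
    return float(np.arccos(cos_sim))


def angular_similarity(u, v):
    """1 - angle / pi: 1 for identical directions, 0 for opposite directions."""
    return 1.0 - angular_distance(u, v) / np.pi


a = [1.0, 0.0]
b = [0.0, 1.0]
print(angular_distance(a, b))    # angle between orthogonal vectors
print(angular_similarity(a, b))  # similarity of orthogonal vectors
```

Unlike $1 - \cos\theta$, the angle is linear in itself near 0, so small differences in direction are not compressed.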