Bipartite network embedding

Why?

Graph embedding is a powerful, unifying way to represent network data.

Many networks are bipartite in nature.

Many bipartite networks are studied as unipartite networks after applying a one-mode projection. However, this projection destroys information, as illustrated in Lehmann2008biclique and other papers.
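A small sketch of how a one-mode projection can lose information. The two toy networks below are made up for illustration (they are not taken from Lehmann2008biclique): in the first, three top nodes share a single bottom node; in the second, each pair of top nodes shares its own bottom node. Under the common "count shared neighbors" projection, both collapse to the same weighted triangle.

```python
import numpy as np

# Rows are top nodes (a, b, c); columns are bottom nodes.
# Network 1: all three top nodes attach to one bottom node.
B1 = np.array([[1],
               [1],
               [1]])
# Network 2: each pair of top nodes shares its own bottom node.
B2 = np.array([[1, 0, 1],
               [1, 1, 0],
               [0, 1, 1]])

def one_mode_projection(B):
    """Weighted projection onto the row nodes: entry (i, j) counts shared neighbors."""
    P = B @ B.T
    np.fill_diagonal(P, 0)  # drop self-loops
    return P

P1 = one_mode_projection(B1)
P2 = one_mode_projection(B2)
same_projection = np.array_equal(P1, P2)
print(same_projection)
```

The projections coincide even though the bipartite structures differ (one shared neighbor versus three pairwise ones), so the original network cannot be recovered from the projection alone.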

Different perspectives

Recommender systems and item-feature matrix

One very common setting that can also be interpreted as a bipartite network is the item-feature matrix. For instance, a term-document matrix can be thought of as a bipartite network between terms and documents. The recommender system setting is similar: there we have a bipartite network of users and items. The connections can also be weighted.

The matrix can then be factorized to obtain low-rank representations for items and features. For instance, in the term-document setting, the document vectors can represent the strengths of “topics”, and the term vectors can represent how close each term is to each topic. Here, however, the two types of nodes are embedded in distinct spaces rather than in a shared one.
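As a minimal sketch of such a factorization, the snippet below applies a truncated SVD to a small term-document matrix. The counts, the choice of `k = 2`, and the decision to fold the singular values into the term side are all assumptions made for illustration; other factorizations (e.g. non-negative ones) would fit the description equally well.

```python
import numpy as np

# Hypothetical term-document count matrix: 4 terms (rows) x 3 documents (columns).
X = np.array([[2., 0., 1.],
              [1., 0., 0.],
              [0., 3., 1.],
              [0., 1., 2.]])

k = 2  # number of latent "topics" (arbitrary choice for illustration)
U, s, Vt = np.linalg.svd(X, full_matrices=False)

term_vectors = U[:, :k] * s[:k]  # one k-dimensional vector per term
doc_vectors = Vt[:k, :].T        # one k-dimensional vector per document

# Rank-k reconstruction: entry (i, j) is the inner product of term i and document j.
X_approx = term_vectors @ doc_vectors.T
```

Only the inner product between a term vector and a document vector is tied to the data here. The factorization does not by itself make the distance between a term vector and a document vector meaningful.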

Co-embedding

In machine learning, the idea of embedding multiple types of entities into the same vector space has been explored as “co-embedding”. Globerson2007euclidean is an early method.

Methods

Euclidean latent space models

Globerson2007euclidean may be one of the earlier methods. This paper proposes a method to embed multiple types of entities within a single vector space by using the linkage between those entities. In other words, bipartite networks are one instance of their proposed co-embedding scenarios. They assume that $p(x, y) \propto e^{-d^2_{x,y}}$, where $d_{x,y}$ is the Euclidean distance between the vectors of $x$ and $y$. They also correctly identify that it is important to account for the marginal probabilities (analogous to degree in the bipartite network).
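The distance-based form above can be sketched numerically as follows. The embeddings are random placeholders, and the way the marginals enter (multiplying the distance term by `px[x] * py[y]`) is one plausible reading assumed here for illustration, not necessarily the exact parameterization used in the paper. No fitting is done; the code only evaluates the joint distribution implied by given vectors.

```python
import numpy as np

def joint_distribution(phi, psi, px=None, py=None):
    """p(x, y) proportional to px[x] * py[y] * exp(-||phi[x] - psi[y]||^2)."""
    # Squared Euclidean distance between every row of phi and every row of psi.
    d2 = ((phi[:, None, :] - psi[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-d2)
    if px is not None:
        weights = weights * px[:, None]  # assumed marginal factor for x
    if py is not None:
        weights = weights * py[None, :]  # assumed marginal factor for y
    return weights / weights.sum()       # normalize so all entries sum to 1

rng = np.random.default_rng(0)
phi = rng.normal(size=(3, 2))  # hypothetical embeddings for 3 nodes of type x
psi = rng.normal(size=(4, 2))  # hypothetical embeddings for 4 nodes of type y

p_plain = joint_distribution(phi, psi)
# Give the first x node a much larger marginal, like a high-degree node.
p_weighted = joint_distribution(phi, psi, px=np.array([0.8, 0.1, 0.1]))
```

Without the marginal factors, a high-degree node could only be modeled by placing it close to everything at once. With them, popularity is absorbed by the marginal term and distance is free to encode affinity.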

If we consider general networks, including unipartite ones, Hoff2002latent is an early paper that proposed an embedding method based on latent location and covariates. Friel2016interlocking takes a similar approach for a bipartite network of company boards and directors.
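A rough sketch of the distance form of such a latent space model, in which the log-odds of a tie are $\alpha + \beta x_{ij} - \lVert z_i - z_j \rVert$. The positions, the covariate matrix, and the parameter values below are made up for illustration, and no inference is performed; the code only evaluates tie probabilities for given inputs.

```python
import numpy as np

def tie_probability(z, x, alpha, beta):
    """Tie probabilities with logit P(i ~ j) = alpha + beta * x_ij - ||z_i - z_j||."""
    # Pairwise Euclidean distances between latent positions.
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    eta = alpha + beta * x - dist
    return 1.0 / (1.0 + np.exp(-eta))  # logistic link

z = np.array([[0.0, 0.0],
              [0.1, 0.0],
              [3.0, 4.0]])   # hypothetical latent positions
x = np.zeros((3, 3))         # hypothetical pair covariate (all zero here)
P = tie_probability(z, x, alpha=1.0, beta=0.5)
# Diagonal entries correspond to self-pairs and should be ignored.
```

Nodes 0 and 1 sit close together and get a high tie probability, while node 2 is far from both and gets a low one.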

Hyperbolic and other geometry

Kitsak2017latent