|
Measuring and Extracting Proximity in Networks
Yehuda Koren, Stephen North, Chris Volinsky
How can we find out how two or more people are related in a
social network? The solution to this is significant in avoiding
loss due to fraud in communication services and understanding on-line
and other communities. As an example using publicly available data,
which actors "connect" Meryl Streep, Groucho Marx and Judy Garland?
In computer science, how are the inventors of the RSA encryption
algorithm related to the theorist Donald Knuth? Click on the examples below.
|
|
| DBLP: network between Knuth, Rivest, Adelman, Shamir |
IMDB: network between Marx, Garland, Streep |
You can try our spiffy
online demo
that uses IMDB and the DBLP computer science bibliography,
or learn more
here. (Note: IMDB covers entertainers who appeared in TV
programs, including many musicians and other well known figures. Try some!)
In this work, we propose a way to numerically measure the proximity between
objects in a quasi-random network (such as a social or on-line network)
and to visualize a compact subnetwork that explains or "proves" the result.
In general, measuring distance or some other form of proximity between
objects is a key mining method. Connection subgraphs were recently
proposed by Faloutsos, Tomkins and McCurley (KDD 2004) as a way to
demonstrate proximity between nodes in networks. We propose a new
way of measuring and extracting proximity called "cycle free effective
conductance" (CFEC). Our proximity measure
handles more than two endpoints, directed edges, is statistically
well-behaved, and yields an effectiveness score for the computed
subgraphs. We have developed efficient heuristics for estimating
proximity, and conducted experiments on three large network data
sets: a telecommunications calling graph, the IMDB actors graph,
and an academic co-authorship network.
<<
Back to Projects & Software
|