7 January 2009

Measuring and Extracting Proximity in Networks

Yehuda Koren, Stephen North, Chris Volinsky

How can we find out how two or more people are related in a social network? The solution to this is significant in avoiding loss due to fraud in communication services and understanding on-line and other communities. As an example using publicly available data, which actors "connect" Meryl Streep, Groucho Marx and Judy Garland? In computer science, how are the inventors of the RSA encryption algorithm related to the theorist Donald Knuth? Click on the examples below.

DBLP: network between Knuth, Rivest, Adelman, Shamir IMDB: network between Marx, Garland, Streep

You can try our spiffy online demo that uses IMDB and the DBLP computer science bibliography, or learn more here. (Note: IMDB covers entertainers who appeared in TV programs, including many musicians and other well known figures. Try some!)

In this work, we propose a way to numerically measure the proximity between objects in a quasi-random network (such as a social or on-line network) and to visualize a compact subnetwork that explains or "proves" the result. In general, measuring distance or some other form of proximity between objects is a key mining method. Connection subgraphs were recently proposed by Faloutsos, Tomkins and McCurley (KDD 2004) as a way to demonstrate proximity between nodes in networks. We propose a new way of measuring and extracting proximity called "cycle free effective conductance" (CFEC). Our proximity measure handles more than two endpoints, directed edges, is statistically well-behaved, and yields an effectiveness score for the computed subgraphs. We have developed efficient heuristics for estimating proximity, and conducted experiments on three large network data sets: a telecommunications calling graph, the IMDB actors graph, and an academic co-authorship network.

<< Back to Projects & Software

 


Terms & Conditions | Privacy Statement | Copyright © 2009 AT&T