Uniformly Random Walk

Our algorithm is based on the following random walk:
Start from the seed node. For any node: with probability $1-p_{return}$ uniformly randomly select the next neighbor to hop to or with probability $p_{return}$ go back to the seed node.

For all seed nodes the random walks are independent from each other. What we are interested in are the asymptotic probabilities of visiting a node as the number of hops tends to infinity. Our random walk is memoryless and Markovian, hence the probabilities converge. Moreover, we know that there is a system of linear equations corresponding to the random walk. The steady state equations are as follows:

For every regular node: $p_v=\Sigma_{u\in{neighbors(v)}}1-p_{return}\frac{p_u}{degree(p_u)}$

For every seed node: $p_v=\Sigma_{u\in{neighbors(v)}}1-p_{return}\frac{p_u}{degree(u)}+\Sigma_{u\in{V}}p_{return}*p_u$

where $neighbors(u)$ is the set of neighbors of $u$ and $degree(u)$ is the degree of vertex $u$ and $p_u$ is the assymptotic probability of visiting of vertex $u$ that we are looking for.

We solve this system of equations by iteratively finding the eigenvector. There is one system of equations per seed node. In the end, for every node we obtain a vector of probabilities, each corresponding to a random walk returning to a different seed node. To assign a node to a particular cluster we select the seed node for which the probability of visiting is the highest.

This algorithm does not guarantee that the final components will be contiguous. However, the only parts of the graph that can be noncontiguous are the ones far from the seed nodes (in terms of the shortest path). There is no unambiguous way of assigning these nodes to any of the clusters solely based on the topological properties of the graph. We therefore modify the problem by relaxing the constraint that all nodes have to be assigned to some cluster. In the algorithm we proceed as before, selecting the cluster for which the probability of visiting is the highest, but if the highest probability is below a threshold then the node is not added to any cluster.

Ali Salehi 2005-04-29