For all seed nodes the random walks are independent from each other. What we are interested in are the asymptotic probabilities of visiting a node as the number of hops tends to infinity. Our random walk is memoryless and Markovian, hence the probabilities converge. Moreover, we know that there is a system of linear equations corresponding to the random walk. The steady state equations are as follows:
For every regular node:
For every seed node:
where is the set of neighbors of and is the degree of vertex and is the assymptotic probability of visiting of vertex that we are looking for.
We solve this system of equations by iteratively finding the eigenvector. There is one system of equations per seed node. In the end, for every node we obtain a vector of probabilities, each corresponding to a random walk returning to a different seed node. To assign a node to a particular cluster we select the seed node for which the probability of visiting is the highest.
This algorithm does not guarantee that the final components will be contiguous. However, the only parts of the graph that can be noncontiguous are the ones far from the seed nodes (in terms of the shortest path). There is no unambiguous way of assigning these nodes to any of the clusters solely based on the topological properties of the graph. We therefore modify the problem by relaxing the constraint that all nodes have to be assigned to some cluster. In the algorithm we proceed as before, selecting the cluster for which the probability of visiting is the highest, but if the highest probability is below a threshold then the node is not added to any cluster.
Ali Salehi 2005-04-29