We consider a problem of clustering publications of the citation network given seed papers as an input. The algorithm in its most general form is NP-hard. Given the Citeseer citation data, we solve the problem by using the randomized contraction heuristic and provide an analytical bounds on the number of necessary runs. In addition we experimentally demonstrate the scale-free and small-world properties of the network. We propose two heuristics for solving the clustering problem based on phenomena occuring in complex networks: epidemic spreading and random walks.

- Introduction
- Dataset

- Related work

- The structure of the graph
- The Randomized Min-Cut Algorithm

- Epidemic Spreading Algorithm and Random Walk Approach

- Theoretical Analysis

- Future work
- Improved Version of Randomized Min Cut:Faster Min Cut
- Random walks
- Epidemic spreading
- Augmenting the graph

- Conclusion
- Bibliography
- .
- About this document ...

Ali Salehi 2005-04-29