Problem Solving in Computer Science
Summer 2005
Instructors:
Schedule:
 Tuesday, 11:1512:45, BC04
 Thursday, 11:1512:45, BC04
Objectives:
The objective of the course is twofold.
First, the course aims at honing the problem solving skills of
young researchers in computer science.
Second, the course aims at explaining the "research enterprise"
in computer science to beginning doctoral students.
Syllabus:
The first part of the course focuses on four research projects in the
following areas:
 Formalizing design.
 Web programming.
 Distributed algorithms.
 Robotics.
Students will work on the projects in groups of 34 people.
Each project will run for about 3 weeks.
The second part of the course will run in parallel to the first part
and discuss the following topics:
 How to search the literature.
 How to write a proof.
 How to write a paper.
 How to referee a paper.
 How to give a presentation.
Prerequisites:
The course is open to (pre)doctoral students, and to masters students
who plan to pursue a research career.
Grading:
Grades will be awarded on the basis of class participation (40%) and
project work (15% for each project).
Lecture 1 (March 8):
 What is this course about?
 How to write a definition.
 Homework: Find the bug in the proof of Theorem 20 of
John L. Kelley, General Topology, Springer Verlag, 1955.
Lecture 2 (March 10):
 Setting up Problem 1.
 Interface theories versus component theories.
 Homework: Read Sections 2, 3, and 4 of
Interface Theories.
Lecture 3 (March 15):
 A component theory: The relational nets.
 Problem 1: Find an interface theory for the relational nets.
Lecture 4 (March 17):
 An interface theory for the total nets: Port dependencies.
Lecture 5 (March 22):
 An interface theory for the rectangular nets: Input assumptions
and output guarantees.
 An interface theory for the relational nets: Necessary
I/O pairs.
 Discussion: What makes an interface theory "(non)trivial"?
 Modified Problem 1: Find an interface theory for the relational
nets which on the total nets, is as expressive as the portdependency
interfaces.
 Homework: Prove that in the universe of humans and monkeys,
if there is a human that has a monkey ancestor, then there is a human
that has a monkey parent.
Lecture 6 (March 24):
 Combinations of interface theories:
Port dependencies and input assumptions?
Port dependencies and necessary I/O pairs?

How to write a proof.
Lecture 7 (April 5):
 How to write a proof.
 Exercise theorem: The sum of two bounded functions is again
bounded.
 Proof sketch: If a is a bound for f,
and b is a bound for g,
then a+b is a bound for f+g.
 This stands for a proof that follows our handout (which one?).
 Wellfounded induction.
Lecture 8 (April 7):
Lecture 9 (April 12):
 Short presentations by groups.
Summary by Eda Baykan here.
 Possible starting point for exact algorithm:
Network flows.
 Possible starting point for randomized algorithm:
Randomized min cut.
 Possible starting point for approximation algorithm?
 Possible starting points for heuristics:
KernighanLin, simulated annealing, other CAD clustering and
layout heuristics.
Lecture 10 (April 14):
Lecture 11 (April 19):
 Short presentations by groups.
Summary by Gregory Mermoud here.
 We agree to work on the following graph:
For each id in the Citeseer XML, the graph has two nodes,
docid representing a document in the database,
and conid representing a paper not in the database which
is cited by a document in the database (a "context").
If the Citeseer XML has a link that id1 cites id2,
then there is an edge from docid1 to docid2.
If the Citeseer XML has a link that id2 is cited by id1,
then there is an edge from docid1 to conid2.
 Can we take advantage of the special form of the graph?
(E.g., all contexts are sinks; there should not be many cycles.)
 Energy models for graph clustering layout.
Tool: ccvisu
References:
Andreas Noack. An Energy Model for Visual Graph Clustering. GD 2003.
Dirk Beyer and Andreas Noack.
Clustering Software Artifacts Based on Frequent Common Changes. IWPC 2005.
Lecture 12 (April 21):
 Short presentations by groups.
Summary by Gregory Theoduloz here.
 Linear programming and duality.
 Powerlaw distributed edge degrees:
The fraction of nodes with edge degree k is proportional to
1/k^d.
What is d for our graph?
 Can we take advantage of the powerlaw property?
 The solution needs to beat the trivial clustering into 15
isolated nodes and one large remaining cluster, i.e., it needs to
beat the sum of the edge degrees of the 15 anchor nodes without the
largestdegree anchor.
 Better measure of the quality of the solution:
The normalized cost of a solution is the sum of all edges
between different clusters, divided by the geometric mean of the
sizes of all clusters.
Lecture 13 (April 26)
Summary of the short presentations and lecture by Marc Schaub here.
Lecture 14 (April 28)
 Project2 reports: MST, BBGS (or static version here), and GS.

Presentation of problem 3 here.
 Summary of lecture by Alex Susu here.
Lecture 15 (May 10)
Short presentations. Summary by Khaled Bachour here.
Lecture 16 (May 12)
Presentation of
Impossibility of Distributed Consensus with One Faulty Process by
Fischer, Lynch, and Paterson.
Lecture 17 (May 17)
Summary by Wojciech Galuba here.
Lecture 18 (May 19)
Lecture 19 (May 24)
Lecture 20 (May 26)
Summary by Khaled Bachour here.
Lecture 21 (May 31)
Summary by Ali Salehi here.
Lecture 22 (June 2)

TexPoint a powerpoint
add on.

LaTex packages for slides:

Summary by Gregory T. here.
Lecture 23 (Jun 7)
Lecture 24 (Jun 9)
We had the competition. Piglet, the modification of the winner of the
alife competition was definitely the best. The worlds we used: alife4, alife5, alife6, and alife7. The new supervisor code is here.
Project4 reports: MST.