The XML file is composed of records. Each record R corresponds to one publication P and lists all publications that P references and all those that reference it. These lists are represented through two types of relations. The first, the ``References'' relation, contains the ids of publications referenced by P. The second, the ``Is Referenced By'' relation, contains the ids of contexts in which P is referenced.
Since we draw no semantic distinction between the two types of edges, we convert the graph from directed to undirected. We also remove any self-loops and repeated edges.
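
The preprocessing described above can be sketched as follows. This is only an illustrative sketch, not the code used to build the dataset: it assumes the records have already been parsed from the XML file into tuples of the form (publication id, ``References'' ids, ``Is Referenced By'' ids), and the type alias and function name are hypothetical. Each undirected edge is stored as a sorted pair inside a set, so both relations collapse into one edge type and repeated edges disappear automatically.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical in-memory form of one record (not the actual XML schema):
# (publication id, ids in "References", ids in "Is Referenced By")
Record = Tuple[str, List[str], List[str]]


def build_undirected_edges(records: Iterable[Record]) -> Set[Tuple[str, str]]:
    """Merge both relations into one set of undirected edges."""
    edges: Set[Tuple[str, str]] = set()
    for pub_id, references, referenced_by in records:
        # Both relations are treated identically, so direction is discarded.
        for other in list(references) + list(referenced_by):
            if other == pub_id:
                continue  # remove self-loops
            # Sorting the endpoints makes (a, b) and (b, a) the same edge;
            # the set then removes repeated edges.
            a, b = sorted((pub_id, other))
            edges.add((a, b))
    return edges


# Toy example with made-up ids: the A-B and A-C links each appear twice,
# and A lists itself once.
records = [
    ("A", ["B", "A"], ["C"]),
    ("B", [], ["A"]),
    ("C", ["A"], []),
]
edges = build_undirected_edges(records)
```

In the toy example, the self-loop on A is dropped and the duplicated A--B and A--C links are each kept once.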