Visualizing complex structures using Graphviz

Graphviz [1] has long been well established in the open source landscape (see the "Out of Bell Labs" box). In many cases, the tool does its job unnoticed in the background, for example, when a script on a web server automatically generates images from database content.

Out of Bell Labs

The cross-platform visualization software Graphviz originated at AT&T and its Bell Labs. The software has been around since 1988, is licensed under the Eclipse Public License (EPL) [2], and runs on Linux, Solaris, Windows, and Mac OS X. Stable releases are available for all distributions.

The AsciiDoc [3] markup language can embed Graphviz data directly. When the document is compiled, the code sections are automatically rendered as the corresponding graphs.

Plugins are available for Doxygen [4] and many wiki platforms. Graphviz also has a presence in Puppet [5], where it is used to create resource graphs.

To prepare directed and undirected graphs (see also the "Graph Types" box), the command-line tools dot, neato, fdp, circo, and twopi can help. These are all part of the Graphviz package, and they use specific algorithms from graph theory to reduce edge lengths, separate subgraphs, and recognize connected components [6].

Graph Types

A graph consists of a set of nodes and edges, with an edge always connecting two nodes. There are two types of graphs: undirected and directed. With an undirected graph, the edges don't indicate a direction. With a directed graph, all edges have a direction. This direction is what leads your eye, like a one-way street.

Workflow

You first use a text editor to describe the elements and their interdependencies and save this as a text file.

The next step is translating the data with one of the aforementioned tools into an output format (e.g., GIF, PNG, SVG, PDF, or PostScript). The following command creates a PNG file from graph.dot:

$ dot -Tpng graph.dot -o graph.png
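
If you would rather start the translation from a script than by hand, the same command line can be assembled programmatically. The following Python sketch shows one possible way; the function names build_command and render are made up for this example, and it assumes that the Graphviz tools are installed and on the PATH.

```python
import shutil
import subprocess


def build_command(tool, fmt, source, target):
    """Return the argument list for one Graphviz layout run."""
    # -T selects the output format, -o names the output file.
    return [tool, f"-T{fmt}", source, "-o", target]


def render(tool, fmt, source, target):
    """Run the layout tool if it is installed; return True on success."""
    if shutil.which(tool) is None:
        return False  # the tool was not found on the PATH
    subprocess.run(build_command(tool, fmt, source, target), check=True)
    return True


if __name__ == "__main__":
    # Prints the command from the article: dot -Tpng graph.dot -o graph.png
    print(" ".join(build_command("dot", "png", "graph.dot", "graph.png")))
```

Because the tool name is a parameter, the same helper also covers neato, fdp, circo, and twopi.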

Building the dot file is quite easy. First, you define a graph (Listing 1, line 1). The keyword digraph creates a directed graph, and G is the name of the graph. Always remember the opening curly bracket that follows.

Listing 1

Build a Dot File

01 digraph G {
02 A -> B [style=dotted];
03 A -> C [color=red, label="A Label"];
04 C [shape=box, style=filled, color="0.7 0.7 1.0"];
05 B -> C;
06 }

In the lines that follow, you specify the three nodes A, B, and C with corresponding attributes to create the nodes and edges. Each line ends with a semicolon. For the node name, you can use either an identifier (as in the C programming language), a number, or a quoted string.

Inside a quoted string, dot also accepts blanks, control characters, and punctuation. Within a label, \n marks a line break, which lets you spread the text over multiple lines. Each line of the label then appears centered.
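
The naming rules can be captured in a small helper for scripts that write dot files. The Python sketch below, with the hypothetical function dot_id, leaves identifiers and numbers untouched and wraps everything else in double quotes. It also quotes the DOT keywords, which cannot serve as bare names. As a simplification, it escapes only embedded double quotes, so treat it as an illustration rather than a complete implementation.

```python
import re

# Names that need no quotes: C-style identifiers and plain numbers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_NUMERAL = re.compile(r"-?(\.[0-9]+|[0-9]+(\.[0-9]*)?)")
# DOT keywords are case-insensitive and must be quoted when used as names.
_KEYWORDS = {"node", "edge", "graph", "digraph", "subgraph", "strict"}


def dot_id(name):
    """Return name in a form that can be used as a node name in a dot file."""
    text = str(name)
    is_plain = _IDENTIFIER.fullmatch(text) or _NUMERAL.fullmatch(text)
    if is_plain and text.lower() not in _KEYWORDS:
        return text
    # Everything else goes into a quoted string; embedded quotes are escaped.
    return '"' + text.replace('"', '\\"') + '"'


if __name__ == "__main__":
    for sample in ("A", 42, "/home", "A Label", "node"):
        print(dot_id(sample))
```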

The -> operator creates an edge between two nodes in the direction indicated. The closing curly bracket on line 6 ends the definition. You control the graph's appearance by passing attributes, which are in square brackets. These attributes can be global or local; in Listing 1, local attributes appear on lines 2 through 4.

Line 2 defines a dotted line, and line 3 defines a red line with a label for the edge: the label attribute is followed by an equals sign and the label content in quotes. In line 4, you define a box for node C, which is filled with a bluish color. Graphviz reads three values between 0 and 1 as hue, saturation, and value (HSV), not as RGB components. Figure 1 shows the source code and the result in an image viewer.

Figure 1: A color visualization of a directed graph with three nodes.
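
A color given as three decimal fractions, as in line 4 of Listing 1, is interpreted by Graphviz as hue, saturation, and value. To see which color this describes, you can convert the triple to RGB with the colorsys module from the Python standard library. The following lines are only a quick check and not part of the Graphviz workflow.

```python
import colorsys

# The fill color of node C in Listing 1, read as hue, saturation, value.
hue, saturation, value = 0.7, 0.7, 1.0

red, green, blue = colorsys.hsv_to_rgb(hue, saturation, value)

# Blue is the strongest component, so the node appears bluish.
print(f"red={red:.2f} green={green:.2f} blue={blue:.2f}")
```

The script prints red=0.44 green=0.30 blue=1.00, which is a light blue with a violet tint.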

With Graphviz you can, among other things, visualize directory hierarchies. An undirected graph can show the tree structure and reveal where possibly undesired folders are located. With the folder shape, you can give nodes a small icon very much like the ones a file manager uses (Listing 2).

Listing 2

Assign Icons to Nodes

01 graph {
02 "/" [shape=folder];
03 "/boot" [shape=folder];
04 "/usr" [shape=folder];
05 "/home" [shape=folder];
06 "/var" [shape=folder];
07 "/frank" [shape=folder];
08 "/peter" [shape=folder];
09 "/" -- "/boot";
10 "/" -- "/usr";
11 "/" -- "/home";
12 "/" -- "/var";
13 "/home" -- "/frank";
14 "/home" -- "/peter";
15 }

You translate this file in the same way as the earlier example. Figure 2 shows the resulting graph. If you use one of the other tools, it will translate the description without an error, but it might represent the folders with circles, which is contrary to the usual view of a filesystem.

Figure 2: Graphviz comes with a little icon that you can use to represent folders in a directory hierarchy.

With a bit of skill, you can generate the dot files from a script. In this way, you could automatically get an overview of all the branches in your filesystem.
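
One possible version of such a script is sketched below in Python. The function names quote and directory_graph and the max_depth parameter are inventions for this example. The sketch assumes Unix-style paths without backslashes and uses full paths as node names so that folders with the same name in different places stay distinct.

```python
import os


def quote(text):
    """Wrap text in double quotes, escaping embedded quotes for dot."""
    return '"' + text.replace('"', '\\"') + '"'


def directory_graph(root, max_depth=2):
    """Return dot source for the folders below root, max_depth levels deep."""
    root = os.path.abspath(root)
    lines = ["graph {", f"  {quote(root)} [shape=folder];"]
    for parent, subdirs, _files in os.walk(root):
        rel = os.path.relpath(parent, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            subdirs[:] = []  # tell os.walk not to descend any further
            continue
        subdirs.sort()  # stable output order
        for name in subdirs:
            child = os.path.join(parent, name)
            lines.append(f"  {quote(child)} [shape=folder];")
            lines.append(f"  {quote(parent)} -- {quote(child)};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the graph for the current directory; redirect it into a file
    # and translate that file with dot as shown earlier.
    print(directory_graph(".", max_depth=2))
```

Limiting the depth keeps the image readable; a complete filesystem would produce a graph far too large to be useful.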

Binary Trees

A tree is a data structure made up of nodes and edges, in which the root is the topmost node. Binary trees have two kinds of nodes: outer nodes (without children) and inner nodes (with up to two children, and exactly two in the full binary tree used here). The primary application of such binary trees is structuring data sets so that they can be searched as efficiently as possible.

Graphviz takes the work out of drawing these structures if you use the program in the right way. Listing 3 shows how you can create such a binary tree with the software. After setting all nodes to the record shape on line 2, which draws them as rectangles divided into fields, you define the structure and content of the nodes in lines 4 through 8.

Listing 3

Create a Binary Tree

01 digraph G {
02 node [shape=record, height=0.1];
03
04 node0 [label = "<l> | <m> H | <r>"];
05 node1 [label = "<l> | <m> D | <r>"];
06 node2 [label = "<l> | <m> A | <r>"];
07 node3 [label = "<l> | <m> P | <r>"];
08 node4 [label = "<l> | <m> W | <r>"];
09
10 node0:l -> node1:m;
11 node1:l -> node2:m;
12 node0:r -> node3:m;
13 node1:r -> node4:m;
14 }

Each node has three elements: a reference to the left child (labeled l), a middle element (labeled m) with the content, and a reference to the right child (labeled r). You will later refer to these elements by name.

In line 10, you connect the left element of node0 with the middle element of node1, and so forth in the lines that follow. You separate the node name from the name of the matching element with a colon, which determines where the arrows start and end. The resulting image contains the desired boxes and lines (Figure 3).

Figure 3: Graphviz automatically puts the arrows for the binary search tree in the right places if you use the correct syntax.
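
For larger trees, writing the node and edge lines by hand quickly becomes tedious. The following Python sketch generates a description in the style of Listing 3 from a nested data structure. The function name tree_to_dot and the representation of a tree as (key, left, right) tuples are assumptions for this example; the keys are expected to be plain letters or numbers, because characters such as |, {, or < have a special meaning in record labels.

```python
def tree_to_dot(tree):
    """Return dot source for a binary tree of (key, left, right) tuples."""
    lines = ["digraph G {", "  node [shape=record, height=0.1];"]
    edges = []
    counter = 0

    def visit(subtree):
        """Emit one node plus its subtrees and return the node's name."""
        nonlocal counter
        key, left, right = subtree
        name = f"node{counter}"
        counter += 1
        # Three fields per record: left port, key in the middle, right port.
        lines.append(f'  {name} [label = "<l> | <m> {key} | <r>"];')
        for port, child in (("l", left), ("r", right)):
            if child is not None:
                child_name = visit(child)
                edges.append(f"  {name}:{port} -> {child_name}:m;")
        return name

    visit(tree)
    return "\n".join(lines + edges + ["}"])


if __name__ == "__main__":
    # The tree from Listing 3: H at the root, D and P below it,
    # A and W below D.
    example = ("H", ("D", ("A", None, None), ("W", None, None)),
               ("P", None, None))
    print(tree_to_dot(example))
```

The nodes are numbered in the order in which the function visits them, so the numbering differs slightly from Listing 3, but the resulting picture has the same structure.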

Data Structures as Hashes

In programming, data structures often come in the form of hashes, that is, associative arrays [7]. Graphviz builds these structures without much fuss if you use the correct syntax. Line 1 in Listing 4 specifies that you want a directed graph, and the rankdir=LR setting in line 2 arranges it from left to right.

Listing 4

Create a Data Structure

01 digraph G {
02 rankdir=LR;
03 node [shape=record, width=0.1, height=0.1];
04
05 node0 [label = "<f0> | <f1> | <f2>", height=1.5];
06 node [width=1.5];
07 node1 [label = "{<n> n14 | 719 | <p>}"];
08 node2 [label = "{<n> k71 | 216 | <p>}"];
09 node3 [label = "{<n> n39 | 771 | <p>}"];
10 node4 [label = "{<n> k56 | 250 | <p>}"];
11 node5 [label = "{<n> a34 | 125 | <p>}"];
12
13 node0:f0 -> node1:n;
14 node1:p -> node2:n;
15 node0:f1 -> node3:n;
16 node0:f2 -> node4:n;
17 node4:p -> node5:n;
18 }

Lines 5 through 11 define the content of the nodes as records whose separate elements you later access by name. The node0 in line 5 has three elements (f0, f1, and f2), whereas node1 through node5 in lines 7 through 11 each have the two named elements n and p. The middle element has no identifier, only the assigned value. Linking the nodes occurs in lines 13 through 17, and Graphviz takes over from there (Figure 4).

Figure 4: If you want to see the relationship of hash tables when programming, Graphviz helps with the necessary functions.
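
A description like Listing 4 can also be produced from the data itself. The Python sketch below assumes that the hash table is available as a list of buckets, each holding a chain of (key, value) pairs; this layout and the names hash_table_to_dot, table, and entry1, entry2, and so on are choices made for the example, not something Graphviz prescribes.

```python
def hash_table_to_dot(buckets):
    """Return dot source for a list of buckets, each a chain of pairs."""
    lines = [
        "digraph G {",
        "  rankdir=LR;",
        "  node [shape=record, width=0.1, height=0.1];",
    ]
    # One record with a named field per bucket.
    ports = " | ".join(f"<f{i}>" for i in range(len(buckets)))
    lines.append(f'  table [label = "{ports}", height=1.5];')
    lines.append("  node [width=1.5];")
    edges = []
    counter = 0
    for i, chain in enumerate(buckets):
        previous = f"table:f{i}"  # the chain starts at the bucket field
        for key, value in chain:
            counter += 1
            name = f"entry{counter}"
            # Curly brackets inside the label flip the field orientation.
            lines.append(
                f'  {name} [label = "{{<n> {key} | {value} | <p>}}"];'
            )
            edges.append(f"  {previous} -> {name}:n;")
            previous = f"{name}:p"  # the next entry hangs off this pointer
    return "\n".join(lines + edges + ["}"])


if __name__ == "__main__":
    # The same keys and values as in Listing 4.
    example = [
        [("n14", 719), ("k71", 216)],
        [("n39", 771)],
        [("k56", 250), ("a34", 125)],
    ]
    print(hash_table_to_dot(example))
```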

Dependencies

The next example comes directly from the engine room of a Linux system. With the Debtree [8] program, you can explore the dependencies between packages on a Debian-based system (e.g., Ubuntu). The parameters are the names of one or more packages – for simplicity's sake, I'll use names like sqlite3 for the SQLite3 [9] database.

The program writes its result in dot format to standard output. The first call in Listing 5 redirects this output to the sqlite3.dot file (Figure 5). From this file, the second call generates an image with dot (Figure 6).

Listing 5

Generate an Image File

01 $ debtree sqlite3 > sqlite3.dot
02 $ dot -Tpng sqlite3.dot > sqlite3.png

Figure 5: The dot description of the sqlite3 package dependencies on a Debian-based system.

Figure 6: With Debtree and Graphviz, you can see in an instant how long a trail of dependencies an installation drags behind it.

Drawing Paths

Sometimes a particular path in a graph is of interest, such as the shortest path between two points when planning a road trip. In such cases, edges usually carry additional properties, like distances, road conditions, obstacles, or traffic jams. All of these factors contribute to evaluating a route.

Figure 7 shows a relatively simple example of such a path in a graph. The red color and the thicker line are set with the color and penwidth attributes.

Figure 7: Path in a network.

To get a uniform hexagon, use the circo command instead of dot when translating Listing 6; all the other elements in the call remain the same.

Listing 6

Draw a Path

01 graph {
02 a -- b -- d -- c -- f[color=red,penwidth=3.0];
03 b -- c;
04 d -- e;
05 e -- f;
06 a -- d;
07 }
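
Listing 6 marks the path by hand. If the edges carry distances, a script can determine the shortest path and write the highlighted graph in one go. The following Python sketch uses Dijkstra's algorithm for this purpose. The function names and the distances assigned to the edges of Listing 6 are invented for the example, and node names are assumed to be simple identifiers that need no quoting.

```python
import heapq


def shortest_path(edges, start, goal):
    """Dijkstra's algorithm on an undirected graph of (a, b, distance)."""
    neighbors = {}
    for a, b, distance in edges:
        neighbors.setdefault(a, []).append((b, distance))
        neighbors.setdefault(b, []).append((a, distance))
    best = {start: 0}
    previous = {}
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # outdated queue entry
        for other, distance in neighbors.get(node, []):
            new_cost = cost + distance
            if new_cost < best.get(other, float("inf")):
                best[other] = new_cost
                previous[other] = node
                heapq.heappush(queue, (new_cost, other))
    if goal not in best:
        return []  # the goal cannot be reached
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1]


def path_to_dot(edges, path):
    """Return dot source with the edges along path drawn red and thick."""
    on_path = {frozenset(pair) for pair in zip(path, path[1:])}
    lines = ["graph {"]
    for a, b, distance in edges:
        attrs = [f'label="{distance}"']
        if frozenset((a, b)) in on_path:
            attrs += ["color=red", "penwidth=3.0"]
        lines.append(f"  {a} -- {b} [{', '.join(attrs)}];")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The edges of Listing 6 with made-up distances.
    roads = [("a", "b", 2), ("b", "d", 2), ("d", "c", 1), ("c", "f", 3),
             ("b", "c", 4), ("d", "e", 5), ("e", "f", 1), ("a", "d", 5)]
    print(path_to_dot(roads, shortest_path(roads, "a", "f")))
```

With these distances, the shortest route from a to f runs through b, d, and c, which is the same path that Listing 6 highlights.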

Dot Viewer

Apart from the rather antiquated dotty viewer program included in the Graphviz package, some research on the web will turn up Smyrna [10] and ZGRViewer [11], among others.

The former has an "experimental" status, but it shows some promise based on the documentation. ZGRViewer is written in Java and so far has not been available as a package for Debian or Ubuntu.

Conclusion

Graphviz includes other features, such as color gradients, as well. If you can't easily find your way around the program and all of its features, taking a look at similar projects might help.

For example, you can check out GraphML [12], the Graph Exchange Language (GXL) [13], and the Scalable Vector Graphics (SVG) format. All three of these formats define structures based on XML. The Graphviz package also includes converters between dot and GXL.

Infos

[1] Graphviz: http://www.graphviz.org
[2] Eclipse Public License (EPL): http://en.wikipedia.org/wiki/Eclipse_Public_License
[3] AsciiDoc: http://www.methods.co.nz/asciidoc/
[4] Doxygen: http://www.doxygen.org
[5] Puppet: http://en.wikipedia.org/wiki/Puppet_(software)
[6] Connectivity theory: http://en.wikipedia.org/wiki/Connectivity_(graph_theory)
[7] Associative array: http://en.wikipedia.org/wiki/Associative_array
[8] Debtree: http://collab-maint.alioth.debian.org/debtree/
[9] SQLite: http://www.sqlite.org
[10] Smyrna (PDF): http://www.graphviz.org/pdf/smyrna.pdf
[11] ZGRViewer: http://zvtm.sourceforge.net/zgrviewer.html
[12] GraphML: http://graphml.graphdrawing.org/index.html
[13] Graph Exchange Language (GXL): http://www.gupro.de/GXL/

Ubuntu User