Network Plot with plotly and graphviz

rohola zandie
5 min read · Dec 1, 2017


A few months ago I ran into the problem of plotting a graph. My graph consisted of a few dozen nodes, with weights on both the nodes and the edges. At first sight nothing about it was new to me; it was much like other graphs I had dealt with before. But the moment I tried to plot it, I realized that this is not a trivial task at all. The central question is:

What is the best way to arrange nodes in a two or three dimensional space?

In other words, what is “the best layout” for representing the graph? To answer that, we first need to define what we mean by “the best layout”.

First, the layout should have as few edge crossings as possible. With a random layout, the edges cross each other everywhere and make a big mess.

Second, nodes with high degree should be placed so that their edges do not form sharp angles overall. High-degree nodes are the important ones, and they belong in the middle of their adjacent nodes.

Third, if the graph has a community structure, the layout should show it. One of the main reasons we plot graphs is to tame their complexity in terms of the number of nodes and edges, and a good layout reveals the clusters of nodes that correspond to communities in the graph.

With these criteria, and more that you can find here, we try to plot the graph with the best existing tools.
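The first criterion can be made measurable. As a rough sketch, the code below counts how many pairs of edges cross for a given set of 2D node positions. The function names and the toy 4-cycle are illustrative assumptions, not part of the code used in this post, and edges that merely touch or overlap along a line are not counted.

```python
from itertools import combinations


def _orientation(p, q, r):
    """Cross product (q - p) x (r - p): >0 left turn, <0 right turn, 0 collinear."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segment a-b properly crosses segment c-d."""
    d1 = _orientation(a, b, c)
    d2 = _orientation(a, b, d)
    d3 = _orientation(c, d, a)
    d4 = _orientation(c, d, b)
    # Each segment's endpoints must lie strictly on opposite sides of the other.
    return d1 * d2 < 0 and d3 * d4 < 0


def count_crossings(edges, pos):
    """Count pairs of edges that cross, skipping pairs that share a node."""
    crossings = 0
    for (u1, v1), (u2, v2) in combinations(edges, 2):
        if {u1, v1} & {u2, v2}:
            continue  # edges meeting at a common node are not a crossing
        if segments_cross(pos[u1], pos[v1], pos[u2], pos[v2]):
            crossings += 1
    return crossings


# Toy example (hypothetical data): the same 4-cycle drawn two ways.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
square = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (0, 1)}
bowtie = {"a": (0, 0), "b": (1, 1), "c": (1, 0), "d": (0, 1)}

print(count_crossings(edges, square))  # 0
print(count_crossings(edges, bowtie))  # 1
```

A score like this makes it possible to compare two layouts of the same graph with a number instead of by eye.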

Algorithms for graph drawing

As I mentioned in the previous section, graph drawing is far from trivial; it is actually a hard problem in visualization and graph theory. I don’t want to delve into the details of these algorithms because that is beyond the scope of this post. You can find out more about them here.

When you want to manipulate graphs in Python you have a lot of options. Here is a list of the most widely used graph libraries for Python:

1- networkx

2- igraph

3- graph-tool

4- python-graph

Each library has its own benefits and drawbacks. Some, like graph-tool, are built on a C++ core and are faster; others, like networkx, are easier to use and more Pythonic.

Almost all of the packages mentioned above implement several layouts. The most important ones are:

  1. random layout: As I discussed earlier, a random layout is the worst way to draw a meaningful graph.
  2. circular layout: This layout has fewer crossings but shows no clustering at all.
  3. spectral layout: Positions nodes using the eigenvectors of the graph Laplacian. Applying this algorithm to the dataset gives:

Applying spectral layout algorithm to dataset

The graph represents the clusters correctly, but the distances inside clusters and between clusters don’t result in a good representation.

  4. spring layout: Positions nodes using the Fruchterman-Reingold force-directed algorithm. The result of the spring layout with tuned parameters looks like this:

Applying spring layout algorithm to dataset

The spring layout is much better than the previous layouts because it balances the distances between and within clusters, and it has fewer crossings than the circular layout. However, it still behaves like a circular layout within each cluster. The circular pattern is obvious in the bottom-right cluster.
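For reference, the four layouts discussed above can be computed in networkx roughly as follows. This is a minimal sketch: the built-in karate club graph stands in for the dataset used in this post, and the values of `k`, `iterations` and `seed` are illustrative assumptions, not the tuned parameters behind the figures.

```python
import networkx as nx

# Assumption: a small built-in graph with visible community structure
# stands in for the dataset used in this post.
G = nx.karate_club_graph()

layouts = {
    "random": nx.random_layout(G, seed=42),
    "circular": nx.circular_layout(G),
    "spectral": nx.spectral_layout(G),
    # k sets the preferred distance between nodes; iterations caps the
    # number of force-directed update steps.
    "spring": nx.spring_layout(G, k=0.3, iterations=100, seed=42),
}

for name, pos in layouts.items():
    x, y = pos[0]  # each layout maps node -> array([x, y])
    print(f"{name:>8}: node 0 at ({x:.3f}, {y:.3f})")
```

Every layout function returns the same kind of dictionary, so the plotting code does not need to know which algorithm produced the positions.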

Graphviz

To get high-quality representations of graphs, we need to move to packages that specialize in this task. One of the best open source tools for graph drawing is Graphviz. According to their website:

Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks. It has important applications in networking, bioinformatics, software engineering, database and web design, machine learning, and in visual interfaces for other technical domains.

Graphviz provides more sophisticated layout algorithms that work well on real graphs and produce more aesthetically pleasing drawings.

Graphviz is written in C but has a wrapper, pygraphviz, for use from Python. According to its website:

pygraphviz is a Python interface to the Graphviz graph layout and visualization package. With pygraphviz you can create, edit, read, write, and draw graphs using Python to access the Graphviz graph data structure and layout algorithms.

Here we use pygraphviz on Ubuntu 16.04 with Python 3.6.

First, we need to install Graphviz and its dependencies:

> sudo apt-get install graphviz libgraphviz-dev pkg-config

Then we install pygraphviz:

> pip install pygraphviz

Now you can test your installation by running the following in a Python session:

> import pygraphviz
> print(pygraphviz.__version__)

Now that pygraphviz is installed, you can use the main method of the code in this GitHub repo to visualize a graph with different methods. First, I load a graph in gpickle format. I used this format because it preserves all the properties of the graph, including the labels of edges and nodes. You can load any other format, but you need to pass a networkx graph to the function. There are four options for the layout: graphviz, spring, spectral and random (the default). The result of the graphviz layout can be seen below:
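A note on the gpickle format: it is Python's standard pickle applied to a networkx graph. The `nx.read_gpickle` and `nx.write_gpickle` helpers were removed in networkx 3.0, so with a recent networkx the same round trip can be done with the `pickle` module directly. The sketch below uses a hypothetical toy graph and file name to show that node and edge attributes survive.

```python
import os
import pickle
import tempfile

import networkx as nx

# Hypothetical toy graph with weights and labels on both nodes and edges.
G = nx.Graph()
G.add_node("a", weight=3.0, label="Node A")
G.add_node("b", weight=1.5, label="Node B")
G.add_edge("a", "b", weight=0.7, label="a-b")

# Hypothetical file name in a temporary directory; any path works.
path = os.path.join(tempfile.mkdtemp(), "graph.gpickle")

# Plays the role of the old nx.write_gpickle(G, path).
with open(path, "wb") as f:
    pickle.dump(G, f, pickle.HIGHEST_PROTOCOL)

# Plays the role of the old nx.read_gpickle(path).
with open(path, "rb") as f:
    loaded = pickle.load(f)

print(loaded.nodes["a"])       # node attributes survive the round trip
print(loaded.edges["a", "b"])  # and so do edge attributes
```

As with any pickle file, only load gpickle files from sources you trust, because unpickling can execute arbitrary code.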

Applying graphviz layout to dataset

As you can see, the graphviz layout preserves the community structure of the original graph in a more sensible way. Graphviz here uses the neato algorithm, a variant of the spring layout with a different implementation. This layout works best when you have a limited number of nodes (fewer than a few hundred).
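One way to get neato positions for a networkx graph is the `graphviz_layout` helper in `networkx.drawing.nx_agraph`, which calls Graphviz through pygraphviz. The sketch below is not the code from the repo: the karate club graph is a stand-in dataset, and the fallback to the spring layout when pygraphviz is missing is an assumption added so that the example runs everywhere.

```python
import networkx as nx
from networkx.drawing.nx_agraph import graphviz_layout


def neato_layout(G):
    """Return {node: (x, y)} from Graphviz neato, or a spring layout as fallback."""
    try:
        # Requires pygraphviz; networkx raises ImportError when it is missing.
        return graphviz_layout(G, prog="neato")
    except ImportError:
        # Fallback is an assumption of this sketch, not something the post does.
        return nx.spring_layout(G, seed=42)


G = nx.karate_club_graph()  # stand-in for the dataset used in this post
pos = neato_layout(G)
print(len(pos), "nodes positioned")
```

The `prog` argument selects the Graphviz layout engine, so other engines such as `dot` or `fdp` can be tried by changing that one string.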

As you can see, the current version of plotly doesn’t support edge weights. We still implemented those details in the code, in the hope that plotly will support them in the future.
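The limitation comes from the fact that a plotly scatter trace has a single line width for the whole trace. A possible workaround, sketched here under my own assumptions (the helper name, the width range and the toy data are all hypothetical), is to draw every edge as its own trace and scale its width by the edge weight. The traces are plain dictionaries, which plotly accepts in place of trace objects.

```python
def edge_traces(edges, pos, min_width=0.5, max_width=6.0):
    """Build one plotly-style line trace per edge, with width scaled by weight.

    edges: iterable of (u, v, weight); pos: {node: (x, y)}.
    """
    edges = list(edges)
    if not edges:
        return []
    weights = [w for _, _, w in edges]
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0  # avoid division by zero when all weights are equal

    traces = []
    for u, v, w in edges:
        # Linear map from [lo, hi] to [min_width, max_width].
        width = min_width + (w - lo) / span * (max_width - min_width)
        traces.append({
            "type": "scatter",
            "mode": "lines",
            "x": [pos[u][0], pos[v][0]],
            "y": [pos[u][1], pos[v][1]],
            "line": {"width": width, "color": "#888"},
            "hoverinfo": "none",
            "showlegend": False,
        })
    return traces


# Hypothetical toy data.
pos = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.5, 1.0)}
edges = [("a", "b", 1.0), ("b", "c", 3.0), ("a", "c", 5.0)]
traces = edge_traces(edges, pos)
print([t["line"]["width"] for t in traces])  # [0.5, 3.25, 6.0]
```

The resulting list can be passed as `data` to `plotly.graph_objects.Figure`. One trace per edge is fine for a few dozen nodes, but it is likely to get slow on large graphs.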

Another way to visualize the graph is to use the spring layout in three dimensions. I didn’t find a way to create 3D versions of graphs with pygraphviz. The results of using the networkx spring layout can be seen below.

For an interactive 3D or 2D plot, open the files from this repository in your own browser.
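A 3D spring layout only needs `dim=3` in `nx.spring_layout`. The sketch below, which uses the karate club graph as a stand-in dataset and an arbitrary seed, computes the positions and arranges them as dictionaries in the shape plotly expects for `scatter3d` traces. It is an illustration of the idea rather than the code from the repo.

```python
import networkx as nx

G = nx.karate_club_graph()  # stand-in for the dataset used in this post
pos = nx.spring_layout(G, dim=3, seed=42)  # node -> array([x, y, z])

# One trace draws all edges: consecutive point pairs separated by None.
edge_x, edge_y, edge_z = [], [], []
for u, v in G.edges():
    for axis, coords in enumerate((edge_x, edge_y, edge_z)):
        coords += [float(pos[u][axis]), float(pos[v][axis]), None]

edge_trace = {
    "type": "scatter3d",
    "mode": "lines",
    "x": edge_x, "y": edge_y, "z": edge_z,
    "line": {"width": 1, "color": "#888"},
    "hoverinfo": "none",
}

nodes = list(G.nodes())
node_trace = {
    "type": "scatter3d",
    "mode": "markers",
    "x": [float(pos[n][0]) for n in nodes],
    "y": [float(pos[n][1]) for n in nodes],
    "z": [float(pos[n][2]) for n in nodes],
    "text": [str(n) for n in nodes],  # shown on hover
    "marker": {"size": 5},
}

print(len(node_trace["x"]), "nodes,", G.number_of_edges(), "edges")
```

Passing `[edge_trace, node_trace]` as `data` to `plotly.graph_objects.Figure` and calling its `write_html` method produces a standalone file that can be rotated and zoomed in the browser.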


rohola zandie

I am a PhD student in NLP and dialog systems. I am curious about mathematics, machine learning, philosophy and languages.