Thursday, June 23, 2016

Note 5: Can visualization replace analysis?

What is the most central node in the graph on the left?
What is the most central node in the graph on the right?



I always show this figure in my lecture. Then I ask: "What is the most central node in the left graph? What is the most central node in the right graph?" For the first question, it is almost impossible to suppress the urge to shout out: "The one in the middle!" I think this is because we humans are used to putting the most important thing in the middle, where our (physical) focus is. We believe that if someone puts a thing in the middle, there must be a meaning to it. In a network visualization, however, this need not be the case. Looking closely at the two visualizations, you will find that they display the same graph, i.e., the connections are exactly the same. And in the right-hand visualization, it becomes obvious that every node is interchangeable with every other node (in graph-theoretical terms: all nodes are in the same automorphism class). All nodes thus have the same centrality in the network.
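The claim that interchangeable nodes share the same centrality can be checked numerically rather than by eye. The following is a minimal sketch, not the graph from the figure: it assumes the Petersen graph as a stand-in for a graph in which every node is interchangeable with every other, and the function names `petersen_graph` and `distance_sum` are made up for this illustration. It computes, for every node, the sum of shortest-path distances to all other nodes (the quantity behind closeness centrality) and the degree.

```python
from collections import deque


def petersen_graph():
    """Adjacency sets of the Petersen graph (assumed stand-in, not the figure's graph)."""
    adj = {v: set() for v in range(10)}
    for i in range(5):
        edges = [
            (i, (i + 1) % 5),          # outer 5-cycle
            (i, i + 5),                # spoke from outer to inner node
            (i + 5, (i + 2) % 5 + 5),  # inner pentagram
        ]
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def distance_sum(adj, source):
    """Sum of shortest-path distances from source to all other nodes, via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(dist.values())


adj = petersen_graph()
sums = {v: distance_sum(adj, v) for v in adj}
degrees = {v: len(adj[v]) for v in adj}

for v in sorted(adj):
    print(v, "degree:", degrees[v], "distance sum:", sums[v])
```

On this stand-in graph every node has the same degree and the same distance sum, so no node is more central than any other, regardless of which one a particular drawing happens to place in the middle.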

Of course, most layout algorithms try to place nodes in the middle of their neighbors. A node that is at the center of the network (in the sense of closeness, i.e., the smallest total distance to all other nodes) may therefore also end up in the center of the visualization. However, layouts also follow other aesthetic considerations, such as the overall aspect ratio of the resulting figure or the minimization of edge crossings.
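To illustrate what "placing nodes in the middle of their neighbors" can mean, here is a minimal sketch of a barycentric placement: a few nodes are pinned to fixed positions and every other node is repeatedly moved to the average position of its neighbors. This is only one simple scheme chosen for illustration; it is not claimed to be what Gephi, yEd, or any other tool implements, and the function name `barycentric_layout`, the path graph, and the pinned coordinates are all assumptions.

```python
def barycentric_layout(adj, pinned, iterations=200):
    """Move every unpinned node to the mean position of its neighbors, repeatedly."""
    pos = {v: pinned.get(v, (0.0, 0.0)) for v in adj}
    for _ in range(iterations):
        new_pos = dict(pos)
        for v in adj:
            if v in pinned:
                continue  # pinned nodes keep their fixed position
            xs = [pos[w][0] for w in adj[v]]
            ys = [pos[w][1] for w in adj[v]]
            new_pos[v] = (sum(xs) / len(xs), sum(ys) / len(ys))
        pos = new_pos
    return pos


# Hypothetical example: a path 0 - 1 - 2 - 3 - 4 with both end nodes pinned.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
pinned = {0: (0.0, 0.0), 4: (4.0, 0.0)}
pos = barycentric_layout(path, pinned)

for v in sorted(pos):
    print(v, pos[v])
```

In this small example, node 2, which has the smallest total distance to all other nodes of the path, also ends up in the middle of the drawing. That agreement comes from the layout rule together with this particular graph; it is not a general guarantee.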

Looking at a visualization of a network is always a good idea. Being inspired by the visualization and forming a new hypothesis about the network's structure is absolutely helpful in the first stages of any network analytic project. However, the hypothesis needs to be tested with a quantitative method, not with another visualization.

"Note 5. A visualization of a network can be both revealing and deceiving. This is why Gephi, yEd, and other visualization tools are perfect for exploration and hypothesis building; it is also the reason why statistical software packages or self-tailored applications are needed to collect quantifiable evidence that a given hypothesis is true." (Zweig2016)

Reference:

(Zweig2016) Katharina A. Zweig: Network Analysis Literacy, ISBN 978-3-7091-0740-9, Springer Vienna, publication expected Dec 2016
