Saturday, June 4, 2016

Note 1: Trilemma of Complex Network Analysis

I started my doctoral studies over 13 years ago, and it seemed then that complex network analysis was the framework to use whenever you could define a meaningful relationship between a set of entities: proteins interacting with each other, airports connected by scheduled flights, people connected by pressing a 'retweet' button in their browser to express their opinion on someone else's tweet.

The main hypothesis and first Note in my upcoming book "Network Analysis Literacy" (Zweig2016) condenses my findings from these more than a dozen years:

Note 1. "To interpret the values of a distance-based measure, the way of calculating the distance must be matched to the process of interest. To interpret any walk-based measure, the set of walks used by the measure needs to be closely adapted to the process." (K.A. Zweig: Network Analysis Literacy, (c) by Springer Verlag, Heidelberg, to be published)

It refers to the so-called trilemma of complex network analysis, a term Isadory Dorn, Andreas Lindenblatt, and I developed in 2012 (Dorn2012). The trilemma summarizes the interdependencies between the raw data, the relationship of interest, the network process of interest, the research question, and the methods used for the analysis. The research question determines a network process that uses one or more relationships to exert indirect effects on the entities connected by them. Once such a relationship is represented as a complex network, all classic network analytic methods can be applied, at least in principle.
Stephen P. Borgatti was the first to show that centrality indices, one of the most classic and widely used sets of methods, have a built-in model of the network process they are associated with (Borgatti2005): they secretly determine the paths on which indirect effects are induced! Take your beloved betweenness centrality, defined as follows:

$$C_B(v) = \sum_{s \neq v \neq t} \frac{\delta_v(s,t)}{\delta(s,t)}$$

where $\delta_v(s,t)$ is the number of shortest paths between $s$ and $t$ that contain $v$, and $\delta(s,t)$ is the number of all shortest paths between $s$ and $t$. A separate blog entry will be dedicated to the implicit assumptions that the betweenness centrality makes; here it suffices to say that it assumes the following: all entities want to interact with each other with the same intensity (all pairs of $s$ and $t$ are treated equally), and all of them interact along shortest paths.
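
To make the two assumptions visible, here is a minimal Python sketch of the betweenness centrality as described above. It is an illustration under stated assumptions, not the implementation of any particular library: the graph is an unweighted adjacency dictionary, the function names `bfs_counts` and `betweenness` are made up for this post, and the sum runs over ordered pairs $(s,t)$, so every unordered pair in an undirected graph is counted twice (conventions differ on this point).

```python
from collections import deque
from itertools import permutations


def bfs_counts(graph, source):
    """Return hop distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}  # sigma[w] = number of shortest paths source -> w
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w not in dist:  # w is discovered for the first time
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(graph):
    """Sum delta_v(s,t) / delta(s,t) over all ordered pairs (s, t)."""
    info = {s: bfs_counts(graph, s) for s in graph}
    score = {v: 0.0 for v in graph}
    # Assumption 1: every pair (s, t) contributes with the same weight.
    for s, t in permutations(graph, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # t is unreachable from s
        for v in graph:
            if v == s or v == t or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            # Assumption 2: only shortest paths count. v lies on a
            # shortest s-t path iff d(s,v) + d(v,t) == d(s,t).
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score


# A path a - b - c - d: only the two inner nodes lie between other nodes.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(betweenness(path))  # {'a': 0.0, 'b': 4.0, 'c': 4.0, 'd': 0.0}
```

Both assumptions sit in plain sight in the loop: the pair loop adds every contribution unweighted, and the distance check discards every path that is not a shortest one. This brute-force version favours readability over speed and is only meant for small examples.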

If your network process of interest does not follow these two assumptions, the betweenness centrality might not be the best centrality index to answer your research question. 

This is just the first example of how network analysis literacy, i.e., knowing the implicit models behind your favourite network analytic measure and the relationship between research question, network process, and relationship, may help you to make well-grounded choices.

References:

(Borgatti2005) Borgatti, S. P.: "Centrality and Network Flow", Social Networks, 2005, 27, 55-71

(Dorn2012) Dorn, I.; Lindenblatt, A. & Zweig, K. A.: "The Trilemma of Network Analysis", Proceedings of the 2012 IEEE/ACM international conference on Advances in Social Network Analysis and Mining, Istanbul, 2012

(Zweig2016) Zweig, K. A.: "Network Analysis Literacy", Springer Vienna, ISBN 978-3-7091-0740-9, publication expected Dec 2016
