The hop plot is a visualization of the distribution of pairwise distances in a network. It shows the fraction of nodes one can reach (on average) within N steps in a network. Here’s an example for the US electrical power grid network:
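To make the definition concrete, here is a minimal sketch of how the values of a hop plot could be computed with a breadth-first search from every node. The function name `hop_plot`, the adjacency-dictionary input, and the choice to count each node as reaching itself in 0 steps are assumptions made for illustration; the toy path graph is made up and is not the power grid data.

```python
from collections import deque

def hop_plot(adjacency):
    """Return a list h where h[d] is the fraction of reachable ordered node
    pairs (each node paired with itself included) at distance <= d."""
    counts = {}  # distance -> number of (source, target) pairs at exactly that distance
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # plain breadth-first search from `source`
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for d in dist.values():
            counts[d] = counts.get(d, 0) + 1
    total = sum(counts.values())
    cumulative, running = [], 0
    for d in range(max(counts) + 1):
        running += counts.get(d, 0)
        cumulative.append(running / total)
    return cumulative

# Hypothetical toy network: the path 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(hop_plot(path))  # → [0.25, 0.625, 0.875, 1.0]
```

Running a search from every node is fine for small networks; for large ones, sampling source nodes would be a common shortcut.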
What kind of distribution do these hop plots follow? Maybe a normal distribution? A logistic distribution? We could do all kinds of statistical tests to find out, but I want to do something different: I want to answer that question visually. To do so, we’re going to draw the Y axis of the plot in different ways, one for each candidate distribution. Let me explain: if we apply the inverse cumulative distribution function (the quantile function) of a distribution to the values on the Y axis, the plot should show a straight line if the values follow that distribution. Since the values on the Y axis are between 0 and 1 and have the shape of a sigmoid function (see Wikipedia), we’re going to try out the various sigmoid functions that we know.
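The straight-line idea can be checked on synthetic data where we know the answer. In the sketch below, the helper `straightness` is a hypothetical name, and the correlation coefficient is only a crude numerical stand-in for looking at the plot. The Y values are generated from a normal distribution with made-up parameters, so the matching inverse CDF should straighten them out, while the raw values should not be straight.

```python
import math
from statistics import NormalDist

def straightness(xs, ps, inverse_cdf):
    """Pearson correlation between x and inverse_cdf(p).
    A value of 1.0 means the transformed points lie on a rising straight line."""
    ys = [inverse_cdf(p) for p in ps]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Synthetic sanity check (made-up parameters, not power grid data)
xs = list(range(1, 10))
ps = [NormalDist(mu=5, sigma=2).cdf(x) for x in xs]

r_normal = straightness(xs, ps, NormalDist().inv_cdf)  # matching transform
r_raw = straightness(xs, ps, lambda p: p)              # no transform at all
print(r_normal, r_raw)
```

Here `r_normal` comes out at 1 up to rounding, while `r_raw` stays below it because the untransformed values still have the sigmoid shape.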
(1) Normal distribution – Inverse error function
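The cumulative distribution function of the normal distribution can be written in terms of the error function, Φ(x) = (1 + erf(x/√2))/2, so its inverse is a rescaled inverse error function. Let’s try it: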
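As a rough sketch of this transformation, the snippet below uses only the Python standard library: `math.erf` for the CDF and `statistics.NormalDist().inv_cdf` for its inverse. The names `normal_cdf` and `probit` and the toy values are assumptions for illustration. One practical detail: a hop plot ends at exactly 1, where the inverse is infinite, so that point has to be left out.

```python
import math
from statistics import NormalDist

def normal_cdf(x):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def probit(p):
    """Inverse of normal_cdf, only defined for 0 < p < 1."""
    return NormalDist().inv_cdf(p)

# Hypothetical hop plot values (not the power grid data)
toy_hop_plot = [0.25, 0.625, 0.875, 1.0]

# Keep only points strictly between 0 and 1; the final value 1.0 maps to infinity
points = [(d, probit(p)) for d, p in enumerate(toy_hop_plot) if 0.0 < p < 1.0]
for d, y in points:
    print(d, round(y, 3))
```

Plotting `points` with the distance on the X axis gives the rescaled plot; a straight line would suggest normally distributed distances.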
(2) Logistic distribution
The cumulative distribution function of the logistic distribution is the logistic function:
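For the logistic distribution, the inverse of the CDF is the logit function, so a sketch of the rescaling is very short. The function names and the toy values are assumptions for illustration only.

```python
import math

def logistic(x):
    """Standard logistic function, the CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Inverse of the logistic function, only defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical hop plot values (not the power grid data)
toy_hop_plot = [0.25, 0.625, 0.875, 1.0]

# Drop values equal to 0 or 1, where the logit is infinite
points = [(d, logit(p)) for d, p in enumerate(toy_hop_plot) if 0.0 < p < 1.0]
print(points)
```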
(3) The tangent sigmoid
We’ll use the arctangent function as our (scaled) cumulative distribution function:
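One way to read "scaled" here is to shift and rescale the arctangent so that it runs from 0 to 1, which gives F(x) = 1/2 + arctan(x)/π (this happens to be the CDF of the standard Cauchy distribution). That scaling is my assumption about what is meant; with it, the inverse is a tangent, as sketched below with made-up values.

```python
import math

def arctan_cdf(x):
    """Arctangent rescaled to the range (0, 1)."""
    return 0.5 + math.atan(x) / math.pi

def arctan_quantile(p):
    """Inverse of arctan_cdf, only meaningful for 0 < p < 1."""
    return math.tan(math.pi * (p - 0.5))

# Hypothetical hop plot values (not the power grid data)
toy_hop_plot = [0.25, 0.625, 0.875, 1.0]

# Skip the endpoints 0 and 1, where the tangent blows up
points = [(d, arctan_quantile(p)) for d, p in enumerate(toy_hop_plot) if 0.0 < p < 1.0]
print(points)
```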
(4) The hyperbolic tangent
We’ll use the hyperbolic tangent as our (scaled) cumulative distribution function:
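Assuming the scaling F(x) = (1 + tanh(x))/2, the inverse is artanh(2p − 1). A small sketch with made-up values is below. It also checks a relationship worth keeping in mind: since tanh(x) = 2·logistic(2x) − 1, this transformation is exactly half the logit, so the resulting plot should look like the logistic one with a rescaled Y axis.

```python
import math

def tanh_cdf(x):
    """Hyperbolic tangent rescaled to the range (0, 1)."""
    return 0.5 * (1.0 + math.tanh(x))

def tanh_quantile(p):
    """Inverse of tanh_cdf, only defined for 0 < p < 1."""
    return math.atanh(2.0 * p - 1.0)

def logit(p):
    """Inverse of the logistic function, for comparison."""
    return math.log(p / (1.0 - p))

# Hypothetical hop plot values (not the power grid data)
toy_hop_plot = [0.25, 0.625, 0.875, 1.0]

for d, p in enumerate(toy_hop_plot):
    if 0.0 < p < 1.0:  # the endpoints map to infinity
        print(d, tanh_quantile(p), 0.5 * logit(p))  # the two columns agree
```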
In conclusion, none of the four transformations gives a straight line. The distances in the US power grid therefore do not appear to follow any of the simple sigmoid models tried here.
Two notes are in order though:
- A plot is not a statistical test. You should do an actual statistical test when you write a paper!
- Does anyone know what the usual network models (Erdős–Rényi, Barabási–Albert, Watts–Strogatz, etc.) result in?