Reducing redundant information to find simplifying patterns in data sets and complex networks is a scientific challenge in many knowledge fields. Moreover, detecting the dimensionality of the data is still a hard-to-solve problem. An article published in the journal Nature Communications presents a method to infer the dimensionality of complex networks through the application of hyperbolic geometrics, which capture the complexity of relational structures of the real world in many diverse domains.
Among the authors of the study are the researchers M. Ángeles Serrano and Marián Boguñá, from the Faculty of Physics and the Institute of Complex Systems of the UB (UBICS), and Pedro Almargo, from the Higher Technical School of Engineering of the University of Seville. The research study provides a multidimensional hyperbolic model of complex networks that reproduces its connectivity, with an ultra-low and customizable dimensionality for each specific network. This enables a better characterization of its structure — eg at a community scale — and the improvement of its predictive capability.
The study reveals unexpected regularities, such as the extremely low dimensions of molecular networks associated with biological tissues; the slightly higher dimensionality required by social networks and the Internet; and the discovery that brain connects are close to three dimensions in their automatic organization.
hyperbolic versus Euclidean geometry
The intrinsic geometry of data sets or complex networks is not obvious, which becomes an obstacle in determining the dimensionality of real networks. Another challenge is that the definition of distance has to be established according to their relational and connectivity structure, and this also requires sophisticated models.
Now, the new approach is based on the geometry of complex networks, and more specifically, on the configurational geometric model or SD model. “This model, which we have developed in previous work, describes the structure of complex networks based on fundamental principles,” says the lecturer M. Ángeles, ICREA researcher at the Department of Condensed Matter Physics of the UB.
“More specifically — he continues — , the model postulates a law of interconnection of the network elements (or nodes) that is gravitational, so nodes that are closer in a similarity space — of spherical geometry in D dimensions — and with more popularity — an extra dimension corresponding to the importance of the node — are more likely to establish connections.”
In the study, the similarity and popularity variables are combined to give rise to the hyperbolic geometry of the model, which emerges as the natural geometry representing the hierarchical architecture of complex networks.
In previous studies, the team had applied the simplest version of the one-dimensional SD model — the S1 model — to explain many typical features of real-world networks: the small-world property (the six degrees of separation), the heterogeneous distributions of the number of neighbors per node, and the high levels of transitive relationships (triangle connections that can be illustrated with the expression my friend’s friend is also my friend).
“In addition, the application of statistical inference techniques allows us to obtain real network maps in the hyperbolic plan that are congruent with the established model,” she says. “Beyond visualization, these representations have been used in a multitude of tasks, including efficient navigation methods, the detection of self-similarity patterns, the detection of strongly interacting communities of nodes, and the implementation of a network renormalization procedure that reveals hidden symmetries in the multi-scale organization of complex networks and allows the production of network replicas at reduced or enlarged scales.”
Now, the team infers the dimensionality of the hyperbolic space underlying the real networks from properties that relate to the dimension of their geometry. In particular, the work measures the statistics of higher-order cycles (triangles, squares, pentagons) associated with the connections.
A methodology applicable to all complex networks
In computer science, the applied techniques are based on data that typically make definitions of similarity distance between their elements, an approach that involves the construction of graphs that are mapped onto a latent space of Euclidean features.
“Our estimates of the dimensionality of complex networks are well below our estimates based on Euclidean space, since hyperbolic space is better suited to represent the hierarchical structure of real complex networks. For example, the Internet only requires D = 7 dimensions to be mapped into the hyperbolic space of our model, whereas this name is multiplied by six and scales to D = 47 in one of the most recent techniques using Euclidean space,” says Professor Marián Boguñá.
In addition, techniques for mapping complex data usually assume a latent space, with a predetermined name of dimensions, or implement heuristic techniques to find a suitable value. Thus, the new method is based on a model that does not need the spatial mapping of the network to determine the dimension of its geometry.
In the field of network science, many methodologies use the shortest distances to study the connectivity structure of the network (shortest paths) as a metric space. However, these distances are strongly affected by the small-world property and do not provide a wide range of distance values.
“Our model uses a completely different definition of distance based on an underlying hyperbolic space, and we do not need to map the network. Our methodology is applicable to any real network or data series with complex structure and with a size that is typically thousands or tens of thousands of nodes but can reach hundreds of thousands in a reasonable computational time,” says M. Ángeles Serrano.
What is the real dimensionality of social networks and the Internet?
Social networks and the Internet is higher (between 6 and 9) compared to networks in other domains, according to the study’s findings. However, it is still very low — 6 to 7 times lower — compared to that obtained by other methods. This reflects the fact that interactions in these systems are more complex and determined by a greater variety of factors.
On the other hand, friendship-based social networks are at the top of the dimensionality ranking. “This is an unexpected result, since one might think that friendship is a freer type of affective relationship, but our results link to the fact that homophily in human interactions is determined by a multitude of sociological factors such as age, gender, social class, beliefs, attitudes or interests,” says M. Ángeles Serrano.
In the case of the Internet, even though it is a technological network, its greater dimensionality reflects the fact that for an autonomous system, connecting does not mean only accessing the system, as one might think at first. On the contrary, many different factors influence the formation of these connections, and as a consequence, a variety of other relationships may be present (eg, supplier-client, peer-to-peer, exchange-based peering, etc.).
“What is really surprising, both for social networks and the internet, is that our theoretical framework — which does not use any annotations about connections beyond their existence — is able to capture this multidimensional reality that is not explicit in our data,” concludes the team, which is currently working on constructing hyperbolic multidimensional maps of complex networks that are congruent with the theoretical framework established by the SD model.