structify_net package

Submodules

structify_net.scoring module

structify_net.scoring.average_clustering(graph)

Average clustering coefficient of the graph

Parameters:: graph (nx.Graph) – A graph
Returns:: Average clustering coefficient, a number between 0 and 1
Return type:: float

structify_net.scoring.average_shortest_path_length(graph, normalized=True)

The average shortest path length of the graph

If the graph has a giant component, the average shortest path length is computed on the giant component. If normalized is true, compute a normalized version, defined between 0 and 1, where 1 is the shortest possible path length and 0 is the longest possible path length. It is computed as 1/(normalized_shortest+1), where normalized_shortest is the normalized shortest path length, defined as max(0,(graph_shortest-2)).

Parameters:

graph (_type_) – _description_
normalized (bool, optional) – _description_. Defaults to True.

Returns:

_description_

Return type:

_type_

structify_net.scoring.boundaries(graph, normalized=True)

Boundaries of the graph

The boundaries are computed as the average of the density of the external edges of each community, and the density of the internal edges of each community. If normalized=True, it is normalized to be between 0 and 1.

Parameters:

graph (_type_) – A graph
normalized (bool, optional) – If True, the boundaries are normalized to be between 0 and 1. Defaults to True.

Returns:

_description_

Return type:

_type_

structify_net.scoring.compare_graphs(df_reference, df_graphs, best_by_name=False, score_difference=False)

Compares a list of graphs to a reference graph

Returns a dataframe with the scores of the graphs and the difference to the reference graph

Parameters:

df_reference (_type_) – The scores of the reference graph as a dataframe
df_graphs (_type_) – The scores of the graphs to compare as a dataframe
best_by_name (bool, optional) – Returns the best graph for each name. Defaults to False.
score_difference (bool, optional) – Returns the difference to the reference graph instead of the scores. Defaults to False.

Returns:

A dataframe with the scores of the graphs and the difference to the reference graph

Return type:

_type_

structify_net.scoring.compute_all_scores(graph)

Compute all scores for a graph

Parameters:: graph (_type_) – A graph
Returns:: A dictionary with the scores
Return type:: pd.DataFrame

structify_net.scoring.coreness(graph, normalized=True)

Coreness of the graph

Parameters:

graph (_type_) – A graph
normalized (bool, optional) – IF True, the coreness is normalized to be between 0 and 1. Defaults to True.

Returns:

_description_

Return type:

_type_

structify_net.scoring.degree_assortativity(graph, normalized=False, positive_only=True, disassortativity=False)

Degree assortativity coefficient of the graph

The degree assortativity coefficient is computed as the Pearson correlation coefficient between the degrees of the nodes. If normalized=True, it is normalized to be between 0 and 1. If positive_only=True, it is set to 0 if negative. If disassortativity=True, it is set to -1 if positive and 1 if negative.

Parameters:

graph (_type_) – _description_
normalized (bool, optional) – _description_. Defaults to False.
positive_only (bool, optional) – _description_. Defaults to True.
disassortativity (bool, optional) – _description_. Defaults to False.

Returns:

_description_

Return type:

_type_

structify_net.scoring.degree_heterogeneity(graph)

Compute the degree heterogeneity of the graph

Parameters:: graph (nx.Graph) – A graph
Returns:: The degree heterogeneity of the graph
Return type:: float

structify_net.scoring.get_default_scores(with_size=False, latex_names=True)

Returns the default scores

Returns a dictionary of scores such as {name:score}

Parameters:

with_size (bool, optional) – If True, the size of the graph is added to the scores. Defaults to False.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to False.

Returns:

A dictionary of scores such as {name:score}

Return type:

_type_

structify_net.scoring.giant_component_ratio(graph)

Ratio of the largest component to the total number of nodes

Parameters:: graph (nx.Graph) – A graph
Returns:: _description_
Return type:: _type_

structify_net.scoring.has_giant_component(graph, threshold=0.9)

Check if the graph has a giant component

Returns True if the ratio of the largest component to the total number of nodes is above the threshold

Parameters:

graph (nx.Graph) – A graph
threshold (float, optional) – The threshold. Defaults to 0.9.

Returns:

True if the graph has a giant component

Return type:

_type_

structify_net.scoring.hierarchy(graph, normalized=False, positive_only=True)

Hierarchy of the graph

The hierarchy is computed as the Spearman correlation coefficient between the degrees and the clustering coefficients of the nodes. If normalized=True, it is normalized to be between 0 and 1. If positive_only=True, it is set to 0 if negative.

Parameters:

graph (_type_) – A graph
normalized (bool, optional) – If True, the hierarchy is normalized to be between 0 and 1. Defaults to False.
positive_only (bool, optional) – IF True, the hierarchy is set to 0 if negative. Defaults to True.

Returns:

hierarchy of the graph

Return type:

_type_

structify_net.scoring.is_degree_heterogeneous(graph)

Returns True if the graph is degree heterogeneous, False otherwise

A graph is considered degree heterogeneous if its degree heterogeneity is greater than the average between the degree heterogeneity of a random graph with the same number of nodes and edges, and a power law graph with the same number of nodes and edges.

structify_net.scoring.modularity(graph, normalized=True)

Returns the modularity of the graph

If normalized=True, it is computed as the difference between the modularity of the partition of highest modularity found by Louvain algorithm on this graph and the modularity of a random graph with the same number of nodes and edges, divided by the difference between the modularity of the random graph and the maximum possible modularity (1). If normalized is false, it simply returns the modularity of the graph.

Parameters:

graph (_type_) – a graph
normalized (bool, optional) – if True, returns a normalized modularity. Defaults to True.

Returns:

The modularity of the graph

Return type:

_type_

structify_net.scoring.robustness(graph)

Robustness of the graph

Robustness is defined as the percentage of nodes that need to be removed to disconnect the graph. It is computed by removing nodes from the graph, starting from the most connected nodes, until the graph is disconnected. The percentage of nodes removed is then returned.

Parameters:: graph (_type_) – a graph
Returns:: robustness of the graph
Return type:: _type_

structify_net.scoring.scores_for_generators(generators, scores=None, runs=1, details=False, latex_names=True)

Scores for a list of generators

Parameters:

generators (_type_) – A dictionary of generators such as {name:generator}
scores (_type_, optional) – a dictionary of scores such as {name:score}. Defaults to None.
runs (int, optional) – Number of runs. Defaults to 1.
details (bool, optional) – If True, the results of each run are returned. Defaults to False.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to True.

Returns:

_description_

Return type:

_type_

structify_net.scoring.scores_for_graphs(graphs, scores=None, latex_names=True)

Compute scores for a list of graphs

Parameters:

graphs (_type_) – A dictionary of graphs such as {name:graph}
scores (_type_, optional) – A dictionary of scores such as {name:score}. Defaults to None.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to True.

structify_net.scoring.scores_for_rank_functions(rank_functions, n, m, scores=None, epsilons=0, runs=1, latex_names=True)

Scores for a list of rank functions

Parameters:

rank_functions (_type_) – A dictionary of rank functions such as {name:rank_function}
n (_type_) – number of nodes
m (_type_) – number of edges
scores (_type_, optional) – A dictionary of scores such as {name:score}. Defaults to None.
epsilons (int, optional) – A list of epsilons. Defaults to 0.
runs (int, optional) – number of runs. Defaults to 1.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to True.

Returns:

dataframe with the scores

Return type:

_type_

structify_net.scoring.scores_for_rank_models(rank_models, m, scores=None, epsilons=0, runs=1, details=False, latex_names=True)

Scores for a list of rank models

Parameters:

rank_models (_type_) – A dictionary of rank models such as {name:rank_model}
m (_type_) – Number of edges
scores (_type_, optional) – A dictionary of scores such as {name:score}. Defaults to None.
epsilons (int, optional) – A list of epsilons. Defaults to 0.
runs (int, optional) – number of runs. Defaults to 1.
details (bool, optional) – If True, the results of each run are returned. Defaults to False.
latex_names (bool, optional) – IF True, the names of the scores are latex formulas. Defaults to True.

Returns:

dataframe with the scores

Return type:

_type_

structify_net.scoring.transitivity(graph)

Transitivity of the graph

Parameters:: graph (nx.Graph) – A graph

structify_net.structureClasses module

class structify_net.structureClasses.Graph_generator(rank_model, probas)

Bases: object

A graph generator

This class instantiate graph generators. It is composed of a Rank Model Class, and a list of probabilites, such as the index of the probability in the list corresponds to the index of the edge in the rank model.

Example of usage:

n=10
my_model = Rank_model(range(n),lambda u,v: u+v)
my_generator = my_model.get_generator(epsilon=0 ,m=30)
my_generator.plot_proba_function()
my_generator.plot_matrix()
g = my_generator.generate()
my_generator.scores(scores=[scoring.average_clustering,scoring.clustering],run=2)

rank_model

A rank model

Type:: _type_

sortedPairs

the node pairs, sorted from the most likely to the least likely to be connected, same as in the rank model

Type:: _type_

probas

the probabilities of observing an edge, one for each edge, sorted in the same probability as the sortedPairs

Type:: _type_

ER(p)

Generate an Erdos-Renyi graph

Parameters:

n (int) – number of nodes
p (float) – probability of an edge

Returns:

a graph

Return type:

nx.Graph

generate()

Generate a graph based on this generator

Returns:: a graph
Return type:: nx.Graph

plot_matrix(nodeOrder=None, ax=None, **kwargs)

Plot a matrix of the graph generator

The color of the cell is the probability to observe an edge between the two nodes.

Parameters:

nodeOrder (_type_, optional) – the order in which the nodes should be plotted. Defaults to None.
ax (_type_, optional) – an axis to plot on. Defaults to None.

plot_proba_function(cumulative=False, ax=None)

Plot the probability function This function plots the probability function of the graph generator. It is a plot of the probability of an edge to exist as a function of the rank of the edge in the rank model.

Parameters:: ax (_type_, optional) – a matplotlib axis. Defaults to None.
Returns:: a matplotlib plot
Return type:: _type_

scores(scores=None, runs=1, details=False, latex_names=True)

return a score dataframe for this generator

Parameters:

scores (, optional) – A list of scores to compute. Defaults to None.
runs (int, optional) – nombre de runs pour les scores. Defaults to 1.
details (bool, optional) – if True, return a dataframe with the details for each run, else return a dataframe with the mean. Defaults to False.
latex_names (bool, optional) – if True, use latex names for the scores. Defaults to True.

Returns:

a dataframe with the scores

Return type:

pd.DataFrame

class structify_net.structureClasses.Rank_model(nodes, sort_function, sort_descendent=False, node_order_function=None)

Bases: object

A class to represent a rank model

A rank model is composed of a list of node pairs, ordered such as the first ones are the most likely to be connected. It can also contain node properties, and a node order, which is the order in which the nodes should be plotted in a plot matrix. It can return a graph generator(provided a proper probability function), which can be used to generate graphs based on this rank model.

Node properties are useful to match the edge order with the structure, it can be used to store the spatial position of nodes or the community membership of nodes, for example.

Example of usage:

n=10
my_model = Rank_model(range(n),lambda u,v: u+v)
my_model.plot_matrix()
my_generator = my_model.get_generator(epsilon=0 ,m=30)
my_model.generate_graph(epsilon=0,m=30)

sortedPairs

A list of node pairs, ordered such as the first ones are the most likely to be connected

Type:: _type_

node_properties

node properties provided as networkx graph or dictionary. Defaults to None.

Type:: _type_, optional

node_order

The order in which the nodes should be plotted in a plot matrix. Defaults to None.

Type:: _type_, optional

generate_graph(epsilon, density=None, m=None)

Generate a graph from this rank model

This function generates a graph from this rank model and a probability function. The probability function is defined by the epsilon parameter and the density parameter. One of the two parameters density or m should be provided. If both are provided, m will be used.

Parameters:

epsilon (_type_) – the epsilon parameter of the probability function
density (_type_, optional) – the density parameter of the probability function. Defaults to None.
m (_type_, optional) – the number of edges in the graph. Defaults to None.

Returns:

_description_

Return type:

_type_

get_generator(epsilon, density=None, m=None)

Return a graph generator for this rank model

One of the two parameters density or m should be provided. If both are provided, m will be used.

Parameters:

epsilon (_type_) – the epsilon parameter of the probability function
density (_type_, optional) – the density parameter of the probability function. Defaults to None.
m (_type_, optional) – the number of edges in the graph. Defaults to None.

Returns:

_description_

Return type:

_type_

plot_matrix(nodeOrder=None, ax=None, **kwargs)

Plot a matrix of the rank model

Node pairs are ordered by rank. The color of the cell is the rank of the edge.

Parameters:

nodeOrder (_type_, optional) – the order in which the nodes should be plotted. Defaults to None.
ax (_type_, optional) – an axis to plot on. Defaults to None.

scores(m, scores=None, epsilons=0, runs=1, details=False, latex_names=True)

this function computes scores for a rank model

It computes scores for a rank model, for a given number of edges m. The scores are computed for a range of epsilon values, and for a given number of runs.

Parameters:

m (_type_) – number of edges in the graph
scores (_type_, optional) – the scores to compute. A dictionary with the score name as key and the score function as value. Defaults to None.
epsilons (int, optional) – either a single value or a list of values for epsilon. Defaults to 0.
runs (int, optional) – number of runs for each epsilon. Defaults to 1.
details (bool, optional) – If True, return a dataframe with the details for each run, else return a dataframe with the mean. Defaults to False.
latex_names (bool, optional) – Whether to use latex names for the scores. Defaults to True.

Returns:

_description_

Return type:

_type_

structify_net.transform module

structify_net.viz module

structify_net.viz.plot_adjacency_matrix(g, nodeOrder=None)

Plot a matrix of the graph

Parameters: g: a networkx graph nodeOrder: a list of nodes, ordered by the order in which they should appear in the matrix

structify_net.viz.spider_plot(df, categories=None, reference=None)

Plot a spider plot for each row of df

This plot is used to compare the performance of different models on different scoring functions.

Parameters:

df (_type_) – The dataframe to plot. Each row is a spider plot. The first column is the name of the spider plot. The other columns are the scoring functions to plot.
categories (_type_, optional) – A list of the scoring functions to plot. Defaults to None, in which case all the columns are plotted.
reference (_type_, optional) – Which line to use as a reference. Defaults to None, in which case no reference is plotted

structify_net.zoo module

structify_net.zoo.get_all_rank_models(n, m)

Returns a dictionary of all rank models

Parameters:

n (_type_) – number of nodes
m (_type_) – number of edges. Only used for models with a parameter m

Returns:

rank_model}

Return type:

Dictionary of rank models {name

structify_net.zoo.sort_ER(nodes)

Erdos-Renyi rank model

Returns a rank model based on the Erdos-Renyi model. The Erdos-Renyi model is a random graph model where each pair of nodes is connected with probability p. For a rank model, all pairs of nodes have the same probability.

Parameters:: nodes (_type_) – describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
Returns:: structify_net.Rank_model:: The corresponding rank model

structify_net.zoo.sort_blocks_assortative(nodes, blocks=None)

Rank model based on assortative blocks

This rank model is based on assortative blocks. The blocks are defined by the blocks argument. It can be either a list of lists, where each list is a block, or an integer, in which case the nodes are randomly assigned to blocks.

Edge pairs inside blocks are ordered before edge pairs outside blocks.

Parameters:

nodes (_type_) – Describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
blocks (_type_, optional) – Blocks definition. Can be either a list of lists, where each list is a block, or an integer, in which case the nodes are randomly assigned to the corresponding number of equal size blocks. Defaults to None.

Returns:

The corresponding rank model

Return type:

structify_net.Rank_model

structify_net.zoo.sort_core_distance(nodes, dimensions=1, distance='euclidean')

Rank model based on the sum of the distance of their nodes to the center of the space.

Core periphery structure is another well-known type of organization for complex systems. This organization is often modeled using blocks, one block being the dense core, another block, internally sparse, represent the periphery, and the density between the two blocks is set at an intermediate value. To illustrate the flexibility of the Rank approach, we propose a soft-core alternative, the coreness dissolving progressively into a periphery. To do so, we consider nodes embedded into a space, as for the spatial structure –random 1d positions by default. The node-pair rank score is computed as the inverse of the product of 3 distances: the distances from both nodes to the center, and the distance between the two nodes. As a consequence, when two nodes belong to the center, they are very likely to be connected; two nodes far from the center are unlikely to be connected, unless if they are extremely close from each other.

\[[ R'(u,v)=d(W_u,W_v)d(W_u,\mathbf{0})d(W_v,\mathbf{0}) ]\]

With $\mathbf{0}$ the vector corresponding to the center of the location considered as the core of the space.

Parameters:

nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
dimensions (int, optional) – Number of dimensions. Defaults to 1.
distance (_type_, optional) – Distance function. Defaults to euclidean.

Returns:

structify_net.Rank_model:: The rank model

structify_net.zoo.sort_distances(nodes, dimensions=1, distance='euclidean')

Rank model based on the distance between nodes

Also called spatial or geometric network, this rank model is based on the distance between nodes. The distance is defined by the distance function, and the dimensions are defined by the attributes of the nodes. By default, a single dimension is used, and the distance is the euclidean distance.

Parameters:

nodes (_type_) – describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
dimensions (int, optional) – number of dimensions in which to embed the graph. Defaults to 1.
distance (_type_, optional) – distance function. Defaults to euclidean.

Returns:

structify_net.Rank_model:: The corresponding rank model

structify_net.zoo.sort_fractal_hierarchical(nodes, d=3)

Rank model based on a fractal structure

This structure is designed to maximize the hierarchical structure of the network. The network is embedded in a binary tree. The order of the pairs is based on two factors: the distance between nodes in the tree at a same hierarchical level and the distance between the hierarchical levels.

Parameters:

nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – degree of the binary tree. Defaults to 3. 3 allows to have many triangles.

Returns:

structify_net.Rank_model:: The rank model

structify_net.zoo.sort_fractal_leaves(nodes, d=2)

A rank model based on a fractal structure

The order of the pairs is based on the distance between the leaves of a binary tree. The number of leaves is the number of nodes in the network. The distance between two leaves is the number of edges between them in the binary tree.

Parameters:

nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – Degree of the binary tree. Defaults to 2.

Returns:

structify_net.Rank_model:: A rank model

structify_net.zoo.sort_fractal_root(nodes, d=2)

A rank model based on a fractal structure

The network is embedded in a binary tree. The order of the pairs is based on the distance between the nodes in the tree. Nodes are assigned starting from the root of the tree.

Parameters:

nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – degree of the binary tree. Defaults to 2.

Returns:

structify_net.Rank_model:: The rank model

structify_net.zoo.sort_fractal_star(nodes, d=2)

A rank model based on a fractal structure

This structure is designed to maximize the star structure of the network. The network is embedded in a binary tree. The order of the pairs is based on the distance between the hierarchical levels.

Parameters:

nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – degree of the binary tree. Defaults to 2.

Returns:

structify_net.Rank_model:: rank model

structify_net.zoo.sort_largest_disconnected_cliques(nodes, m)

A rank model based on the largest disconnected cliques

Computes the largest possible number of cliques of size k such that the number of edges is less than m. Then, the rank model is the same as for assortative blocks.

Parameters:

nodes (_type_) – Nodes of the graph, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
m (_type_) – number of edges. This is required to compute the largest possible number of cliques.

Returns:

structify_net.Rank_model:: _description_

structify_net.zoo.sort_nestedness(nodes)

Rank model based on nestedness

A nestedness network is a particular type of network where the nodes are organized in a hierarchy. Top nodes are connected to bottom nodes. Bottom nodes are connected to other bottom nodes. The order of the pairs is based on the distance between the nodes in the hierarchy.

Parameters:: nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
Returns:: structify_net.Rank_model:: The rank model

structify_net.zoo.sort_overlap_communities(nodes, blocks=None)

Rank model based on overlapping communities

This rank model is based on overlapping communities. The communities are defined by the blocks argument. It can be either a list of lists, where each list is a community, or an integer corresponding to the number of communities. In the latter case, each node belongs to two communities, and the affiliations are chosen such as each community has half of its nodes shared with another community c1 and the other half shared with another community c2.

Parameters:

nodes (_type_) – Describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
blocks (_type_, optional) – Describe communities. Can be either a list of lists, where each list is a community, or an integer. Defaults to None.

Returns:

structify_net.Rank_model:: A rank model

structify_net.zoo.sort_perlin_noise(nodes=10, octaves=None)

Rank model based on Perlin noise

Perline noise is a type of gradiant noise frequently used in computer graphics to create images with a realistic feel, such as textures and landscapes. we use it to generate an adjacency matrix, from the upper-triangle of a 2d image having as many pixels as there are nodes in the graph. The $R’$ rank score is the black intensity of the pixel. In practice, Perlin noise tends to create continuous shapes of lower and higher values. As with SBM, this generator tends to create stronger relations between some groups of nodes with some other groups of nodes, although the groups are fuzzy, and not necessarily assortative. Perlin noise has a parameter, called octaves, allowing to add smaller scale structures on top of each other.

Parameters:

nodes (int, optional) – _description_. Defaults to 10.
octaves (_type_, optional) – Octave parameter of the Perlin noise. The higher the value, the finer the structure. Defaults to None, meaning octaves = int(ln(n))

Returns:

structify_net.Rank_model:: The corresponding rank model

structify_net.zoo.sort_spatial_WS(nodes, k=10)

Rank model based on a spatial Watts-Strogatz model

This rank model reproduce the original Watts-Strogatz model. Each node is connected to its k nearest neighbors in a ring topology.

Parameters:

nodes (_type_) – A networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
k (int, optional) – Number of nearest neighbors. Defaults to 10.

Returns:

structify_net.Rank_model:: The rank model

structify_net.zoo.sort_stars(nodes)

A rank model based on a star structure

This rank model sorts pairs of nodes such as all pairs of nodes of node n1 are first, then all pairs of nodes of node n2, etc.

Parameters:: nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
Returns:: structify_net.Rank_model:: The rank model