structify_net package
Submodules
structify_net.scoring module
- structify_net.scoring.average_clustering(graph)
Average clustering coefficient of the graph
- Parameters:
graph (nx.Graph) – A graph
- Returns:
Average clustering coefficient, a number between 0 and 1
- Return type:
float
- structify_net.scoring.average_shortest_path_length(graph, normalized=True)
The average shortest path length of the graph
If the graph has a giant component, the average shortest path length is computed on the giant component. If normalized is true, compute a normalized version, defined between 0 and 1, where 1 is the shortest possible path length and 0 is the longest possible path length. It is computed as 1/(normalized_shortest+1), where normalized_shortest is the normalized shortest path length, defined as max(0,(graph_shortest-2)).
- Parameters:
graph (_type_) – _description_
normalized (bool, optional) – _description_. Defaults to True.
- Returns:
_description_
- Return type:
_type_
- structify_net.scoring.boundaries(graph, normalized=True)
Boundaries of the graph
The boundaries are computed as the average of the density of the external edges of each community, and the density of the internal edges of each community. If normalized=True, it is normalized to be between 0 and 1.
- Parameters:
graph (_type_) – A graph
normalized (bool, optional) – If True, the boundaries are normalized to be between 0 and 1. Defaults to True.
- Returns:
_description_
- Return type:
_type_
- structify_net.scoring.compare_graphs(df_reference, df_graphs, best_by_name=False, score_difference=False)
Compares a list of graphs to a reference graph
Returns a dataframe with the scores of the graphs and the difference to the reference graph
- Parameters:
df_reference (_type_) – The scores of the reference graph as a dataframe
df_graphs (_type_) – The scores of the graphs to compare as a dataframe
best_by_name (bool, optional) – Returns the best graph for each name. Defaults to False.
score_difference (bool, optional) – Returns the difference to the reference graph instead of the scores. Defaults to False.
- Returns:
A dataframe with the scores of the graphs and the difference to the reference graph
- Return type:
_type_
- structify_net.scoring.compute_all_scores(graph)
Compute all scores for a graph
- Parameters:
graph (_type_) – A graph
- Returns:
A dictionary with the scores
- Return type:
pd.DataFrame
- structify_net.scoring.coreness(graph, normalized=True)
Coreness of the graph
- Parameters:
graph (_type_) – A graph
normalized (bool, optional) – IF True, the coreness is normalized to be between 0 and 1. Defaults to True.
- Returns:
_description_
- Return type:
_type_
- structify_net.scoring.degree_assortativity(graph, normalized=False, positive_only=True, disassortativity=False)
Degree assortativity coefficient of the graph
The degree assortativity coefficient is computed as the Pearson correlation coefficient between the degrees of the nodes. If normalized=True, it is normalized to be between 0 and 1. If positive_only=True, it is set to 0 if negative. If disassortativity=True, it is set to -1 if positive and 1 if negative.
- Parameters:
graph (_type_) – _description_
normalized (bool, optional) – _description_. Defaults to False.
positive_only (bool, optional) – _description_. Defaults to True.
disassortativity (bool, optional) – _description_. Defaults to False.
- Returns:
_description_
- Return type:
_type_
- structify_net.scoring.degree_heterogeneity(graph)
Compute the degree heterogeneity of the graph
- Parameters:
graph (nx.Graph) – A graph
- Returns:
The degree heterogeneity of the graph
- Return type:
float
- structify_net.scoring.get_default_scores(with_size=False, latex_names=True)
Returns the default scores
Returns a dictionary of scores such as {name:score}
- Parameters:
with_size (bool, optional) – If True, the size of the graph is added to the scores. Defaults to False.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to False.
- Returns:
A dictionary of scores such as {name:score}
- Return type:
_type_
- structify_net.scoring.giant_component_ratio(graph)
Ratio of the largest component to the total number of nodes
- Parameters:
graph (nx.Graph) – A graph
- Returns:
_description_
- Return type:
_type_
- structify_net.scoring.has_giant_component(graph, threshold=0.9)
Check if the graph has a giant component
Returns True if the ratio of the largest component to the total number of nodes is above the threshold
- Parameters:
graph (nx.Graph) – A graph
threshold (float, optional) – The threshold. Defaults to 0.9.
- Returns:
True if the graph has a giant component
- Return type:
_type_
- structify_net.scoring.hierarchy(graph, normalized=False, positive_only=True)
Hierarchy of the graph
The hierarchy is computed as the Spearman correlation coefficient between the degrees and the clustering coefficients of the nodes. If normalized=True, it is normalized to be between 0 and 1. If positive_only=True, it is set to 0 if negative.
- Parameters:
graph (_type_) – A graph
normalized (bool, optional) – If True, the hierarchy is normalized to be between 0 and 1. Defaults to False.
positive_only (bool, optional) – IF True, the hierarchy is set to 0 if negative. Defaults to True.
- Returns:
hierarchy of the graph
- Return type:
_type_
- structify_net.scoring.is_degree_heterogeneous(graph)
Returns True if the graph is degree heterogeneous, False otherwise
A graph is considered degree heterogeneous if its degree heterogeneity is greater than the average between the degree heterogeneity of a random graph with the same number of nodes and edges, and a power law graph with the same number of nodes and edges.
- structify_net.scoring.modularity(graph, normalized=True)
Returns the modularity of the graph
If normalized=True, it is computed as the difference between the modularity of the partition of highest modularity found by Louvain algorithm on this graph and the modularity of a random graph with the same number of nodes and edges, divided by the difference between the modularity of the random graph and the maximum possible modularity (1). If normalized is false, it simply returns the modularity of the graph.
- Parameters:
graph (_type_) – a graph
normalized (bool, optional) – if True, returns a normalized modularity. Defaults to True.
- Returns:
The modularity of the graph
- Return type:
_type_
- structify_net.scoring.robustness(graph)
Robustness of the graph
Robustness is defined as the percentage of nodes that need to be removed to disconnect the graph. It is computed by removing nodes from the graph, starting from the most connected nodes, until the graph is disconnected. The percentage of nodes removed is then returned.
- Parameters:
graph (_type_) – a graph
- Returns:
robustness of the graph
- Return type:
_type_
- structify_net.scoring.scores_for_generators(generators, scores=None, runs=1, details=False, latex_names=True)
Scores for a list of generators
- Parameters:
generators (_type_) – A dictionary of generators such as {name:generator}
scores (_type_, optional) – a dictionary of scores such as {name:score}. Defaults to None.
runs (int, optional) – Number of runs. Defaults to 1.
details (bool, optional) – If True, the results of each run are returned. Defaults to False.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to True.
- Returns:
_description_
- Return type:
_type_
- structify_net.scoring.scores_for_graphs(graphs, scores=None, latex_names=True)
Compute scores for a list of graphs
- Parameters:
graphs (_type_) – A dictionary of graphs such as {name:graph}
scores (_type_, optional) – A dictionary of scores such as {name:score}. Defaults to None.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to True.
- structify_net.scoring.scores_for_rank_functions(rank_functions, n, m, scores=None, epsilons=0, runs=1, latex_names=True)
Scores for a list of rank functions
- Parameters:
rank_functions (_type_) – A dictionary of rank functions such as {name:rank_function}
n (_type_) – number of nodes
m (_type_) – number of edges
scores (_type_, optional) – A dictionary of scores such as {name:score}. Defaults to None.
epsilons (int, optional) – A list of epsilons. Defaults to 0.
runs (int, optional) – number of runs. Defaults to 1.
latex_names (bool, optional) – If True, the names of the scores are latex formulas. Defaults to True.
- Returns:
dataframe with the scores
- Return type:
_type_
- structify_net.scoring.scores_for_rank_models(rank_models, m, scores=None, epsilons=0, runs=1, details=False, latex_names=True)
Scores for a list of rank models
- Parameters:
rank_models (_type_) – A dictionary of rank models such as {name:rank_model}
m (_type_) – Number of edges
scores (_type_, optional) – A dictionary of scores such as {name:score}. Defaults to None.
epsilons (int, optional) – A list of epsilons. Defaults to 0.
runs (int, optional) – number of runs. Defaults to 1.
details (bool, optional) – If True, the results of each run are returned. Defaults to False.
latex_names (bool, optional) – IF True, the names of the scores are latex formulas. Defaults to True.
- Returns:
dataframe with the scores
- Return type:
_type_
- structify_net.scoring.transitivity(graph)
Transitivity of the graph
- Parameters:
graph (nx.Graph) – A graph
structify_net.structureClasses module
- class structify_net.structureClasses.Graph_generator(rank_model, probas)
Bases:
objectA graph generator
This class instantiate graph generators. It is composed of a Rank Model Class, and a list of probabilites, such as the index of the probability in the list corresponds to the index of the edge in the rank model.
Example of usage:
n=10 my_model = Rank_model(range(n),lambda u,v: u+v) my_generator = my_model.get_generator(epsilon=0 ,m=30) my_generator.plot_proba_function() my_generator.plot_matrix() g = my_generator.generate() my_generator.scores(scores=[scoring.average_clustering,scoring.clustering],run=2)
- rank_model
A rank model
- Type:
_type_
- sortedPairs
the node pairs, sorted from the most likely to the least likely to be connected, same as in the rank model
- Type:
_type_
- probas
the probabilities of observing an edge, one for each edge, sorted in the same probability as the sortedPairs
- Type:
_type_
- ER(p)
Generate an Erdos-Renyi graph
- Parameters:
n (int) – number of nodes
p (float) – probability of an edge
- Returns:
a graph
- Return type:
nx.Graph
- generate()
Generate a graph based on this generator
- Returns:
a graph
- Return type:
nx.Graph
- plot_matrix(nodeOrder=None, ax=None, **kwargs)
Plot a matrix of the graph generator
The color of the cell is the probability to observe an edge between the two nodes.
- Parameters:
nodeOrder (_type_, optional) – the order in which the nodes should be plotted. Defaults to None.
ax (_type_, optional) – an axis to plot on. Defaults to None.
- plot_proba_function(cumulative=False, ax=None)
Plot the probability function This function plots the probability function of the graph generator. It is a plot of the probability of an edge to exist as a function of the rank of the edge in the rank model.
- Parameters:
ax (_type_, optional) – a matplotlib axis. Defaults to None.
- Returns:
a matplotlib plot
- Return type:
_type_
- scores(scores=None, runs=1, details=False, latex_names=True)
return a score dataframe for this generator
- Parameters:
scores (, optional) – A list of scores to compute. Defaults to None.
runs (int, optional) – nombre de runs pour les scores. Defaults to 1.
details (bool, optional) – if True, return a dataframe with the details for each run, else return a dataframe with the mean. Defaults to False.
latex_names (bool, optional) – if True, use latex names for the scores. Defaults to True.
- Returns:
a dataframe with the scores
- Return type:
pd.DataFrame
- class structify_net.structureClasses.Rank_model(nodes, sort_function, sort_descendent=False, node_order_function=None)
Bases:
objectA class to represent a rank model
A rank model is composed of a list of node pairs, ordered such as the first ones are the most likely to be connected. It can also contain node properties, and a node order, which is the order in which the nodes should be plotted in a plot matrix. It can return a graph generator(provided a proper probability function), which can be used to generate graphs based on this rank model.
Node properties are useful to match the edge order with the structure, it can be used to store the spatial position of nodes or the community membership of nodes, for example.
Example of usage:
n=10 my_model = Rank_model(range(n),lambda u,v: u+v) my_model.plot_matrix() my_generator = my_model.get_generator(epsilon=0 ,m=30) my_model.generate_graph(epsilon=0,m=30)
- sortedPairs
A list of node pairs, ordered such as the first ones are the most likely to be connected
- Type:
_type_
- node_properties
node properties provided as networkx graph or dictionary. Defaults to None.
- Type:
_type_, optional
- node_order
The order in which the nodes should be plotted in a plot matrix. Defaults to None.
- Type:
_type_, optional
- generate_graph(epsilon, density=None, m=None)
Generate a graph from this rank model
This function generates a graph from this rank model and a probability function. The probability function is defined by the epsilon parameter and the density parameter. One of the two parameters density or m should be provided. If both are provided, m will be used.
- Parameters:
epsilon (_type_) – the epsilon parameter of the probability function
density (_type_, optional) – the density parameter of the probability function. Defaults to None.
m (_type_, optional) – the number of edges in the graph. Defaults to None.
- Returns:
_description_
- Return type:
_type_
- get_generator(epsilon, density=None, m=None)
Return a graph generator for this rank model
One of the two parameters density or m should be provided. If both are provided, m will be used.
- Parameters:
epsilon (_type_) – the epsilon parameter of the probability function
density (_type_, optional) – the density parameter of the probability function. Defaults to None.
m (_type_, optional) – the number of edges in the graph. Defaults to None.
- Returns:
_description_
- Return type:
_type_
- plot_matrix(nodeOrder=None, ax=None, **kwargs)
Plot a matrix of the rank model
Node pairs are ordered by rank. The color of the cell is the rank of the edge.
- Parameters:
nodeOrder (_type_, optional) – the order in which the nodes should be plotted. Defaults to None.
ax (_type_, optional) – an axis to plot on. Defaults to None.
- scores(m, scores=None, epsilons=0, runs=1, details=False, latex_names=True)
this function computes scores for a rank model
It computes scores for a rank model, for a given number of edges m. The scores are computed for a range of epsilon values, and for a given number of runs.
- Parameters:
m (_type_) – number of edges in the graph
scores (_type_, optional) – the scores to compute. A dictionary with the score name as key and the score function as value. Defaults to None.
epsilons (int, optional) – either a single value or a list of values for epsilon. Defaults to 0.
runs (int, optional) – number of runs for each epsilon. Defaults to 1.
details (bool, optional) – If True, return a dataframe with the details for each run, else return a dataframe with the mean. Defaults to False.
latex_names (bool, optional) – Whether to use latex names for the scores. Defaults to True.
- Returns:
_description_
- Return type:
_type_
structify_net.transform module
structify_net.viz module
- structify_net.viz.plot_adjacency_matrix(g, nodeOrder=None)
Plot a matrix of the graph
- Parameters
g: a networkx graph nodeOrder: a list of nodes, ordered by the order in which they should appear in the matrix
- structify_net.viz.spider_plot(df, categories=None, reference=None)
Plot a spider plot for each row of df
This plot is used to compare the performance of different models on different scoring functions.
- Parameters:
df (_type_) – The dataframe to plot. Each row is a spider plot. The first column is the name of the spider plot. The other columns are the scoring functions to plot.
categories (_type_, optional) – A list of the scoring functions to plot. Defaults to None, in which case all the columns are plotted.
reference (_type_, optional) – Which line to use as a reference. Defaults to None, in which case no reference is plotted
structify_net.zoo module
- structify_net.zoo.get_all_rank_models(n, m)
Returns a dictionary of all rank models
- Parameters:
n (_type_) – number of nodes
m (_type_) – number of edges. Only used for models with a parameter m
- Returns:
rank_model}
- Return type:
Dictionary of rank models {name
- structify_net.zoo.sort_ER(nodes)
Erdos-Renyi rank model
Returns a rank model based on the Erdos-Renyi model. The Erdos-Renyi model is a random graph model where each pair of nodes is connected with probability p. For a rank model, all pairs of nodes have the same probability.
- Parameters:
nodes (_type_) – describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
- Returns:
structify_net.Rank_model:: The corresponding rank model
- structify_net.zoo.sort_blocks_assortative(nodes, blocks=None)
Rank model based on assortative blocks
This rank model is based on assortative blocks. The blocks are defined by the blocks argument. It can be either a list of lists, where each list is a block, or an integer, in which case the nodes are randomly assigned to blocks.
Edge pairs inside blocks are ordered before edge pairs outside blocks.
- Parameters:
nodes (_type_) – Describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
blocks (_type_, optional) – Blocks definition. Can be either a list of lists, where each list is a block, or an integer, in which case the nodes are randomly assigned to the corresponding number of equal size blocks. Defaults to None.
- Returns:
The corresponding rank model
- Return type:
- structify_net.zoo.sort_core_distance(nodes, dimensions=1, distance='euclidean')
Rank model based on the sum of the distance of their nodes to the center of the space.
Core periphery structure is another well-known type of organization for complex systems. This organization is often modeled using blocks, one block being the dense core, another block, internally sparse, represent the periphery, and the density between the two blocks is set at an intermediate value. To illustrate the flexibility of the Rank approach, we propose a soft-core alternative, the coreness dissolving progressively into a periphery. To do so, we consider nodes embedded into a space, as for the spatial structure –random 1d positions by default. The node-pair rank score is computed as the inverse of the product of 3 distances: the distances from both nodes to the center, and the distance between the two nodes. As a consequence, when two nodes belong to the center, they are very likely to be connected; two nodes far from the center are unlikely to be connected, unless if they are extremely close from each other.
\[[ R'(u,v)=d(W_u,W_v)d(W_u,\mathbf{0})d(W_v,\mathbf{0}) ]\]With \(\mathbf{0}\) the vector corresponding to the center of the location considered as the core of the space.
- Parameters:
nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
dimensions (int, optional) – Number of dimensions. Defaults to 1.
distance (_type_, optional) – Distance function. Defaults to euclidean.
- Returns:
structify_net.Rank_model:: The rank model
- structify_net.zoo.sort_distances(nodes, dimensions=1, distance='euclidean')
Rank model based on the distance between nodes
Also called spatial or geometric network, this rank model is based on the distance between nodes. The distance is defined by the distance function, and the dimensions are defined by the attributes of the nodes. By default, a single dimension is used, and the distance is the euclidean distance.
- Parameters:
nodes (_type_) – describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
dimensions (int, optional) – number of dimensions in which to embed the graph. Defaults to 1.
distance (_type_, optional) – distance function. Defaults to euclidean.
- Returns:
structify_net.Rank_model:: The corresponding rank model
- structify_net.zoo.sort_fractal_hierarchical(nodes, d=3)
Rank model based on a fractal structure
This structure is designed to maximize the hierarchical structure of the network. The network is embedded in a binary tree. The order of the pairs is based on two factors: the distance between nodes in the tree at a same hierarchical level and the distance between the hierarchical levels.
- Parameters:
nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – degree of the binary tree. Defaults to 3. 3 allows to have many triangles.
- Returns:
structify_net.Rank_model:: The rank model
- structify_net.zoo.sort_fractal_leaves(nodes, d=2)
A rank model based on a fractal structure
The order of the pairs is based on the distance between the leaves of a binary tree. The number of leaves is the number of nodes in the network. The distance between two leaves is the number of edges between them in the binary tree.
- Parameters:
nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – Degree of the binary tree. Defaults to 2.
- Returns:
structify_net.Rank_model:: A rank model
- structify_net.zoo.sort_fractal_root(nodes, d=2)
A rank model based on a fractal structure
The network is embedded in a binary tree. The order of the pairs is based on the distance between the nodes in the tree. Nodes are assigned starting from the root of the tree.
- Parameters:
nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – degree of the binary tree. Defaults to 2.
- Returns:
structify_net.Rank_model:: The rank model
- structify_net.zoo.sort_fractal_star(nodes, d=2)
A rank model based on a fractal structure
This structure is designed to maximize the star structure of the network. The network is embedded in a binary tree. The order of the pairs is based on the distance between the hierarchical levels.
- Parameters:
nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
d (int, optional) – degree of the binary tree. Defaults to 2.
- Returns:
structify_net.Rank_model:: rank model
- structify_net.zoo.sort_largest_disconnected_cliques(nodes, m)
A rank model based on the largest disconnected cliques
Computes the largest possible number of cliques of size k such that the number of edges is less than m. Then, the rank model is the same as for assortative blocks.
- Parameters:
nodes (_type_) – Nodes of the graph, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
m (_type_) – number of edges. This is required to compute the largest possible number of cliques.
- Returns:
structify_net.Rank_model:: _description_
- structify_net.zoo.sort_nestedness(nodes)
Rank model based on nestedness
A nestedness network is a particular type of network where the nodes are organized in a hierarchy. Top nodes are connected to bottom nodes. Bottom nodes are connected to other bottom nodes. The order of the pairs is based on the distance between the nodes in the hierarchy.
- Parameters:
nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
- Returns:
structify_net.Rank_model:: The rank model
- structify_net.zoo.sort_overlap_communities(nodes, blocks=None)
Rank model based on overlapping communities
This rank model is based on overlapping communities. The communities are defined by the blocks argument. It can be either a list of lists, where each list is a community, or an integer corresponding to the number of communities. In the latter case, each node belongs to two communities, and the affiliations are chosen such as each community has half of its nodes shared with another community c1 and the other half shared with another community c2.
- Parameters:
nodes (_type_) – Describe nodes of the graphs, either as a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
blocks (_type_, optional) – Describe communities. Can be either a list of lists, where each list is a community, or an integer. Defaults to None.
- Returns:
structify_net.Rank_model:: A rank model
- structify_net.zoo.sort_perlin_noise(nodes=10, octaves=None)
Rank model based on Perlin noise
Perline noise is a type of gradiant noise frequently used in computer graphics to create images with a realistic feel, such as textures and landscapes. we use it to generate an adjacency matrix, from the upper-triangle of a 2d image having as many pixels as there are nodes in the graph. The $R’$ rank score is the black intensity of the pixel. In practice, Perlin noise tends to create continuous shapes of lower and higher values. As with SBM, this generator tends to create stronger relations between some groups of nodes with some other groups of nodes, although the groups are fuzzy, and not necessarily assortative. Perlin noise has a parameter, called octaves, allowing to add smaller scale structures on top of each other.
- Parameters:
nodes (int, optional) – _description_. Defaults to 10.
octaves (_type_, optional) – Octave parameter of the Perlin noise. The higher the value, the finer the structure. Defaults to None, meaning octaves = int(ln(n))
- Returns:
structify_net.Rank_model:: The corresponding rank model
- structify_net.zoo.sort_spatial_WS(nodes, k=10)
Rank model based on a spatial Watts-Strogatz model
This rank model reproduce the original Watts-Strogatz model. Each node is connected to its k nearest neighbors in a ring topology.
- Parameters:
nodes (_type_) – A networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
k (int, optional) – Number of nearest neighbors. Defaults to 10.
- Returns:
structify_net.Rank_model:: The rank model
- structify_net.zoo.sort_stars(nodes)
A rank model based on a star structure
This rank model sorts pairs of nodes such as all pairs of nodes of node n1 are first, then all pairs of nodes of node n2, etc.
- Parameters:
nodes (_type_) – Describe nodes. Can be either a networkx graph (node names and node attributes are preserved) or an integer (number of nodes)
- Returns:
structify_net.Rank_model:: The rank model