probinet.utils.tools

This module contains utility functions for data manipulation and file I/O.

Functions

`build_edgelist`(A, layer)	Build the edgelist for a given layer of an adjacency tensor in DataFrame format.
`can_cast_to_int`(string)	Check whether an object can be converted to an integer.
`check_symmetric`(a[, rtol, atol])	Check if a matrix or a list of matrices is symmetric.
`create_design_matrix`(metadata[, nodeID, ...])	Create the design matrix DataFrame from metadata.
`flt`(x[, d])	Round a number to a specified number of decimal places.
`get_item_array_from_subs`(A, ref_subs)	Retrieves the values of specific entries in a dense tensor.
`get_or_create_rng`([rng])	Set the random seed and initialize the random number generator.
`is_sparse`(X)	Check whether the input tensor is sparse.
`log_and_raise_error`(error_type, message)	Logs an error message and raises an exception of the specified type.
`output_adjacency`(A, out_folder, label)	Save the adjacency tensor to a file.
`save_design_matrix`(X, perc[, folder, fname])	Save the design matrix to file.
`sptensor_from_dense_array`(X)	Create a sparse tensor from a dense array using sparse.COO.
`write_adjacency`(G[, folder, fname, ego, alter])	Save the adjacency tensor to file.
`write_design_matrix`(metadata, perc[, ...])	Save the design matrix to file.

probinet.utils.tools.build_edgelist(A: COO, layer: int) → DataFrame

Build the edgelist for a given layer of an adjacency tensor in DataFrame format.

Parameters:

A (COO) – Adjacency tensor stored as a sparse array in COOrdinate format.
layer (int) – Index of the layer for which the edgelist is to be built.

Returns:

DataFrame containing the edgelist for the specified layer with columns ‘source’, ‘target’, and ‘L<layer>’.

Return type:

pd.DataFrame

probinet.utils.tools.can_cast_to_int(string: int | float | str) → bool

Check whether an object can be converted to an integer.

Parameters:

string (int or float or str) – Name of the node.

Returns:

True if the input can be converted to an integer, False otherwise.

Return type:

bool
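The check described above can be pictured as a guarded call to `int()`. The following is a minimal sketch under that assumption, not the library's implementation; the name `can_cast_to_int_sketch` is hypothetical.

```python
def can_cast_to_int_sketch(value):
    """Hypothetical helper: return True if int(value) succeeds."""
    try:
        int(value)
    except (TypeError, ValueError, OverflowError):
        # int() rejected the input (e.g. a non-numeric string or infinity).
        return False
    return True


numeric_label = can_cast_to_int_sketch("12")
text_label = can_cast_to_int_sketch("node_a")
```

Here `numeric_label` is `True` and `text_label` is `False`, which is the kind of distinction that matters when node names may be either integers or arbitrary strings.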

probinet.utils.tools.check_symmetric(a: ndarray | List[ndarray], rtol: float = 1e-05, atol: float = 1e-08) → bool

Check if a matrix or a list of matrices is symmetric.

Parameters:

a (ndarray or list) – Input data.
rtol (float) – Relative tolerance.
atol (float) – Absolute tolerance.

Returns:

True if the matrix is symmetric, False otherwise.

Return type:

bool
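A symmetry test with relative and absolute tolerances can be sketched with `numpy.allclose`, comparing each matrix with its transpose. This is an illustration that assumes square 2-D inputs, not the library's own code; `check_symmetric_sketch` is a hypothetical name.

```python
import numpy as np


def check_symmetric_sketch(a, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: symmetry test for one matrix or a list of them."""
    if isinstance(a, list):
        # A list counts as symmetric only if every matrix in it is symmetric.
        return all(check_symmetric_sketch(m, rtol=rtol, atol=atol) for m in a)
    # Compare the matrix with its transpose within the given tolerances.
    return bool(np.allclose(a, a.T, rtol=rtol, atol=atol))


sym = np.array([[1.0, 2.0], [2.0, 3.0]])
asym = np.array([[1.0, 2.0], [0.0, 3.0]])

single_result = check_symmetric_sketch(sym)
list_result = check_symmetric_sketch([sym, asym])
```

In this sketch `single_result` is `True`, while `list_result` is `False` because one matrix in the list is not symmetric.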

probinet.utils.tools.create_design_matrix(metadata: Dict[str, str], nodeID: str = 'Name', attr_name: str = 'Metadata') → DataFrame

Create the design matrix DataFrame from metadata.

Parameters:

metadata (dict) – Dictionary where the keys are the node labels and the values are the metadata associated to them.
nodeID (str) – Name of the column with the node labels.
attr_name (str) – Name of the column to consider as attribute.

Returns:

X – Design matrix

Return type:

DataFrame

probinet.utils.tools.flt(x: float, d: int = 3) → float

Round a number to a specified number of decimal places.

Parameters:

x (float) – Number to be rounded.
d (int) – Number of decimal places to round to.

Returns:

The input number rounded to the specified number of decimal places.

Return type:

float

probinet.utils.tools.get_item_array_from_subs(A: ndarray, ref_subs: Sequence[ndarray]) → ndarray

Retrieves the values of specific entries in a dense tensor. The output is a 1-D array whose length equals the number of non-zero entries.

Parameters:

A (np.ndarray) – The input tensor from which values are to be retrieved.
ref_subs (Sequence[np.ndarray]) – A sequence of index arrays. Each array in the sequence holds the indices along one dimension of the tensor.

Returns:

A 1-dimensional array containing the values of the tensor at the specified indices.

Return type:

np.ndarray
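Retrieving entries from one index array per dimension can be sketched with NumPy advanced indexing. The snippet below is an assumption about how such a lookup might be written, not the library's implementation; `get_item_array_from_subs_sketch` is a hypothetical name.

```python
import numpy as np


def get_item_array_from_subs_sketch(A, ref_subs):
    """Hypothetical helper: values of A at the positions given by ref_subs."""
    # One index array per dimension; a tuple triggers element-wise indexing.
    return A[tuple(ref_subs)]


A = np.zeros((2, 3, 3))
A[0, 0, 1] = 4.0
A[1, 2, 0] = 7.0

# A.nonzero() returns one index array per dimension of the non-zero entries.
subs = A.nonzero()
values = get_item_array_from_subs_sketch(A, subs)
```

With the two non-zero entries set above, `values` is a 1-D array holding `4.0` and `7.0`.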

probinet.utils.tools.get_or_create_rng(rng: Generator | None = None) → Generator

Set the random seed and initialize the random number generator.

Parameters:

rng (Optional[np.random.Generator]) – Random number generator. If None, a new generator is created using the seed.

Returns:

Initialized random number generator.

Return type:

np.random.Generator
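The reuse-or-create pattern described here can be sketched with `numpy.random.default_rng`. This is only an illustration: the `seed` argument and the name `get_or_create_rng_sketch` are hypothetical and do not appear in the documented signature.

```python
import numpy as np


def get_or_create_rng_sketch(rng=None, seed=None):
    """Hypothetical helper: return rng if given, else build a new Generator."""
    if rng is None:
        # `seed` is an assumption of this sketch, not a documented parameter.
        return np.random.default_rng(seed)
    return rng


existing = np.random.default_rng(0)
reused = get_or_create_rng_sketch(existing)  # the same object is passed through
fresh = get_or_create_rng_sketch(seed=10)    # a new, seeded Generator
```

Passing an existing generator through unchanged lets a caller share one random stream across several functions, which keeps results reproducible.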

probinet.utils.tools.is_sparse(X: ndarray) → bool

Check whether the input tensor is sparse, using a heuristic definition of sparsity. Given M = number of modes, S = number of entries, and I = number of non-zero entries, a tensor is considered sparse if S > M(I + 1).

Parameters:

X (ndarray) – Input data.

Returns:

True if the input tensor is sparse, False otherwise.

Return type:

bool
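The heuristic compares the total number of entries with M(I + 1), where M is the number of modes and I the number of non-zero entries. A minimal sketch of that comparison follows; it assumes a dense NumPy input, and `is_sparse_sketch` is a hypothetical name rather than the library's function.

```python
import numpy as np


def is_sparse_sketch(X):
    """Hypothetical helper: heuristic sparsity test for a dense array."""
    M = X.ndim                # number of modes (dimensions)
    S = X.size                # total number of entries
    I = np.count_nonzero(X)   # number of non-zero entries
    return bool(S > M * (I + 1))


mostly_zero = np.zeros((2, 4, 4))
mostly_zero[0, 1, 2] = 1.0    # M = 3, S = 32, I = 1, and 32 > 6
all_ones = np.ones((2, 2, 2)) # M = 3, S = 8, I = 8, and 8 > 27 fails

sparse_flag = is_sparse_sketch(mostly_zero)
dense_flag = is_sparse_sketch(all_ones)
```

Here `sparse_flag` is `True` and `dense_flag` is `False`.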

probinet.utils.tools.log_and_raise_error(error_type: Type[BaseException], message: str) → None

Logs an error message and raises an exception of the specified type.

Parameters:

error_type (Type[BaseException]) – The type of the exception to be raised.
message (str) – The error message to be logged and included in the exception.

Raises:

BaseException – An exception of the specified type with the given message.
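The log-then-raise behaviour can be sketched with the standard `logging` module. This is an assumption about the general pattern, not the library's code; `log_and_raise_error_sketch` is a hypothetical name, and logging through the root logger is a choice made for this sketch.

```python
import logging


def log_and_raise_error_sketch(error_type, message):
    """Hypothetical helper: log the message, then raise error_type(message)."""
    logging.error(message)
    raise error_type(message)


try:
    log_and_raise_error_sketch(ValueError, "The input tensor is empty.")
except ValueError as exc:
    # The raised exception carries the same message that was logged.
    caught = str(exc)
```

Routing every failure through one helper keeps the log and the exception message identical, so what appears in the log matches what the caller sees.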

probinet.utils.tools.output_adjacency(A: List, out_folder: PathLike, label: str)

Save the adjacency tensor to a file. The default format is a space-separated .csv file with L+2 columns: source_node target_node edge_l0 … edge_lL.

Parameters:

A (list) – Adjacency tensor.
out_folder (PathLike) – Output folder.
label (str) – Label of the evaluation file.

probinet.utils.tools.save_design_matrix(X: DataFrame, perc: float, folder: str = './', fname: str = 'X_')

Save the design matrix to file.

Parameters:

X (DataFrame) – Design matrix.
perc (float) – Fraction of match between communities and metadata.
folder (str) – Path of the folder where to save the files.
fname (str) – Name of the design matrix file.

probinet.utils.tools.sptensor_from_dense_array(X: ndarray) → COO

Create a sparse tensor from a dense array using sparse.COO.

Parameters:

X (ndarray) – Input data.

Returns:

Sparse tensor created from the dense array.

Return type:

COO

probinet.utils.tools.write_adjacency(G: List[MultiDiGraph], folder: str = './', fname: str = 'multilayer_network.csv', ego: str = 'source', alter: str = 'target')

Save the adjacency tensor to file.

Parameters:

G (list) – List of MultiDiGraph NetworkX objects.
folder (str) – Path of the folder where to save the files.
fname (str) – Name of the adjacency tensor file.
ego (str) – Name of the column to consider as source of the edge.
alter (str) – Name of the column to consider as target of the edge.

probinet.utils.tools.write_design_matrix(metadata: Dict[str, str], perc: float, folder: str = './', fname: str = 'X_', nodeID: str = 'Name', attr_name: str = 'Metadata') → DataFrame

Save the design matrix to file.

Parameters:

metadata (dict) – Dictionary where the keys are the node labels and the values are the metadata associated to them.
perc (float) – Fraction of match between communities and metadata.
folder (str) – Path of the folder where to save the files.
fname (str) – Name of the design matrix file.
nodeID (str) – Name of the column with the node labels.
attr_name (str) – Name of the column to consider as attribute.

Returns:

X – Design matrix

Return type:

DataFrame
