probinet.utils.tools
This module contains utility functions for data manipulation and file I/O.
Functions
build_edgelist | Build the edgelist for a given layer of an adjacency tensor in DataFrame format.
can_cast_to_int | Check whether an object can be converted to an integer.
check_symmetric | Check if a matrix or a list of matrices is symmetric.
create_design_matrix | Create the design matrix DataFrame from metadata.
flt | Round a number to a specified number of decimal places.
get_item_array_from_subs | Retrieve the values of specific entries in a dense tensor.
get_or_create_rng | Set the random seed and initialize the random number generator.
is_sparse | Check whether the input tensor is sparse.
log_and_raise_error | Log an error message and raise an exception of the specified type.
output_adjacency | Save the adjacency tensor to a file.
save_design_matrix | Save the design matrix to file.
sptensor_from_dense_array | Create a sparse tensor from a dense array using sparse.COO.
write_adjacency | Save the adjacency tensor to file.
write_design_matrix | Save the design matrix to file.
- probinet.utils.tools.build_edgelist(A: COO, layer: int) → DataFrame
Build the edgelist for a given layer of an adjacency tensor in DataFrame format.
- Parameters:
A (COO) – Sparse tensor in COOrdinate format representing the adjacency tensor.
layer (int) – Index of the layer for which the edgelist is to be built.
- Returns:
DataFrame containing the edgelist for the specified layer with columns ‘source’, ‘target’, and ‘L<layer>’.
- Return type:
pd.DataFrame
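The per-layer extraction described above can be illustrated with a dense stand-in. This is only a sketch, not the library's implementation: `build_edgelist_sketch` is a hypothetical name, it takes a dense NumPy array instead of a sparse `COO` tensor, it returns a plain dict of columns instead of a DataFrame, and it assumes the weight column is named `L` followed by the layer index.

```python
import numpy as np

def build_edgelist_sketch(A_dense, layer):
    """Dense stand-in: list the non-zero entries of one layer as edges."""
    # Row and column indices of the non-zero entries in the chosen layer.
    source, target = np.nonzero(A_dense[layer])
    # Edge weights at those positions.
    weight = A_dense[layer][source, target]
    # Assumption: the weight column is called "L<layer>", e.g. "L0".
    return {"source": source, "target": target, f"L{layer}": weight}

# Toy adjacency tensor with 2 layers and 3 nodes.
A = np.zeros((2, 3, 3))
A[0, 0, 1] = 1.0
A[0, 2, 0] = 2.0
A[1, 1, 2] = 5.0

edges = build_edgelist_sketch(A, layer=0)
print(edges["source"], edges["target"], edges["L0"])
```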
- probinet.utils.tools.can_cast_to_int(string: int | float | str) → bool
Check whether an object can be converted to an integer.
- Parameters:
string (int or float or str) – Name of the node.
- Returns:
True if the input can be converted to an integer, False otherwise.
- Return type:
bool
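A check of this kind is commonly written as a guarded call to `int()`. The following is a minimal sketch under that assumption; `can_cast_to_int_sketch` is a hypothetical stand-in, and the library's own handling of edge cases may differ.

```python
def can_cast_to_int_sketch(value):
    """Return True if int(value) succeeds, False otherwise."""
    try:
        int(value)
        return True
    except (TypeError, ValueError):
        # Assumption: any failed conversion counts as "cannot cast".
        return False

print(can_cast_to_int_sketch("12"))      # True
print(can_cast_to_int_sketch("node_a"))  # False
print(can_cast_to_int_sketch(3.7))       # True, int() truncates floats
```

Under this reading, node names such as `"12"` can be relabelled as integers, while names such as `"node_a"` cannot.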
- probinet.utils.tools.check_symmetric(a: ndarray | List[ndarray], rtol: float = 1e-05, atol: float = 1e-08) → bool
Check if a matrix or a list of matrices is symmetric.
- Parameters:
a (ndarray or list) – Input data.
rtol (float) – Relative tolerance.
atol (float) – Absolute tolerance.
- Returns:
True if the input is symmetric, False otherwise.
- Return type:
bool
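A tolerance-based symmetry test with these parameters can be sketched with `numpy.allclose`, comparing a matrix with its transpose. The function name `check_symmetric_sketch` is hypothetical, and treating a list as symmetric only when every matrix in it is symmetric is an assumption rather than documented behavior.

```python
import numpy as np

def check_symmetric_sketch(a, rtol=1e-05, atol=1e-08):
    """Tolerance-based symmetry test for one matrix or a list of matrices."""
    if isinstance(a, list):
        # Assumption: a list passes only if every matrix in it passes.
        return all(check_symmetric_sketch(m, rtol, atol) for m in a)
    # Compare the matrix with its transpose, element by element.
    return bool(np.allclose(a, a.T, rtol=rtol, atol=atol))

sym = np.array([[0.0, 1.0], [1.0, 0.0]])
asym = np.array([[0.0, 1.0], [0.0, 0.0]])

print(check_symmetric_sketch(sym))          # True
print(check_symmetric_sketch(asym))         # False
print(check_symmetric_sketch([sym, asym]))  # False
```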
- probinet.utils.tools.create_design_matrix(metadata: Dict[str, str], nodeID: str = 'Name', attr_name: str = 'Metadata') → DataFrame
Create the design matrix DataFrame from metadata.
- Parameters:
metadata (dict) – Dictionary where the keys are the node labels and the values are the metadata associated to them.
nodeID (str) – Name of the column with the node labels.
attr_name (str) – Name of the column to consider as attribute.
- Returns:
X – Design matrix
- Return type:
DataFrame
- probinet.utils.tools.flt(x: float, d: int = 3) → float
Round a number to a specified number of decimal places.
- Parameters:
x (float) – Number to be rounded.
d (int) – Number of decimal places to round to.
- Returns:
The input number rounded to the specified number of decimal places.
- Return type:
float
- probinet.utils.tools.get_item_array_from_subs(A: ndarray, ref_subs: Sequence[ndarray]) → ndarray
Retrieve the values of specific entries in a dense tensor. The output is a 1-d array whose length equals the number of non-zero entries.
- Parameters:
A (np.ndarray) – The input tensor from which values are to be retrieved.
ref_subs (Tuple[np.ndarray]) – A tuple containing arrays of indices. Each array in the tuple corresponds to indices along one dimension of the tensor.
- Returns:
A 1-dimensional array containing the values of the tensor at the specified indices.
- Return type:
np.ndarray
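Retrieving entries from one index array per dimension corresponds to NumPy integer-array indexing. The sketch below shows that mechanism on a toy tensor; it illustrates the idea and is not the library's code.

```python
import numpy as np

# Toy dense tensor of shape (2, 3, 4) whose entries are 0..23.
A = np.arange(24).reshape(2, 3, 4)

# One index array per dimension; position k across the arrays addresses one entry.
ref_subs = (np.array([0, 1, 1]), np.array([2, 0, 1]), np.array([3, 1, 0]))

# Integer-array indexing gathers A[0, 2, 3], A[1, 0, 1] and A[1, 1, 0].
values = A[tuple(ref_subs)]
print(values)  # [11 13 16]
```

The result is 1-dimensional and has one value per index triple, whatever the shape of `A`.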
- probinet.utils.tools.get_or_create_rng(rng: Generator | None = None) → Generator
Set the random seed and initialize the random number generator.
- Parameters:
rng (Optional[np.random.Generator]) – Random number generator. If None, a new generator is created using the seed.
- Returns:
Initialized random number generator.
- Return type:
np.random.Generator
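The "reuse or create" pattern behind this signature can be sketched with `numpy.random.default_rng`. The name `get_or_create_rng_sketch` is hypothetical, and creating an unseeded generator when `rng` is None is an assumption, since the signature exposes no seed argument.

```python
import numpy as np

def get_or_create_rng_sketch(rng=None):
    """Return the given generator, or a fresh one when none is supplied."""
    if rng is None:
        # Assumption: the fallback generator is created without a fixed seed.
        return np.random.default_rng()
    return rng

# Passing a seeded generator keeps results reproducible.
seeded = np.random.default_rng(10)
same = get_or_create_rng_sketch(seeded)

# Omitting the argument yields a new generator.
fresh = get_or_create_rng_sketch()
print(same is seeded, isinstance(fresh, np.random.Generator))  # True True
```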
- probinet.utils.tools.is_sparse(X: ndarray) → bool
Check whether the input tensor is sparse, using a heuristic definition of sparsity. Given
M = number of modes,
S = number of entries,
I = number of non-zero entries,
the tensor is considered sparse if S > M(I + 1).
- Parameters:
X (ndarray) – Input data.
- Returns:
True if the input tensor is sparse, False otherwise.
- Return type:
bool
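The heuristic can be written out directly with NumPy. This sketch reads the inequality as S > M(I + 1), with S the total number of entries, M the number of modes, and I the number of non-zero entries; `is_sparse_sketch` is a hypothetical name, not the library function.

```python
import numpy as np

def is_sparse_sketch(X):
    """Heuristic sparsity test: S > M * (I + 1)."""
    M = X.ndim               # number of modes (dimensions)
    S = X.size               # total number of entries
    I = np.count_nonzero(X)  # number of non-zero entries
    return bool(S > M * (I + 1))

dense = np.ones((3, 3))          # S = 9, M = 2, I = 9: 9 > 20 is False
mostly_zero = np.zeros((10, 10))
mostly_zero[0, 1] = 1.0          # S = 100, M = 2, I = 1: 100 > 4 is True

print(is_sparse_sketch(dense))        # False
print(is_sparse_sketch(mostly_zero))  # True
```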
- probinet.utils.tools.log_and_raise_error(error_type: Type[BaseException], message: str) → None
Log an error message and raise an exception of the specified type.
- Parameters:
error_type (Type[BaseException]) – The type of the exception to be raised.
message (str) – The error message to be logged and included in the exception.
- Raises:
BaseException – An exception of the specified type with the given message.
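A helper with this contract can be sketched using the standard `logging` module. The name `log_and_raise_error_sketch` and the use of the root logger are assumptions made for illustration only.

```python
import logging

def log_and_raise_error_sketch(error_type, message):
    """Log the message at ERROR level, then raise error_type(message)."""
    # Assumption: the root logger is used; the library may use its own logger.
    logging.error(message)
    raise error_type(message)

try:
    log_and_raise_error_sketch(ValueError, "The input tensor has the wrong shape.")
except ValueError as exc:
    caught = str(exc)

print(caught)  # The input tensor has the wrong shape.
```

Because the function always raises, callers never receive a return value, which matches the `None` return annotation.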
- probinet.utils.tools.output_adjacency(A: List, out_folder: PathLike, label: str)
Save the adjacency tensor to a file. Default format is space-separated .csv with L+2 columns: source_node target_node edge_l0 … edge_lL.
- Parameters:
A (list) – Adjacency tensor.
out_folder (PathLike) – Output folder.
label (str) – Label of the evaluation file.
- probinet.utils.tools.save_design_matrix(X: DataFrame, perc: float, folder: str = './', fname: str = 'X_')
Save the design matrix to file.
- Parameters:
X (DataFrame) – Design matrix.
perc (float) – Fraction of match between communities and metadata.
folder (str) – Path of the folder where to save the files.
fname (str) – Name of the design matrix file.
- probinet.utils.tools.sptensor_from_dense_array(X: ndarray) → COO
Create a sparse tensor from a dense array using sparse.COO.
- Parameters:
X (ndarray) – Input data.
- Returns:
Sparse tensor created from the dense array.
- Return type:
COO
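A coordinate-format tensor stores only the positions and values of the non-zero entries. The sketch below shows that decomposition with plain NumPy as a stand-in; it deliberately does not call the `sparse` package, so it illustrates the representation and not the function's actual code.

```python
import numpy as np

# Dense tensor of shape (2, 2, 2) with three non-zero entries.
X = np.array([[[0, 2], [0, 0]],
              [[1, 0], [0, 3]]])

# Coordinates of the non-zero entries: one index array per dimension.
coords = np.nonzero(X)
# Values stored at those coordinates, in row-major order.
data = X[coords]
print(data)  # [2 1 3]

# Coordinates, values and shape are enough to rebuild the dense tensor.
rebuilt = np.zeros_like(X)
rebuilt[coords] = data
print(np.array_equal(rebuilt, X))  # True
```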
- probinet.utils.tools.write_adjacency(G: List[MultiDiGraph], folder: str = './', fname: str = 'multilayer_network.csv', ego: str = 'source', alter: str = 'target')
Save the adjacency tensor to file.
- Parameters:
G (list) – List of MultiDiGraph NetworkX objects.
folder (str) – Path of the folder where to save the files.
fname (str) – Name of the adjacency tensor file.
ego (str) – Name of the column to consider as source of the edge.
alter (str) – Name of the column to consider as target of the edge.
- probinet.utils.tools.write_design_matrix(metadata: Dict[str, str], perc: float, folder: str = './', fname: str = 'X_', nodeID: str = 'Name', attr_name: str = 'Metadata') → DataFrame
Save the design matrix to file.
- Parameters:
metadata (dict) – Dictionary where the keys are the node labels and the values are the metadata associated to them.
perc (float) – Fraction of match between communities and metadata.
folder (str) – Path of the folder where to save the files.
fname (str) – Name of the design matrix file.
nodeID (str) – Name of the column with the node labels.
attr_name (str) – Name of the column to consider as attribute.
- Returns:
X – Design matrix
- Return type:
DataFrame