probinet.models.mtcov

Class definition of MTCOV, the generative algorithm that incorporates both the topology of interactions and node attributes to extract overlapping communities in directed and undirected multilayer networks [CPDB20].

Classes

MTCOV([err_max, num_realizations])

class probinet.models.mtcov.MTCOV(err_max: float = 1e-07, num_realizations: int = 1, **kwargs: Any)

additional_fields = ['egoX', 'cov_name', 'attr_name']

compute_likelihood()

Compute the pseudo log-likelihood of the data.

Returns: loglik – Pseudo log-likelihood value.
Return type: float

fit(gdata: GraphData, batch_size: int | None = None, gamma: float = 0.5, K: int = 2, initialization: int = 0, undirected: bool = False, assortative: bool = False, out_inference: bool = True, out_folder: Path = PosixPath('outputs'), end_file: str | None = None, files: PathLike | None = None, rng: Generator | None = None, **_MTCOV__kwargs: Any) → tuple[ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], float]

Perform community detection in multilayer networks considering both the topology of interactions and node attributes via EM updates. Save the membership matrices U and V, the affinity tensor W, and the beta matrix.

Parameters:

gdata – Graph adjacency tensor.
batch_size – Size of the subset of nodes to compute the likelihood with.
gamma – Scaling parameter to control the contribution of the two terms in the likelihood, by default 0.5.
K – Number of communities, by default 2.
initialization – Indicator for choosing how to initialize u, v, and w. If 0, all three are generated randomly; if 1, only the affinity matrix w is loaded from file; if 2, only the membership matrices u and v are loaded from file; if 3, u, v, and w are all loaded from file. By default 0.
undirected – Flag to treat the network as undirected, by default False.
assortative – Flag to use the assortative version of the model, by default False.
out_inference – Flag to evaluate inference results, by default True.
out_folder – Output folder for inference results, by default OUTPUT_FOLDER.
end_file – Suffix for the evaluation file, by default None.
files – Path to the file for initialization, by default None.
rng – Random number generator, by default None.
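
The four values of `initialization` listed above can be summarized as a lookup table. The sketch below is illustrative only: `INIT_SOURCES` and `params_loaded_from_file` are hypothetical names that are not part of probinet, and the mapping simply restates the parameter description.

```python
# Hypothetical helper, not part of probinet: it restates the documented
# meaning of the `initialization` flag as a lookup table.
INIT_SOURCES = {
    0: set(),            # u, v, and w are all generated randomly
    1: {"w"},            # only the affinity matrix w is loaded from file
    2: {"u", "v"},       # only the membership matrices are loaded from file
    3: {"u", "v", "w"},  # u, v, and w are all loaded from file
}


def params_loaded_from_file(initialization: int) -> set[str]:
    """Return the names of the parameters that would be read from `files`."""
    if initialization not in INIT_SOURCES:
        raise ValueError(
            f"initialization must be 0, 1, 2, or 3, got {initialization}"
        )
    # Return a copy so callers cannot mutate the table.
    return set(INIT_SOURCES[initialization])
```

Any value other than 0 presumably requires `files` to point at a valid input file, since at least one parameter is then read from it.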

Returns:

u_f – Out-going membership matrix.
v_f – In-coming membership matrix.
w_f – Affinity tensor.
beta_f – Beta matrix.
maxL – Maximum log-likelihood.
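
As a rough illustration of how `gamma` balances the two terms of the likelihood, the sketch below assumes the convex combination described in the MTCOV paper [CPDB20], with the network-structure term weighted by 1 - gamma and the node-attribute term weighted by gamma. The function name and the numeric inputs are made up for illustration; this is not probinet's internal code.

```python
def combined_loglik(loglik_graph: float, loglik_attr: float, gamma: float = 0.5) -> float:
    """Weight the structure and attribute terms (assumed convex combination)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError(f"gamma must lie in [0, 1], got {gamma}")
    # gamma = 0 keeps only the network term, gamma = 1 only the attribute term.
    return (1.0 - gamma) * loglik_graph + gamma * loglik_attr


# Made-up log-likelihood values, used only to show the effect of gamma.
structure_only = combined_loglik(-10.0, -4.0, gamma=0.0)   # -10.0
attributes_only = combined_loglik(-10.0, -4.0, gamma=1.0)  # -4.0
balanced = combined_loglik(-10.0, -4.0, gamma=0.5)         # -7.0
```

Under this assumption, the default of 0.5 gives both sources of information equal weight.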

get_params_to_load_data(args: Namespace) → Dict[str, Any]

Get the parameters for the models.

load_data(files: str, adj_name: str, **kwargs)

Load data from the input folder.

Preprocesses the input data for fitting the models.

This method handles the sparsity of the data, saves the indices of the non-zero entries, and optionally selects a subset of nodes for batch processing.

Parameters:

data (GraphDataType) – The graph adjacency tensor to be preprocessed.
data_X (np.ndarray) – The one-hot encoding version of the design matrix to be preprocessed.
batch_size (Optional[int], default=None) – The size of the subset of nodes to compute the likelihood with. If None, the method will automatically determine the batch size based on the number of nodes.

Returns:

preprocessed_data (GraphDataType) – The preprocessed graph adjacency tensor.
preprocessed_data_X (np.ndarray) – The preprocessed one-hot encoding version of the design matrix.
subs_nz (TupleArrays) – The indices of the non-zero entries in the data.
subs_X_nz (TupleArrays) – The indices of the non-zero entries in the design matrix.
subset_N (Optional[np.ndarray]) – The subset of nodes selected for batch processing. None if no subset is selected.
Subs (Optional[TupleArrays]) – The list of tuples representing the non-zero entries in the data. None if no subset is selected.
SubsX (Optional[TupleArrays]) – The list of tuples representing the non-zero entries in the design matrix. None if no subset is selected.
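
The preprocessing steps described above can be sketched with NumPy. This is a minimal illustration under assumptions, not the library's implementation: `sketch_preprocess` is a hypothetical name, the adjacency tensor is taken to be a dense array of shape (L, N, N), the design matrix a dense one-hot array of shape (N, Z), and the batch is drawn uniformly at random without replacement. The automatic batch-size rule is not reproduced; here `batch_size=None` simply means no subset is drawn.

```python
import numpy as np


def sketch_preprocess(data, data_X, batch_size=None, rng=None):
    """Hypothetical sketch: collect non-zero indices and optionally pick a node batch."""
    rng = np.random.default_rng() if rng is None else rng

    # Indices of the non-zero entries, one index array per axis.
    subs_nz = data.nonzero()      # (layer, source node, target node)
    subs_X_nz = data_X.nonzero()  # (node, attribute category)

    n_nodes = data.shape[-1]
    if batch_size is None or batch_size >= n_nodes:
        # Simplification: no subset is selected in this sketch.
        return subs_nz, subs_X_nz, None

    # Assumed sampling scheme: uniform, without replacement.
    subset_N = np.sort(rng.choice(n_nodes, size=batch_size, replace=False))
    return subs_nz, subs_X_nz, subset_N


# Toy input: one layer, four nodes, two attribute categories.
A = np.zeros((1, 4, 4))
A[0, 0, 1] = 1.0
A[0, 2, 3] = 2.0
X = np.eye(2)[[0, 1, 1, 0]]  # one-hot rows for categories 0, 1, 1, 0

subs_nz, subs_X_nz, subset_N = sketch_preprocess(
    A, X, batch_size=2, rng=np.random.default_rng(0)
)
```

Storing only the non-zero indices is what lets later computations skip the zero entries of a sparse network.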
