probinet.models.mtcov
Class definition of MTCOV, the generative algorithm that incorporates both the topology of interactions and node attributes to extract overlapping communities in directed and undirected multilayer networks [CPDB20].
Classes
- class probinet.models.mtcov.MTCOV(err_max: float = 1e-07, num_realizations: int = 1, **kwargs: Any)
Class definition of MTCOV, the generative algorithm that incorporates both the topology of interactions and node attributes to extract overlapping communities in directed and undirected multilayer networks.
- additional_fields = ['egoX', 'cov_name', 'attr_name']
- compute_likelihood()
Compute the pseudo log-likelihood of the data.
- Returns:
loglik – Pseudo log-likelihood value.
- Return type:
float
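As a rough illustration of what a pseudo log-likelihood of this kind can look like, the sketch below assumes the objective described in the MTCOV paper [CPDB20]: a combination, weighted by gamma, of a Poisson term for the network and a categorical term for the node attributes. The function names and the flat-list inputs are hypothetical and not part of probinet; the library's own computation may differ in details such as batching and constant terms.

```python
import math

def structure_loglik(observed, expected):
    """Poisson log-likelihood, up to an additive constant: sum(a * log(m) - m)."""
    total = 0.0
    for a, m in zip(observed, expected):
        if a > 0:  # zero entries contribute only the -m term
            total += a * math.log(m)
        total -= m
    return total

def attribute_loglik(one_hot, probs):
    """Categorical log-likelihood: sum(x * log(p)) over the observed categories."""
    return sum(x * math.log(p) for x, p in zip(one_hot, probs) if x > 0)

def pseudo_loglik(observed, expected, one_hot, probs, gamma=0.5):
    """Weighted sum of the two terms; gamma = 0 uses only the network,
    gamma = 1 uses only the attributes."""
    return ((1.0 - gamma) * structure_loglik(observed, expected)
            + gamma * attribute_loglik(one_hot, probs))

# Toy inputs (made up for illustration): three edge entries and one node attribute.
A = [1, 0, 2]          # observed edge weights
M = [1.0, 0.5, 2.0]    # expected edge weights under the model
X = [1, 0]             # one-hot attribute of a single node
P = [0.8, 0.2]         # predicted attribute probabilities
value = pseudo_loglik(A, M, X, P, gamma=0.5)
```

In this reading, gamma is the same hyperparameter that fit accepts, so moving it towards 0 or 1 shifts the objective towards structure or attributes respectively.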
- fit(gdata: GraphData, batch_size: int | None = None, gamma: float = 0.5, K: int = 2, initialization: int = 0, undirected: bool = False, assortative: bool = False, out_inference: bool = True, out_folder: Path = PosixPath('outputs'), end_file: str | None = None, files: PathLike | None = None, rng: Generator | None = None, **_MTCOV__kwargs: Any) → tuple[ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], ndarray[Any, dtype[float64]], float]
Perform community detection in multilayer networks considering both the topology of interactions and node attributes via EM updates. Save the membership matrices U and V, the affinity tensor W, and the beta matrix.
- Parameters:
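gdata – Input graph data (a GraphData object), including the graph adjacency tensor.
batch_size – Size of the subset of nodes used to compute the likelihood, by default None.
gamma – Scaling hyperparameter that balances the contributions of network structure and node attributes, by default 0.5.
K – Number of communities, by default 2.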
initialization – Indicator for choosing how to initialize u, v, and w. If 0, they will be generated randomly; 1 means only the affinity matrix w will be uploaded from file; 2 implies the membership matrices u and v will be uploaded from file, and 3 all u, v, and w will be initialized through an input file, by default 0.
eta0 – Initial value for the reciprocity coefficient, by default None.
undirected – Flag to call the undirected network, by default False.
assortative – Flag to call the assortative network, by default False.
fix_eta – Flag to fix the eta parameter, by default False.
fix_communities – Flag to fix the community memberships, by default False.
fix_w – Flag to fix the affinity tensor, by default False.
use_approximation – Flag to use approximation in updates, by default False.
out_inference – Flag to evaluate inference results, by default True.
out_folder – Output folder for inference results, by default OUTPUT_FOLDER (shown as PosixPath('outputs') in the signature).
end_file – Suffix for the evaluation file, by default None.
files – Path to the file for initialization, by default None.
rng – Random number generator, by default None.
- Returns:
u_f – Out-going membership matrix.
v_f – In-coming membership matrix.
w_f – Affinity tensor.
beta_f – Beta matrix relating communities to node attribute values.
maxL – Maximum log-likelihood.
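The four values of the initialization parameter described above can be summarized as a small lookup. This is only a sketch of the documented behaviour: the helper name loaded_from_file is hypothetical and does not exist in probinet, and the actual file reading is not shown.

```python
def loaded_from_file(initialization: int) -> frozenset:
    """Return the names of the parameters read from the input file.

    Parameters that are not listed are assumed to be generated randomly.
    """
    sources = {
        0: frozenset(),                 # u, v and w all generated randomly
        1: frozenset({"w"}),            # only the affinity tensor w from file
        2: frozenset({"u", "v"}),       # only the memberships u and v from file
        3: frozenset({"u", "v", "w"}),  # everything from file
    }
    if initialization not in sources:
        raise ValueError(f"initialization must be 0, 1, 2 or 3, got {initialization}")
    return sources[initialization]

needs_file = bool(loaded_from_file(2))  # a non-empty set means `files` is required
```

Under this reading, any non-zero initialization value needs the files argument to point at an existing input file.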
- get_params_to_load_data(args: Namespace) → Dict[str, Any]
Get the parameters needed to load the data for this model from the parsed arguments.
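One plausible reading of this method, given the additional_fields attribute listed for the class, is that it copies those fields from the parsed arguments into a dictionary. The sketch below illustrates that idea only; the function name params_from_namespace and the example argument values are hypothetical, and the real method may collect other keys as well.

```python
from argparse import Namespace
from typing import Any, Dict, Sequence

# Field names taken from MTCOV.additional_fields as documented above.
ADDITIONAL_FIELDS = ["egoX", "cov_name", "attr_name"]

def params_from_namespace(
    args: Namespace, fields: Sequence[str] = ADDITIONAL_FIELDS
) -> Dict[str, Any]:
    """Collect the requested fields that are present on the namespace."""
    return {name: getattr(args, name) for name in fields if hasattr(args, name)}

# Example namespace with made-up values; K is ignored because it is not a listed field.
args = Namespace(egoX="Name", cov_name="X.csv", attr_name="Metadata", K=2)
params = params_from_namespace(args)
```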
- preprocess_data_for_fit(data: COO | ndarray, data_X: ndarray, batch_size: int | None = None) → Tuple[COO | ndarray, ndarray, Sequence[ndarray], Sequence[ndarray], ndarray | None, Sequence[ndarray] | None, Sequence[ndarray] | None]
Preprocesses the input data for fitting the models.
This method handles the sparsity of the data, saves the indices of the non-zero entries, and optionally selects a subset of nodes for batch processing.
- Parameters:
data (GraphDataType) – The graph adjacency tensor to be preprocessed.
data_X (np.ndarray) – The one-hot encoding version of the design matrix to be preprocessed.
batch_size (Optional[int], default=None) – The size of the subset of nodes to compute the likelihood with. If None, the method will automatically determine the batch size based on the number of nodes.
- Returns:
preprocessed_data (GraphDataType) – The preprocessed graph adjacency tensor.
preprocessed_data_X (np.ndarray) – The preprocessed one-hot encoding version of the design matrix.
subs_nz (TupleArrays) – The indices of the non-zero entries in the data.
subs_X_nz (TupleArrays) – The indices of the non-zero entries in the design matrix.
subset_N (Optional[np.ndarray]) – The subset of nodes selected for batch processing. None if no subset is selected.
Subs (Optional[TupleArrays]) – The list of tuples representing the non-zero entries in the data. None if no subset is selected.
SubsX (Optional[TupleArrays]) – The list of tuples representing the non-zero entries in the design matrix. None if no subset is selected.
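The two main steps described above, recording the indices of the non-zero entries and picking a subset of nodes for batch processing, can be sketched with plain Python lists. This is an illustrative approximation, not probinet's implementation: the helper names are hypothetical, the tensor is a nested list rather than a COO or ndarray object, and the automatic choice of batch size when batch_size is None is not reproduced.

```python
import random
from typing import List, Optional, Tuple

def nonzero_indices(tensor) -> Tuple[List[int], List[int], List[int]]:
    """Return (layer, row, column) index lists for the non-zero entries
    of a nested-list tensor with shape layers x nodes x nodes."""
    layers, rows, cols = [], [], []
    for l, layer in enumerate(tensor):
        for i, row in enumerate(layer):
            for j, value in enumerate(row):
                if value != 0:
                    layers.append(l)
                    rows.append(i)
                    cols.append(j)
    return layers, rows, cols

def select_node_subset(num_nodes: int, batch_size: int, seed: int = 0) -> Optional[List[int]]:
    """Pick batch_size distinct nodes at random, or return None when the
    batch would cover every node (no subset is needed)."""
    if batch_size >= num_nodes:
        return None
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_nodes), batch_size))

# Toy single-layer network with three nodes (values made up for illustration).
adjacency = [[[0, 1, 0],
              [2, 0, 0],
              [0, 0, 3]]]
subs_nz = nonzero_indices(adjacency)
subset_N = select_node_subset(num_nodes=3, batch_size=2)
```

Keeping only the non-zero indices lets later sums run over observed edges instead of over every possible node pair, which is the usual motivation for this kind of preprocessing on sparse networks.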