hrchy_cytocommunity.models.auto_k.HRCHYClusterAutoK

class hrchy_cytocommunity.models.auto_k.HRCHYClusterAutoK(dataset, cell_meta, coarse_range: tuple, fine_range: tuple, model_params: dict = None, max_runs=10, similarity_function=None)

Identifies the optimal number of clusters for the HRCHY-CytoCommunity model based on clustering stability.

Parameters:
  • dataset (SpatialOmicsImageDataset) – The dataset object containing the data to be clustered.

  • cell_meta (pd.DataFrame) – Metadata for the cells, including spatial information.

  • coarse_range (tuple) – A tuple specifying the range of coarse clustering numbers to evaluate (min, max).

  • fine_range (tuple) – A tuple specifying the range of fine clustering numbers to evaluate (min, max). The fine clustering numbers must be >= the coarse clustering numbers in coarse_range.

  • model_params (dict, optional) – A dictionary of parameters for the HRCHY-CytoCommunity model. If None, default parameters will be used.

  • max_runs (int, optional) – The number of clustering runs to perform for each combination of coarse and fine clustering numbers. Default is 10.

  • similarity_function (callable, optional) – A function to compute the similarity between two clustering results. Default is fowlkes_mallows_score.
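To make the similarity_function parameter concrete, here is a minimal pure-Python sketch of the Fowlkes-Mallows index, the measure named as the default above. It is not the package's code: the name fowlkes_mallows is hypothetical, and the assumption that a custom similarity function takes two label sequences and returns a float is inferred from the parameter description rather than confirmed by it.

```python
from collections import Counter
from math import comb, sqrt


def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows index between two labelings of the same cells.

    Hypothetical helper for illustration; assumes both inputs are
    equal-length sequences of hashable cluster labels.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    # Count pairs of cells that share a cluster in both labelings,
    # in labeling A alone, and in labeling B alone.
    together_both = sum(comb(n, 2) for n in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(n, 2) for n in Counter(labels_a).values())
    together_b = sum(comb(n, 2) for n in Counter(labels_b).values())
    if together_a == 0 or together_b == 0:
        # No co-clustered pairs in one labeling: define the score as 0.
        return 0.0
    return together_both / sqrt(together_a * together_b)
```

The score depends only on which cells are grouped together, so relabeling the clusters (for example swapping label 0 and label 1) leaves it unchanged. That property is what makes it usable for comparing independent clustering runs.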

__init__(dataset, cell_meta, coarse_range: tuple, fine_range: tuple, model_params: dict = None, max_runs=10, similarity_function=None)

Methods

__init__(dataset, cell_meta, coarse_range, ...)

convert_stability_2_mat()

find_best_model(save_dir, num_coarse, num_fine)

Find the optimal model (the one with the minimum training loss) for the specified number of coarse-grained TCs and fine-grained CNs.

fit(save_dir)

Search for the optimal clustering number by running the clustering multiple times and calculating the average stability at each clustering resolution.
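The search described for fit can be sketched as a grid over (coarse, fine) pairs, scoring each pair by the mean pairwise similarity of repeated runs. This is an illustration under stated assumptions, not the package's implementation: stability_grid, mean_pairwise_stability, toy_cluster and exact_match are hypothetical names, both ranges are assumed to be inclusive, and pairs with fewer fine than coarse clusters are assumed to be skipped.

```python
from itertools import combinations
from statistics import mean


def mean_pairwise_stability(runs, similarity):
    """Average similarity over all pairs of repeated clustering runs."""
    return mean(similarity(a, b) for a, b in combinations(runs, 2))


def stability_grid(cluster_once, coarse_range, fine_range, max_runs, similarity):
    """Map each (num_coarse, num_fine) pair to its mean run-to-run stability.

    cluster_once(num_coarse, num_fine, seed) is a hypothetical callable that
    returns one label sequence; max_runs must be at least 2.
    """
    grid = {}
    for num_coarse in range(coarse_range[0], coarse_range[1] + 1):
        for num_fine in range(fine_range[0], fine_range[1] + 1):
            if num_fine < num_coarse:
                continue  # assumed constraint: fine >= coarse
            runs = [cluster_once(num_coarse, num_fine, seed) for seed in range(max_runs)]
            grid[(num_coarse, num_fine)] = mean_pairwise_stability(runs, similarity)
    return grid


def toy_cluster(num_coarse, num_fine, seed):
    # Deterministic stand-in for a real model run: labels for 6 cells.
    return [i % num_fine for i in range(6)]


def exact_match(a, b):
    # Toy similarity: fraction of cells with identical labels.
    return sum(x == y for x, y in zip(a, b)) / len(a)


grid = stability_grid(toy_cluster, (2, 3), (3, 4), max_runs=3, similarity=exact_match)
best = max(grid, key=grid.get)  # pair with the highest mean stability
```

With a real model, cluster_once would train and label the data with a different random seed on each call, so stability would vary across the grid instead of being constant as in this toy setup.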

load(save_dir)

Load the stability results and clustering labels from the specified directory.

save(save_dir)

Save the stability results and clustering labels to the specified directory.
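A typical call sequence might look like the sketch below, which uses only the constructor and method signatures listed on this page. The wrapper name run_auto_k and the range values are illustrative assumptions. How dataset and cell_meta are built, what fit returns, and whether fit already writes results to save_dir are not stated here, so the explicit save call may be redundant.

```python
def run_auto_k(dataset, cell_meta, save_dir):
    """Hypothetical wrapper: search cluster numbers, then persist the results.

    dataset is expected to be a SpatialOmicsImageDataset and cell_meta a
    pandas DataFrame, as described in the parameter list.
    """
    # Imported lazily so this sketch can be defined without the package installed.
    from hrchy_cytocommunity.models.auto_k import HRCHYClusterAutoK

    auto_k = HRCHYClusterAutoK(
        dataset,
        cell_meta,
        coarse_range=(2, 4),  # illustrative (min, max) values
        fine_range=(4, 8),    # illustrative (min, max) values
        max_runs=10,          # documented default
    )
    auto_k.fit(save_dir)   # repeated clustering runs and stability scoring
    auto_k.save(save_dir)  # store stability results and clustering labels
    return auto_k
```

Leaving model_params and similarity_function unset keeps the documented defaults.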