mmcontext.eval.evaluate_scib.scibEvaluator

class mmcontext.eval.evaluate_scib.scibEvaluator(adata, batch_key, label_key, embedding_key=None, reconstructed_keys=None, data_id='', n_top_genes=None, max_cells=None, logger=None, in_parallel=True)

Bases: object

Evaluates embeddings and reconstructed features using specified metrics.

Parameters:
  • adata (AnnData) – AnnData object containing raw data, embeddings, and reconstructed features.

  • batch_key (str) – Key in adata.obs containing batch information.

  • label_key (str) – Key in adata.obs containing biological label information (usually cell type).

  • embedding_key (Union[str, list[str], None] (default: None)) – Key(s) in adata.obsm containing embeddings to evaluate.

  • reconstructed_keys (Optional[list[str]] (default: None)) – List of keys in adata.layers containing reconstructed features.

  • data_id (str (default: "")) – Identifier for the dataset being evaluated.

  • n_top_genes (Optional[int] (default: None)) – Number of top genes to use for HVG selection. If None, all genes are used.

  • max_cells (Optional[int] (default: None)) – Maximum number of cells to use for evaluation. If None, all cells are used.

  • logger (Optional[Logger] (default: None)) – Logger object for logging messages.

  • in_parallel (bool (default: True)) – Whether to compute metrics in parallel (see compute_metrics_in_parallel).

compute_average_scores(bio_results, batch_results)

Computes average bio-conservation and batch-integration scores.

Return type:

dict[str, Any]
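
The averaging step can be illustrated with a minimal sketch. This is not the library's implementation: the function name average_scores, the output keys, the metric names, and the choice to skip NaN values are all assumptions made for illustration.

```python
import math
from statistics import mean
from typing import Any


def average_scores(bio_results: dict[str, Any], batch_results: dict[str, Any]) -> dict[str, Any]:
    """Average each group of metric values, skipping missing or NaN entries."""

    def _mean(results: dict[str, Any]) -> float:
        # Keep only numeric, non-NaN values (assumed policy, not confirmed by the docs).
        values = [
            float(v)
            for v in results.values()
            if isinstance(v, (int, float)) and not math.isnan(v)
        ]
        return mean(values) if values else float("nan")

    return {
        "bio_conservation": _mean(bio_results),    # hypothetical output key
        "batch_integration": _mean(batch_results),  # hypothetical output key
    }


# Hypothetical metric names and values, for illustration only.
bio = {"ARI": 0.8, "NMI": 0.6, "ASW": float("nan")}
batch = {"graph_connectivity": 0.9, "pcr": 0.5}
scores = average_scores(bio, batch)
```

Skipping NaN entries keeps one failed metric from turning the whole average into NaN; whether scibEvaluator does the same is not stated here.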

compute_metrics(adata, adata_pre=None, adata_post=None, use_rep=None, cluster_key='cluster', type_='full', data_type='')

Computes metrics on the specified data representation.

Return type:

dict[str, Any]

compute_metrics_in_parallel(adata, metrics)

Computes metrics in parallel using a ThreadPoolExecutor.
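
The general ThreadPoolExecutor pattern can be sketched as follows. This is a standalone illustration, not the method's actual code: the function run_metrics_in_parallel, the shape of the metrics mapping (name to callable), and the NaN fallback on failure are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable


def run_metrics_in_parallel(
    data: Any, metrics: dict[str, Callable[[Any], float]]
) -> dict[str, float]:
    """Run each metric function on `data` in a worker thread and collect results."""
    results: dict[str, float] = {}
    with ThreadPoolExecutor() as executor:
        # Map each submitted future back to the metric name it computes.
        future_to_name = {
            executor.submit(fn, data): name for name, fn in metrics.items()
        }
        for future in as_completed(future_to_name):
            name = future_to_name[future]
            try:
                results[name] = future.result()
            except Exception:
                # Assumed policy: a failing metric yields NaN instead of aborting the run.
                results[name] = float("nan")
    return results


# Toy stand-ins for real metric functions (hypothetical).
toy_metrics = {
    "mean": lambda xs: sum(xs) / len(xs),
    "range": lambda xs: max(xs) - min(xs),
    "broken": lambda xs: 1 / 0,
}
parallel_results = run_metrics_in_parallel([1.0, 2.0, 3.0], toy_metrics)
```

Threads only speed things up when the metric code releases the GIL (as much NumPy-based code does) or waits on I/O; pure-Python metrics run no faster this way.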

evaluate()

Computes metrics for raw data, embeddings, and reconstructed data.

Return type:

DataFrame
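
A typical call sequence might look like the sketch below. The keyword names come from the class signature above; the helper run_scib_evaluation and the key values ("batch", "cell_type", "X_emb", "reconstructed") are hypothetical and must match what the AnnData object actually contains.

```python
def run_scib_evaluation(adata):
    """Build a scibEvaluator for `adata` and return the DataFrame from evaluate()."""
    # Imported inside the function so this sketch loads without mmcontext installed.
    from mmcontext.eval.evaluate_scib import scibEvaluator

    evaluator = scibEvaluator(
        adata=adata,
        batch_key="batch",                     # hypothetical column in adata.obs
        label_key="cell_type",                 # hypothetical column in adata.obs
        embedding_key="X_emb",                 # hypothetical key in adata.obsm
        reconstructed_keys=["reconstructed"],  # hypothetical key in adata.layers
        data_id="example",
        n_top_genes=2000,  # illustrative value; None uses all genes
        max_cells=5000,    # illustrative value; None uses all cells
    )
    return evaluator.evaluate()
```

Calling run_scib_evaluation(adata) with a prepared AnnData object would then return the metrics DataFrame described above.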

Methods table

compute_average_scores(bio_results, ...) – Computes average bio-conservation and batch-integration scores.

compute_metrics(adata[, adata_pre, ...]) – Computes metrics on the specified data representation.

compute_metrics_in_parallel(adata, metrics) – Computes metrics in parallel using a ThreadPoolExecutor.

evaluate() – Computes metrics for raw data, embeddings, and reconstructed data.
