mmcontext.eval.label_similarity.LabelSimilarity

class mmcontext.eval.label_similarity.LabelSimilarity

Bases: BaseEvaluator

Compute similarity scores and ROC metrics for each unique label.

For each unique value v in labels:
  • Compute similarity scores between the cell embeddings and the label prototype
  • Generate ROC curves and calculate AUC scores
  • Create a UMAP visualization colored by similarity scores
  • Plot the similarity score distributions
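
The per-label scoring above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: it assumes one prototype vector per label (here a hypothetical `prototypes` dict), cosine similarity (matching the default `similarity = 'cosine'`), and a one-vs-rest AUC computed by pairwise comparison. The helper names `cosine` and `roc_auc` and the toy data are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def roc_auc(scores, is_positive):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        return float("nan")  # AUC is undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four cells in 2-D and one prototype vector per label.
cell_emb = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
cell_labels = ["T cell", "T cell", "B cell", "B cell"]
prototypes = {"T cell": [1.0, 0.0], "B cell": [0.0, 1.0]}

auc_per_label = {}
for label, proto in prototypes.items():
    # Score every cell against this label's prototype, then treat the
    # label as the positive class and all other labels as negatives.
    scores = [cosine(cell, proto) for cell in cell_emb]
    auc_per_label[label] = roc_auc(scores, [lab == label for lab in cell_labels])
```

The pairwise AUC is quadratic in the number of cells, which is fine for a sketch; a rank-based formulation or a library routine would be the usual choice for real data.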

It additionally computes:
  • Accuracy, by assigning each cell the label with the highest similarity
  • Random baseline accuracy, based on the label distribution
  • Standard deviation of the AUC scores
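
As a rough sketch of the accuracy and baseline computations, assuming a similarity matrix with one row per cell and one column per label. The source does not define the random baseline precisely; the version below assumes guessing in proportion to label frequency, which gives an expected accuracy of the sum of squared label proportions. The function names and data are hypothetical.

```python
from collections import Counter


def argmax_accuracy(sim_rows, label_order, true_labels):
    """Fraction of cells whose highest-similarity label equals the true label."""
    correct = 0
    for row, truth in zip(sim_rows, true_labels):
        best = max(range(len(row)), key=row.__getitem__)  # column of the top score
        correct += label_order[best] == truth
    return correct / len(true_labels)


def random_baseline(true_labels):
    """Expected accuracy when guessing labels in proportion to their frequency.

    Assumption: this is one plausible reading of "based on label distribution".
    """
    n = len(true_labels)
    return sum((count / n) ** 2 for count in Counter(true_labels).values())


# Hypothetical similarity matrix: rows are cells, columns follow label_order.
label_order = ["T cell", "B cell"]
sim_rows = [[0.9, 0.1], [0.8, 0.3], [0.6, 0.4], [0.2, 0.7]]
true_labels = ["T cell", "T cell", "B cell", "B cell"]

accuracy = argmax_accuracy(sim_rows, label_order, true_labels)
baseline = random_baseline(true_labels)
```

In this toy case the third cell is assigned to the wrong label, so three of four assignments are correct, and two equally frequent labels give a baseline of one half.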

Returns:
  • AUC score for each label
  • Mean AUC across all labels
  • Standard deviation of the AUC scores
  • Accuracy score (ratio of correct assignments)
  • Random baseline accuracy
  • Ratio of accuracy to the random baseline
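
To make the returned quantities concrete, the aggregation can be sketched as below. The dictionary keys, the input values, and the use of the population standard deviation are all assumptions for illustration; the actual field names of `EvalResult` are not given in this page.

```python
import statistics

# Hypothetical per-label AUC values and accuracy figures.
auc_per_label = {"T cell": 0.90, "B cell": 0.80, "NK cell": 0.70}
accuracy, random_baseline = 0.60, 0.40

aucs = list(auc_per_label.values())
summary = {
    # One entry per label, e.g. "auc_T cell" (key naming is hypothetical).
    **{f"auc_{label}": auc for label, auc in auc_per_label.items()},
    "mean_auc": statistics.fmean(aucs),
    # Population std is assumed here; the library may use the sample std.
    "std_auc": statistics.pstdev(aucs),
    "accuracy": accuracy,
    "random_baseline": random_baseline,
    # Values above 1 mean the assignments beat random guessing.
    "accuracy_over_random": accuracy / random_baseline,
}
```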

Produces:
  • ROC curve plots
  • UMAP visualizations
  • Similarity score histograms

compute(emb1, *, emb2, labels, label_key, label_kind, out_dir=None, skip_plotting=None, **kw)

Compute similarity scores and ROC metrics for each unique label.

Return type: EvalResult

plot(emb1, out_dir, *, emb2, labels, label_key, label_kind, save_format='png', figsize=(6, 6), dpi=300, font_size=12, font_style='normal', font_weight='normal', legend_fontsize=10, axis_label_size=12, axis_tick_size=10, frameon=False, skip_plotting=None, **kw)

Generate plots for each unique label, using the cached similarity matrix if available.

Return type: None

plot_only(out_dir, *, label_key, label_kind, save_format='png', figsize=(6, 6), dpi=300, font_size=12, font_style='normal', font_weight='normal', legend_fontsize=10, axis_label_size=12, axis_tick_size=10, frameon=False, skip_plotting=None, **kw)

Generate plots using only cached data (no embeddings required).

This method regenerates plots without recomputing embeddings or similarity matrices. It is useful for adjusting plot parameters or output formats without rerunning the expensive computation.

Return type: None

bins = 40
cache_results = True
name = 'LabelSimilarity'
produces_plot = True
requires_pair = True
similarity = 'cosine'
skip_plotting = False
umap_min_dist = 0.1
umap_n_neighbors = 15
umap_random_state = 42
