Plotting

Clustering Results

plot.py

Functions for visualizations.

Classes

Functions

get_kde_outliers(df, x_col, y_col, *, min_x=0.0, levels=8, cut=0, top_n=None, scale='zscore', return_mask=False)

KDE-based outlier detection, with optional ranking of top_n most extreme points.

Outlier definition:

- Fit a 2D KDE on (x_col, y_col) for eligible points.
- Find points outside the outermost contour.

Ranking (when top_n is not None):

- Compute distance in optionally scaled (x, y) space.
- scale="zscore": standardize by mean & std
- scale="robust": standardize by median & IQR
- scale="none": use raw (x, y)
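The three scaling options can be illustrated with a minimal standard-library sketch. It is not the library's implementation: the helper names scaled_distances and top_n_indices are hypothetical, and it assumes that "distance" means Euclidean distance from the center of the scaled cloud (from the origin when scale="none").

```python
import math
import statistics


def scaled_distances(points, scale="zscore"):
    """Euclidean norm of each (x, y) point after per-axis scaling (assumed metric)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def standardize(values):
        if scale == "none":
            return list(values)  # raw coordinates, no centering
        if scale == "zscore":
            center, spread = statistics.mean(values), statistics.pstdev(values)
        elif scale == "robust":
            q1, _, q3 = statistics.quantiles(values, n=4)
            center, spread = statistics.median(values), q3 - q1
        else:
            raise ValueError(f"unknown scale: {scale!r}")
        spread = spread or 1.0  # guard against a zero spread
        return [(v - center) / spread for v in values]

    return [math.hypot(x, y) for x, y in zip(standardize(xs), standardize(ys))]


def top_n_indices(points, top_n, scale="zscore"):
    """Indices of the top_n points with the largest scaled distance."""
    dist = scaled_distances(points, scale)
    order = sorted(range(len(points)), key=lambda i: dist[i], reverse=True)
    return order[:top_n]


points = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5), (10.0, 8.0)]
print(top_n_indices(points, top_n=1, scale="robust"))
```

Robust scaling is less sensitive to the extreme points being ranked, which is why it can order outliers differently from z-scoring on heavy-tailed data.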

Parameters:

Name Type Description Default
df DataFrame

Must contain columns x_col and y_col.

required
x_col str

Column names in df to use as x and y axes.

required
y_col str

Column names in df to use as x and y axes.

required
min_x float or None

Minimum x value for eligibility; points with x <= min_x are ignored. If None, all finite points are eligible.

0.0
levels int

Number of KDE contour levels.

8
cut float

KDE cut parameter (see seaborn.kdeplot).

0
top_n int or None

If not None, return only the top_n most extreme outliers.

None
scale ('none', 'zscore', 'robust')

Scaling method for distance computation when ranking outliers.

"zscore"
return_mask bool

If True, also return a boolean mask aligned to df.index indicating outlier status.

False

Returns:

Type Description
outliers_df
mask(optional)

in_outer_contour(x, y, paths)

Return True if (x, y) lies inside ANY of the given matplotlib.path.Path objects.
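The "inside ANY path" logic can be sketched without matplotlib. The example below swaps in a plain ray-casting test on vertex lists in place of matplotlib.path.Path; the function names are hypothetical and the polygons are made-up data.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) is inside the closed polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_any_polygon(x, y, polygons):
    """True if (x, y) lies inside at least one polygon."""
    return any(point_in_polygon(x, y, poly) for poly in polygons)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(5, 5), (7, 5), (6, 7)]
print(in_any_polygon(1, 1, [square, triangle]))
print(in_any_polygon(3, 3, [square, triangle]))
```

A point counts as an outlier in the KDE sense when this test fails for every outermost contour polygon.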

make_mode_grid(modes, *, n_cols=4, panel_size=(4.0, 2.5), dpi=150)

Create a figure + gridspec layout for a list of modes, returning a dict {mode_name: ax}.

- Rows/cols computed from len(modes) and n_cols.
- panel_size gives (width, height) in inches per cell.

Example usage:

fig, ax_by_mode = make_mode_grid(modes, n_cols=4)
for mode in modes:
    plot_mode_P_profile(results, mode, ax=ax_by_mode[mode])
fig.tight_layout()
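The layout arithmetic described above can be sketched as follows. This is an assumption about how rows and figure size follow from len(modes), n_cols, and panel_size; grid_shape and cell_of are hypothetical helper names.

```python
import math


def grid_shape(n_modes, n_cols=4, panel_size=(4.0, 2.5)):
    """Return (n_rows, n_cols, figsize) for a grid holding n_modes panels."""
    n_rows = math.ceil(n_modes / n_cols)
    width, height = panel_size  # inches per cell
    return n_rows, n_cols, (n_cols * width, n_rows * height)


def cell_of(index, n_cols=4):
    """Row-major (row, col) position of the index-th mode."""
    return divmod(index, n_cols)


print(grid_shape(10))
print(cell_of(5))
```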

make_mode_grid_by_K(results, *, modes=None, panel_size=(3.0, 2.5), dpi=150)

Create a figure whose axes layout matches plot_Q_grid:

- Rows correspond to distinct K values (sorted).
- Within each row, columns correspond to modes with that K,
  in the order of `modes` (or results.modes if None).
- Returns a mapping {mode_name: ax} for the cells actually used.

Parameters:

Name Type Description Default
results ClumpplingResults

Must have Q_by_mode populated.

required
modes sequence of str

If provided, only these modes are laid out (in this order). Otherwise use results.modes.

None
panel_size tuple of float

(width, height) in inches per panel.

(3.0, 2.5)
dpi int
150

Returns:

Name Type Description
fig Figure
axes_by_mode dict

Mapping mode_name -> Axes in the grid.
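The row-by-K layout rule can be sketched in a few lines. The helper layout_by_K is hypothetical, and mode_K is assumed to be a plain dict mapping mode name to K.

```python
def layout_by_K(modes, mode_K):
    """Map each mode to a (row, col) cell: one row per distinct K, sorted."""
    rows = {}
    for mode in modes:  # preserve the order of `modes` within each row
        rows.setdefault(mode_K[mode], []).append(mode)
    return {
        mode: (r, c)
        for r, K in enumerate(sorted(rows))
        for c, mode in enumerate(rows[K])
    }


modes = ["K3M1", "K2M1", "K3M2"]
mode_K = {"K3M1": 3, "K2M1": 2, "K3M2": 3}
print(layout_by_K(modes, mode_K))
```

Rows with fewer modes than the widest row simply leave their trailing cells unused, which matches the "cells actually used" wording above.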

plot_P_profile(P_sorted, LFC_sorted, ax=None, title='', lw=0.2)

Plot sorted log2(P) along cluster index, coloring each gene's curve by the argmax of its LFC profile.

Parameters:

Name Type Description Default
P_sorted ndarray

(M, K) array of sorted P values per gene.

required
LFC_sorted ndarray

(M, K) array of log fold change values per gene.

required
ax Axes

If given, draw into this Axes.

None
title str

Title for the plot.

''
lw float

Line width for each gene's curve.

0.2
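What each curve carries can be sketched with plain lists: the y-values are log2 of the sorted P row, and the color index is the argmax of the matching LFC row. The helper profile_curves is hypothetical, and the sketch assumes all P values are strictly positive.

```python
import math


def profile_curves(P_sorted, LFC_sorted):
    """Return one (log2 P values, color index) pair per gene."""
    curves = []
    for p_row, lfc_row in zip(P_sorted, LFC_sorted):
        log_p = [math.log2(p) for p in p_row]  # requires p > 0
        color_idx = max(range(len(lfc_row)), key=lfc_row.__getitem__)
        curves.append((log_p, color_idx))
    return curves


P_sorted = [[0.5, 0.25], [1.0, 0.125]]
LFC_sorted = [[0.1, 2.0], [3.0, 0.5]]
print(profile_curves(P_sorted, LFC_sorted))
```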

plot_Q_grid(results, *, sort_by='max', cmap=None, figsize=None, n_ticks=8)

Plot Q heatmaps for all modes in a grid, using results.mode_names_list as layout (rows by K, columns by mode within each K), with a single shared colorbar on the right.

plot_Q_heatmap(results, mode_name, *, sort_by='max', cmap=None, colorbar=True, ax=None)

Plot a heatmap of Q for a single mode.

Parameters:

Name Type Description Default
results ClumpplingResults

Container with aligned Q matrices.

required
mode_name str

Mode to plot (must be a key in results.Q_by_mode).

required
sort_by ('max', 'none')

If "max", sort individuals by their max cluster membership. If "none", keep original row order.

"max"
cmap str or Colormap

Colormap to use in imshow (e.g. "viridis", "plasma").

None
colorbar bool

Whether to add a colorbar for this subplot.

True
ax matplotlib Axes

If provided, draw into this axes; otherwise create a new Figure.

None

Returns:

Type Description
(fig, ax)
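One plausible reading of sort_by="max" is sketched below: group individuals by their dominant cluster, then order each group by decreasing membership in that cluster. This ordering is an assumption, not a confirmed detail of plot_Q_heatmap, and order_by_max_membership is a hypothetical name.

```python
def order_by_max_membership(Q):
    """Row order: by argmax cluster, then by descending max membership."""
    def key(i):
        row = Q[i]
        top = max(range(len(row)), key=row.__getitem__)
        return (top, -row[top])

    return sorted(range(len(Q)), key=key)


Q = [[0.1, 0.9], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]
print(order_by_max_membership(Q))
```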

plot_cluster_bars(results, mode_name, colors=None, *, ax=None)

Plot bar chart of total membership per cluster for a given mode.

Parameters:

Name Type Description Default
results ClumpplingResults
required
mode_name str

Mode to plot.

required
ax Axes

If given, draw into this Axes.

None

Returns:

Type Description
(fig, ax)
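The bar heights are presumably the column sums of the mode's Q matrix. A minimal sketch under that assumption, with a hypothetical helper name:

```python
def total_membership_per_cluster(Q):
    """Sum membership over individuals (rows) for each cluster (column)."""
    return [sum(col) for col in zip(*Q)]


Q = [[0.1, 0.9], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]
print(total_membership_per_cluster(Q))
```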

plot_cluster_in_grid(results, coords, mode_name, cluster_index, *, cmap=None, xlabel='Dim 1', ylabel='Dim 2', base_size=5.0, size_scale=20.0, figsize=None, colorbar=True, **scatter_kwargs)

Plot membership for a single (mode, cluster) in the full grid layout where rows = modes and columns = clusters (0..K_max-1), using results.mode_sep_coord_dict to place that cluster in the correct cell.

All other cells are left empty / invisible.

Parameters:

Name Type Description Default
results ClumpplingResults
required
coords array, shape (n_samples, 2)

2D coordinates (UMAP, t-SNE, etc.).

required
mode_name str

Mode name, must be present in results.mode_sep_coord_dict keys.

required
cluster_index int

Cluster index (column in Q) for that mode.

required
cmap str or Colormap

Colormap for membership intensity.

None
xlabel str

Axis labels for the occupied cell.

'Dim 1'
ylabel str

Axis labels for the occupied cell.

'Dim 2'
base_size float

Base point size.

5.0
size_scale float

Additional scale times membership value.

20.0
figsize tuple

Figure size for the full grid.

None
colorbar bool

Whether to draw a colorbar for the occupied cell.

True
**scatter_kwargs

Extra kwargs passed to ax.scatter for that cell.

{}

Returns:

Type Description
fig, axes : Figure and 2D axes array for the full grid.
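The base_size and size_scale parameters suggest a linear marker-size rule. The sketch below assumes size = base_size + size_scale * membership; marker_sizes is a hypothetical helper.

```python
def marker_sizes(membership, base_size=5.0, size_scale=20.0):
    """Per-point marker size growing linearly with membership (assumed rule)."""
    return [base_size + size_scale * q for q in membership]


print(marker_sizes([0.0, 0.5, 1.0]))
```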

plot_cluster_overlay(results, coords, *, cluster_colors=None, val_threshold=0.5, s=0.05, alpha=0.6, vmin=0.0, vmax=1.0, figsize=None, dpi=150, suptitle=None, suptitle_kwargs=None)

Overlay membership for all clusters within each mode, on a mode-grid:

rows = K values (in results.K_range order)
cols = modes within each K (using results.mode_coord_dict)

Each axis shows all clusters for that mode, with different base colors.

plot_cluster_panels(results, coords, *, cluster_colors=None, val_threshold=0.0, s=1.0, alpha=1.0, vmin=0.0, vmax=1.0, figsize=None, dpi=150, suptitle=None, suptitle_kwargs=None)

Plot membership on 2D coords for each (mode, cluster) in a grid:

rows  = modes (in results.modes order)
cols  = cluster index 0..K_max-1

using results.mode_sep_coord_dict to place each (mode, cluster).

Each cell contains ONE cluster's membership (white→cluster_color).

plot_cluster_scatter(coords, cluster_labels, *, cmap=None, colorbar=True, xlabel='Dim 1', ylabel='Dim 2', title=None, ax=None, max_colorbar_ticks=8, **scatter_kwargs)

Scatter plot of 2D coordinates colored by (discrete) cluster labels.

plot_feature_across_modes(df_pvs_modes, modes, selected_feature, custom_color_dict, *, x_col='weighted_Psum', y_col='sepLFC', sep_col='sepCls', xlim=None, ylim=None, figsize=(3.5, 4), dpi=150, legend_loc='upper right', legend_bbox_to_anchor=(0.0, 0.9), style_label=None, ax=None)

For a focal gene, collect (weighted_Psum, sepLFC, sepCls) across modes and make the scatter-with-labels plot in one shot.

plot_feature_bar(df, *, mode_name=None, metric='weighted_Psum', top_n=20, ax=None)

Bar plot of top-N features by a given metric (e.g. weighted_Psum).

Parameters:

Name Type Description Default
df DataFrame

Index = feature names, must contain column metric.

required
mode_name str

For titling.

None
metric str
"weighted_Psum"
top_n int

Number of top features to show.

20
ax Axes
None

Returns:

Type Description
(fig, ax)

plot_feature_kde(df, x_col, y_col, outlier_mask, *, mode_name=None, label_col=None, levels=8, cmap='viridis_r', bg_point_size=10.0, bg_alpha=0.1, outlier_point_size=30.0, outlier_alpha=0.85, x_pad_frac=0.02, y_pad_frac=0.05, min_x_pad=0.005, min_y_pad=1.0, adjust_text_kwargs=None, ax=None, dpi=150)

Plot a scatter + filled KDE contour + labeled outlier points for a (x, y) feature pair.

Parameters:

Name Type Description Default
df DataFrame

Must contain columns x_col and y_col.

required
x_col str

Column names in df to use as x and y axes.

required
y_col str

Column names in df to use as x and y axes.

required
outlier_mask ndarray(bool)

Boolean mask aligned to df.index indicating which points to label as outliers.

required
mode_name str

For titling; purely cosmetic.

None
label_col str, optional

Column name in df to use for outlier labels; if None, use df.index.

None
levels int

Number of KDE contour levels.

8
cmap str

Colormap for filled KDE contours.

"viridis_r"
bg_point_size float

Size of background scatter points.

10.0
bg_alpha float

Alpha for background scatter points.

0.1
outlier_point_size float

Size of outlier scatter points.

30.0
outlier_alpha float

Alpha for outlier scatter points.

0.85
x_pad_frac float

Fractional padding to add to the x-axis limits.

0.02
y_pad_frac float

Fractional padding to add to the y-axis limits.

0.05
min_x_pad float

Minimum padding to add to the x-axis limits.

0.005
min_y_pad float

Minimum padding to add to the y-axis limits.

1.0
adjust_text_kwargs dict

Additional keyword arguments to pass to adjust_text.

None
ax Axes

Matplotlib Axes to plot on; if None, a new figure and axes are created.

None
dpi int

Resolution of the figure in dots per inch.

150

Returns:

Type Description
(fig, ax)

plot_feature_metrics(df_mode, mode_name, x_col='weighted_Psum', y_col='sepLFC', sep_col='sepCls', annot_mask=None, xmax=None, ymax=None, custom_color_dict=None)

Scatter plot of feature metrics for a given mode, colored by separating class pattern.

Parameters:

Name Type Description Default
df_mode DataFrame

DataFrame containing feature metrics for the mode. Must include 'sepCls', x_col, and y_col.

required
mode_name str

Name of the mode (for title).

required
x_col str

Column name for x-axis metric (default is 'weighted_Psum').

'weighted_Psum'
y_col str

Column name for y-axis metric (default is 'sepLFC').

'sepLFC'
annot_mask Series or None

Boolean mask for annotating points (default is None).

None
xmax float or None

Maximum x-axis limit (default is None, which auto-scales).

None
ymax float or None

Maximum y-axis limit (default is None, which auto-scales).

None
custom_color_dict dict or None

Custom color dictionary for the categories in sep_col (default is None).

None

Returns:

Type Description
None

plot_feature_scatter(df, *, mode_name=None, x='weighted_Psum', y='sepLFC', highlight=None, ax=None)

Scatter plot of feature metrics, e.g. weighted_Psum vs sepLFC.

Parameters:

Name Type Description Default
df DataFrame

Must contain columns x and y, index = feature names.

required
mode_name str

For titling; purely cosmetic.

None
x str

Column names in df to use as axes.

'weighted_Psum'
y str

Column names in df to use as axes.

'sepLFC'
highlight iterable of str

Feature names (index values) to annotate.

None
ax Axes
None

Returns:

Type Description
(fig, ax)

plot_feature_sepLFC_across_modes(res_model, df_pvs_modes, selected_feature, feature_names, colors, *, label_rank=True, dpi=150, ax=None)

Horizontal bar plot of sepLFC for a focal gene across all modes.

Parameters:

Name Type Description Default
res_model

ClumpplingResults-like object, with attributes: - modes: list of mode names - mode_K: dict[mode_name -> K] - P_aligned_by_mode: dict[mode_name -> P matrix] (not used, but available)

required
df_pvs_modes Mapping[str, 'pd.DataFrame']

Dict mapping mode_name -> DataFrame with columns ['sepLFC', 'sepCls']. Row order must align with feature_names.

required
selected_feature str

Feature name to plot.

required
feature_names Sequence[str]

Sequence of all feature names; selected_feature must be in this list.

required
colors Sequence

Sequence of colors indexed by cluster index (0-based).

required
label_rank bool

If True, annotate each bar with the rank of the focal gene by sepLFC.

True
dpi int

Figure DPI.

150
ax Optional[Axes]

Optional existing Axes to plot into.

None

Returns:

Type Description
(fig, ax)

Matplotlib Figure and Axes.

plot_mode_P_profile(results, mode_name, ax=None, title=None, lw=0.2)

For a single mode, compute the clustering profile and plot sorted log P.

plot_sepLFC_dist(results, mode_name, *, lfc_threshold=10.0, ax=None, title=None)

For a single mode, plot distribution of sepLFC by 'how many clusters are separated' (index of sorted cluster before the sepLFC gap).

plot_sepLFC_labels(df_selected, modes, *, sepLFC_threshold=0.0, cmap='Reds', vmin=1e-05, vmax=None, y_max=40.0, hi_sepLFC_threshold=32.0, n_top_hi=15, n_top_lo=8, figsize_scale=0.95, dpi=150)

For each mode in modes, plot:

  • a vertical axis of sepLFC values,
  • a rug plot of all genes with sepLFC > sepLFC_threshold,
  • labeled horizontal lines for the top sepLFC genes, colored by weighted_Psum,
  • all panels share a single horizontal colorbar (weighted_Psum) on top.

Parameters:

Name Type Description Default
df_selected DataFrame

Wide table with columns like: - weighted_Psum_{mode_name} - sepLFC_{mode_name} and index = gene IDs.

required
modes sequence of str

Mode names used to derive the column suffixes.

required
sepLFC_threshold float

Only genes with sepLFC > threshold are included per mode.

0.0
cmap str

Colormap used to encode weighted_Psum.

"Reds"
vmin float

For LogNorm. If vmax is None, it's computed from df_selected across all modes and sepLFC > sepLFC_threshold.

1e-05
vmax float

For LogNorm. If vmax is None, it's computed from df_selected across all modes and sepLFC > sepLFC_threshold.

None
y_max float

ymax used for y-axis; also used in label positioning logic.

40.0
hi_sepLFC_threshold float

If the top sepLFC in a mode exceeds this, up to n_top_hi labels per mode are shown; otherwise, up to n_top_lo.

32.0
n_top_hi int

See above.

15
n_top_lo int

See above.

8
figsize_scale float

Scale factor for figure width: width = figsize_scale * len(modes).

0.95
dpi int
150

Returns:

Name Type Description
fig Figure
axes_by_mode dict

Mapping mode_name -> Axes for that panel.
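The per-mode label budget described for hi_sepLFC_threshold, n_top_hi, and n_top_lo can be sketched as below. The helper select_labeled_genes and its dict input are hypothetical; only the thresholding rule is taken from the parameter descriptions.

```python
def select_labeled_genes(sepLFC_by_gene, sepLFC_threshold=0.0,
                         hi_sepLFC_threshold=32.0, n_top_hi=15, n_top_lo=8):
    """Pick the genes to label in one mode, ranked by decreasing sepLFC."""
    kept = {g: v for g, v in sepLFC_by_gene.items() if v > sepLFC_threshold}
    if not kept:
        return []
    ranked = sorted(kept, key=kept.get, reverse=True)
    # A mode with a very large top sepLFC gets the larger label budget.
    n_top = n_top_hi if kept[ranked[0]] > hi_sepLFC_threshold else n_top_lo
    return ranked[:n_top]


values = {"g1": 40.0, "g2": 12.0, "g3": 3.0, "g4": -1.0}
print(select_labeled_genes(values, n_top_hi=3, n_top_lo=2))
```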

plot_spatial_membership(Q, coords, ref_color, *, cls_idx=0, ax=None, val_threshold=0.0, vmin=0.0, vmax=1.0, s=1.0, alpha=1.0, title=None, keep_ticks=False)

Plot a single colored scatter layer of 2D coordinates weighted by membership.

Parameters:

Name Type Description Default
Q array-like

Either an (n_cells, K) membership matrix, or an (n_cells,) vector.

required
coords array-like

Either: - (n_cells, 2) array of [x, y] coordinates, or - tuple (x, y) of 1D arrays.

required
ref_color color spec

Base color for the membership colormap (e.g. cmap(k), 'tab:blue', (r,g,b)).

required
cls_idx int

If Q is (n_cells, K), which column to use as membership. Ignored if Q is 1D.

0
ax Axes

Existing axes to draw on. If None, a new figure and axes are created.

None
val_threshold float

Only plot points with membership > val_threshold.

0.0
vmin float

Range of membership values for colormap normalization.

0.0
vmax float

Range of membership values for colormap normalization.

1.0
s float

Marker size.

1.0
alpha float

Marker alpha.

1.0
title str

Title for the axis (only set if not None).

None
keep_ticks bool

If False (default), remove x/y ticks.

False

Returns:

Name Type Description
ax Axes
sp PathCollection

The scatter object.
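How val_threshold, vmin, vmax, and ref_color might interact is sketched below. It assumes a linear white-to-ref_color ramp (as in the "white→cluster_color" description of plot_cluster_panels) and RGB tuples in the 0-1 range; membership_colors is a hypothetical helper, not part of the module.

```python
def membership_colors(q, ref_rgb, val_threshold=0.0, vmin=0.0, vmax=1.0):
    """Return (index, rgb) for each point with membership > val_threshold."""
    out = []
    for i, value in enumerate(q):
        if value <= val_threshold:
            continue  # point is not drawn
        # Normalize to [0, 1] and clip, as a colormap normalization would.
        t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
        # Interpolate from white (1, 1, 1) toward the reference color.
        rgb = tuple(1.0 + t * (c - 1.0) for c in ref_rgb)
        out.append((i, rgb))
    return out


print(membership_colors([0.0, 0.5, 1.0], ref_rgb=(0.0, 0.0, 1.0)))
```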

Model Comparison

plot_comparison.py

Multi-model comparison visualizations.

Classes

Functions

plot_avg_membership_barh(avg_cls_memberships, *, annot_col='annot', model_order=None, cluster_order=None, cluster_mode='auto', colors=None, figsize_per_cluster=(2.2, 4.0), dpi=150)

Generalized horizontal grouped bar charts comparing cluster memberships across multiple models.

Parameters:

Name Type Description Default
avg_cls_memberships Dict[str, DataFrame]

Dict of {model_name: df}. Each df: rows = annotation groups, columns = clusters, plus an annot_col column. Example: {"rna": df_rna, "atac": df_atac, "multiome": df_mo}

required
annot_col str

Column name for annotation groups.

'annot'
model_order Optional[Sequence[str]]

Order of modalities in the legend/hue. If None, uses dict insertion order.

None
cluster_order Optional[Sequence[str]]

Optional explicit order for clusters (subset will be used).

None
cluster_mode Literal['intersection', 'union', 'auto']

How to determine clusters across models: - "intersection": use only clusters present in all models - "union": use all clusters across models - "auto": use intersection if non-empty else union

'auto'
colors Optional[Sequence[str]]

Optional colors used for cluster title text (per cluster index).

None
figsize_per_cluster Tuple[float, float]

(width_per_cluster, height) for 1 x K layout.

(2.2, 4.0)
dpi int

Figure DPI.

150

Returns:

Type Description
(fig, axes)
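The three cluster_mode options reduce to set operations over each model's cluster columns. A minimal sketch, assuming clusters are identified by column name and that first-seen order is kept; resolve_clusters is a hypothetical helper.

```python
def resolve_clusters(clusters_by_model, cluster_mode="auto"):
    """Choose which clusters to plot across models, in first-seen order."""
    sets = [set(cols) for cols in clusters_by_model.values()]
    common = set.intersection(*sets) if sets else set()
    union = set().union(*sets)
    if cluster_mode == "intersection":
        chosen = common
    elif cluster_mode == "union":
        chosen = union
    elif cluster_mode == "auto":
        chosen = common or union  # fall back to union when nothing is shared
    else:
        raise ValueError(f"unknown cluster_mode: {cluster_mode!r}")
    ordered = []
    for cols in clusters_by_model.values():
        for c in cols:
            if c in chosen and c not in ordered:
                ordered.append(c)
    return ordered


by_model = {"rna": ["c0", "c1", "c2"], "atac": ["c1", "c2", "c3"]}
print(resolve_clusters(by_model, "auto"))
```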

plot_compmodels_Q_grid(comp_res, coords, models=None, models_plot_order=None, val_threshold=0.5, s=0.05, colors=None, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92)

Plot membership on 2D coords (e.g. UMAP) for all modes in each model.

Layout: columns = models, rows = modes within each model.

Parameters:

Name Type Description Default
comp_res CompModelsResults

Loaded comparison results (from io.load_compmodels_results).

required
coords array-like

(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode.

required
models list of str

Subset of models to include; defaults to all in comp_res.models.

None
models_plot_order list of str

Order of columns; if None, uses models.

None
val_threshold float

Membership threshold below which points are omitted for each cluster.

0.5
s float

Marker size passed to plot_spatial_membership.

0.05
colors Sequence

Sequence of colors used for clusters; default is tab20.

None
figsize_scale (float, float)

Scale factors for figure size: (width_per_col, height_per_row).

(2.5, 2.0)
suptitle str

Overall figure title.

None
y_suptitle float

y position of suptitle.

0.92

plot_compmodels_Q_selected(comp_res, coords, model_mode_list, *, n_rows=None, n_cols=None, val_threshold=0.5, s=0.05, colors=None, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92)

Plot membership on 2D coords (e.g. UMAP) for a selected set of modes.

Layout: one panel per (model, mode) in model_mode_list. Grid size can be specified by n_rows / n_cols; otherwise defaults to a single row.

Parameters:

Name Type Description Default
comp_res CompModelsResults

Loaded comparison results (from io.load_compmodels_results). Must have attributes: - Q_by_mode : dict[full_mode_name -> ndarray (n_cells, K)] - mode_stats_by_model : dict[model_name -> DataFrame] with index 'Mode' (short_mode) and column 'Size'

required
coords array-like

(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode.

required
model_mode_list sequence of (model_name, short_mode)

List of specific modes to plot, e.g. [("rna.seurat.louvain", "K20M1"), ("rna.seurat.louvain", "K20M2"), ("rna.scanpy.leiden", "K18M1")]

required
n_rows int

Number of rows in the grid. If None and n_cols is None, uses 1 row.

None
n_cols int

Number of columns in the grid. If None and n_rows is None, uses len(model_mode_list) columns (single row).

None
val_threshold float

Membership threshold below which points are omitted for each cluster.

0.5
s float

Marker size passed to plot_spatial_membership.

0.05
colors Sequence

Sequence of colors used for clusters; default is tab20.

None
figsize_scale (float, float)

Scale factors for figure size: (width_per_col, height_per_row).

(2.5, 2.0)
suptitle str

Overall figure title.

None
y_suptitle float

y position of suptitle.

0.92

Returns:

Type Description
(fig, axes_by_model_mode)

fig : matplotlib.figure.Figure axes_by_model_mode : dict[(model_name, short_mode) -> Axes]
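The n_rows / n_cols defaults can be sketched as below. The single-row default comes from the parameter descriptions; filling in the missing dimension with a ceiling division when only one is given is an assumption, and resolve_grid is a hypothetical helper.

```python
import math


def resolve_grid(n_panels, n_rows=None, n_cols=None):
    """Return (n_rows, n_cols) for n_panels panels."""
    if n_rows is None and n_cols is None:
        return 1, n_panels  # documented default: a single row
    if n_rows is None:
        return math.ceil(n_panels / n_cols), n_cols  # assumed behavior
    if n_cols is None:
        return n_rows, math.ceil(n_panels / n_rows)  # assumed behavior
    return n_rows, n_cols


print(resolve_grid(3))
print(resolve_grid(5, n_cols=2))
```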

plot_compmodels_alignment_by_model(comp_res, cmap=None, *, models=None, models_plot_order=None, row_by_K=False, wspace_padding=1.3, marker_size=200.0, alt_ls=False, ls_alt=('-', '--'), lw=1.0, connect_identity=False, adjacent_only=True, label_modes=True, figsize_scale=(0.3, 2), dpi=150, pair_mappings=None)

Plot alignment between multiple models in a single graph.

Modes can be arranged in rows either by:

- mode index within each model (row_by_K=False), or
- grouped by K across models (row_by_K=True), so that modes with the same K value line up on the same row "band" across models.

When row_by_K=True:

- For each K, determine the maximum number of modes with that K across all selected models.
- Allocate that many rows for that K.
- If a model has fewer modes for that K, the corresponding slots are left empty (no markers drawn).

Parameters:

Name Type Description Default
comp_res CompModelsResults

Must provide:

- models
- modes_by_model: dict[model -> list[str]] (short mode names)
- full_mode_names: list[str] (e.g. "rna.seurat.K21M1")
- all_modes_alignment: dict[full_mode_name -> list[int]]
- alignment_across_all: dict["A-B" -> mapping list[int]]
- K_max: int
- get_Q(full_mode_name) or Q_by_mode[full_mode_name]

required
cmap

Either: - a matplotlib colormap (e.g. cm.get_cmap("tab20")) - a sequence of RGB tuples - None (defaults to tab20).

None
models sequence of str

Subset of models to include. Defaults to all comp_res.models.

None
models_plot_order sequence of str

Order of columns. Defaults to models.

None
row_by_K bool

If True, modes are grouped by K across models; only modes with the same K appear in the same row band. If False, rows are mode index per model.

False
wspace_padding float

Horizontal spacing factor between model columns, scaled by K_max.

1.3
marker_size float

Size of the cluster markers.

200.0
alt_ls bool

If True, use ls_alt to style edges for better visibility.

False
ls_alt sequence of str

Line styles; ls_alt[0] used for non-identity edges, ls_alt[1] used for identity edges (if connect_identity=True).

('-', '--')
lw float

Line width for edges.

1.0
connect_identity bool

If True, also draw thin light-grey (or ls_alt[1]) lines for identity mappings (same aligned column index). If False, only draw non-identity.

False
adjacent_only bool

If True, draw edges only between modes in adjacent model columns (to reduce clutter). If False, draw edges between any model pair.

True
label_modes bool

If True, write mode labels near each block; column headers = models.

True
figsize_scale (float, float)

Scale factors for figure size: (width_per_K, height_per_row). Width = n_models * K_max * width_per_K Height = n_rows * height_per_row

(0.3, 2)
dpi int

Figure dpi.

150
pair_mappings dict

Optional within-model pair mappings ("A-B" -> list[(col_idx_A, col_idx_B)]) to draw extra edges between successive modes of the same model.

None

Returns:

Name Type Description
fig Figure

The figure object.

ax Axes

The axes.
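The row_by_K=True allocation and the figure-size formula from figsize_scale can be sketched as follows. The input format (a dict of model name to the K of each of its modes) and both helper names are hypothetical.

```python
def allocate_rows_by_K(K_by_model):
    """Map (model, mode position) -> row, grouping equal-K modes into bands."""
    band_start, n_rows = {}, 0
    for K in sorted({K for Ks in K_by_model.values() for K in Ks}):
        band_start[K] = n_rows
        # Band height = largest number of modes with this K in any model.
        n_rows += max(Ks.count(K) for Ks in K_by_model.values())

    placement = {}
    for model, Ks in K_by_model.items():
        used = {}
        for pos, K in enumerate(Ks):
            placement[(model, pos)] = band_start[K] + used.get(K, 0)
            used[K] = used.get(K, 0) + 1
    return n_rows, placement


def figure_size(n_models, K_max, n_rows, figsize_scale=(0.3, 2)):
    """Width = n_models * K_max * width_per_K; height = n_rows * height_per_row."""
    width_per_K, height_per_row = figsize_scale
    return n_models * K_max * width_per_K, n_rows * height_per_row


n_rows, placement = allocate_rows_by_K({"A": [2, 3, 3], "B": [2, 2, 3]})
print(n_rows, placement)
```

In this example model A has one K=2 mode and model B has two, so the K=2 band is two rows tall and A leaves its second slot empty.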

plot_compmodels_alignment_list(comp_res, cmap=None, marker_size=250, figsize=(6, 6))

Plot the CompModels alignment pattern list via clumppling.plot_alignment_list, with modes ordered by K group to avoid a KeyError.

Requires comp_res to have:

- full_mode_names
- alignment_across_all
- all_modes_alignment
- get_Q(full_mode_name)

plot_compmodels_diff_grid(comp_res, pair_mappings, coords, ref_mode, models_plot_order=None, val_threshold=0.5, diff_threshold=0.5, *, colors=None, s=0.05, alpha=0.6, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92, strict_pair_mapping=True)

Plot difference in membership on 2D coords for all modes across models.

  • Use map_alt_to_ref to compute aligned differences.
  • For non-ref panels, plot a single overlaid diff scatter (per-cell).
  • Compute Δ = fraction(per_cell_diff > diff_threshold)

Parameters:

Name Type Description Default
comp_res CompModelsResults

Loaded comparison results (from io.load_compmodels_results).

required
pair_mappings dict

Dict mapping "ref_mode-alt_mode" -> list of (ref_k, alt_k) tuples.

required
coords array-like

(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode.

required
ref_mode str

Full mode name (e.g. "model_shortmode") to use as reference.

required
models_plot_order list of str

Order of models (columns); if None, uses all models in comp_res.

None
val_threshold float

Membership threshold below which points are omitted for each cluster.

0.5
diff_threshold float

Threshold for difference in membership to consider significant.

0.5
colors Sequence

Sequence of colors used for clusters; default is tab20.

None
s float

Marker size passed to plot_spatial_membership.

0.05
alpha float

Alpha value for scatter points.

0.6
figsize_scale (float, float)

Scale factors for figure size: (width_per_col, height_per_row).

(2.5, 2.0)
suptitle str

Overall figure title.

None
y_suptitle float

y position of suptitle.

0.92
strict_pair_mapping bool

If True, raise an error if a required pair mapping is missing.

True
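The Δ statistic, the fraction of cells whose per-cell difference exceeds diff_threshold, can be sketched as below. How the per-cell difference is defined is an assumption here (half the L1 distance between aligned membership rows); both helper names are hypothetical, and the alt matrix is assumed to be already mapped into the reference cluster space.

```python
def per_cell_diff(ref_row, alt_row):
    """Assumed per-cell difference: half the L1 distance between two rows."""
    return 0.5 * sum(abs(r - a) for r, a in zip(ref_row, alt_row))


def fraction_different(ref_Q, mapped_alt_Q, diff_threshold=0.5):
    """Fraction of cells with per-cell difference above diff_threshold."""
    diffs = [per_cell_diff(r, a) for r, a in zip(ref_Q, mapped_alt_Q)]
    return sum(d > diff_threshold for d in diffs) / len(diffs)


ref_Q = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
alt_Q = [[0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
print(fraction_different(ref_Q, alt_Q))
```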

plot_compmodels_diff_selected(comp_res, pair_mappings, coords, ref_mode, model_mode_list, *, n_rows=None, n_cols=None, val_threshold=0.5, diff_threshold=0.5, colors=None, s=0.05, alpha=0.6, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92, strict_pair_mapping=True)

Plot difference in membership on 2D coords for a selected set of modes.

Layout: one panel per (model, mode) in model_mode_list. Grid size can be specified by n_rows / n_cols; otherwise defaults to a single row.

Parameters follow the same pattern as plot_compmodels_diff_grid.

plot_discrete_colorbar(colors, K_max=None, *, labels=None, ax=None, figsize=None, dpi=150, facecolor='white')

Plot a simple discrete colorbar-like strip for cluster colors.

Parameters:

Name Type Description Default
colors Sequence[ColorSpec]

A sequence of color specs. Can be: - list of RGB tuples (0-1 range) - hex strings - named matplotlib colors

required
K_max Optional[int]

Number of clusters. If None, inferred as len(colors).

None
labels Optional[Sequence[str]]

X tick labels. If None, defaults to ["Cls.1", ..., "Cls.K"].

None
ax Optional[Axes]

Existing axes to draw on. If None, a new figure/axes is created.

None
figsize Optional[Tuple[float, float]]

Figure size (only used if ax is None). Default scales with K.

None
dpi int

Figure dpi (only used if ax is None).

150
facecolor str

Figure/axes facecolor.

'white'

Returns:

Type Description
(fig, ax, im)

im is the AxesImage returned by imshow.

plot_feature_cluster_panels(results, coords, df_pvs_modes, selected_feature, *, modes=None, colors=None, plot_both_sides=False, val_threshold=0.0, w_scale=1.2, h_scale=1.4, dpi=150, suptitle=None)

Plot spatial membership for separated clusters for a single focal gene across multiple modes.

Parameters:

Name Type Description Default
results

Object with Q_by_mode[mode] -> Q (n_cells, K).

required
coords array-like

(n_cells, 2) or (x, y) tuple for spatial / UMAP coordinates.

required
df_pvs_modes dict[str, DataFrame]

Mapping: mode_name -> DataFrame with index including selected_feature and a column 'sepCls' that stores (group0, group1) lists of 0-based cluster indices.

required
selected_feature str

Feature name / index key used in df_pvs_modes[mode].loc[selected_feature].

required
modes sequence of str

Subset / order of modes to plot. Defaults to all keys in df_pvs_modes that contain selected_feature.

None
colors

Either a sequence of colors indexable by cluster index, or a colormap. If None, defaults to tab20.

None
plot_both_sides bool

If False: plot only the “fewer” side clusters across modes in a big [modes × all_sepCls] grid. If True: for each mode, left = sepCls[0], right = sepCls[1], separated by a vertical dashed line.

False
val_threshold float

Membership threshold passed to plot_spatial_membership.

0.0
w_scale float

Width/height scaling factors for figure size.

1.2
h_scale float

Width/height scaling factors for figure size.

1.4
dpi int

Figure DPI.

150
suptitle str or None

Optional figure-level title.

None

Returns:

Name Type Description
fig Figure
axes dict[(mode_name, col_idx) -> Axes]

plot_feature_count(feature_counts, coords, *, feature_name='', log_transformed=True, vmax=6, vmin=None, size=5, cmap='RdYlBu_r', cbar_loc='bottom', cbar_label=None, ax=None)

Plot a single gene's expression over 2D coordinates.

Parameters:

Name Type Description Default
feature_counts array-like or sparse

Per-cell values for one feature. Shape (n_cells,) or (n_cells, 1). If sparse, will be densified.

required
coords ndarray

2D coordinates of shape (n_cells, 2).

required
feature_name str or None

Title annotation.

''
log_transformed bool

If False, apply log1p to feature_counts. If True, assume feature_counts already on log scale.

True
vmax float or None

Color max. If None or 0, inferred from data.

6
vmin float or None

Color min. If None, inferred by matplotlib.

None
size float

Point size.

5
cmap str

Colormap name.

'RdYlBu_r'
cbar_loc ('bottom', 'top', 'left', 'right')

Colorbar location.

"bottom"
cbar_label str or None

Overrides default colorbar label.

None
ax Axes or None

Existing axis to draw on.

None

Returns:

Type Description
Figure
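The log_transformed and vmax handling can be sketched with the standard library. It assumes that "inferred from data" means the maximum plotted value; prepare_values is a hypothetical helper.

```python
import math


def prepare_values(values, log_transformed=True, vmax=6):
    """Return (values on log scale, color max) for one feature."""
    if log_transformed:
        vals = list(values)  # already on log scale
    else:
        vals = [math.log1p(v) for v in values]  # log(1 + count)
    if not vmax:  # None or 0: fall back to the data maximum (assumption)
        vmax = max(vals)
    return vals, vmax


print(prepare_values([0.0, 1.0, 3.0], vmax=0))
print(prepare_values([0.0, 9.0], log_transformed=False))
```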

plot_group_diff(df_mode_group_diff, *, mode_sizes=None, annotation_group_sizes=None, ref_mode=None, show_top=True, show_left=True, annot=None, cmap='Reds', cbar_label='Fraction of different cells', top_ylabel='#cells in the group', left_xlabel='Mode size', x_label='Annotation groups', y_label='Modes', figsize=(10, 8), dpi=300, height_ratios=(1, 6), width_ratios=(1.5, 6), wspace=0.01, hspace=0.01, vmin=0.0, vmax=1.0, cbar_fraction=0.6, xtick_rotation=45, xtick_fontsize=8, ytick_fontsize=8, label_fontsize=7, border_width=0.5, border_color='lightgray', zero_label_eps=0.001, top_round_to=500, show_mode_size_labels=True, add_model_separators=True, model_sep_kwargs=None)

Plot a heatmap of mode-by-annotation-group differences. Optionally add marginal bar plots:

- Top: annotation group sizes
- Left: mode sizes

Parameters:

Name Type Description Default
df_mode_group_diff DataFrame

DataFrame with index=modes and columns=annotation groups.

required
mode_sizes Optional[Series]

Series of mode sizes indexed by FULL mode names. Required if show_left=True.

None
annotation_group_sizes Optional[Series]

Series of group sizes indexed by group names. Required if show_top=True.

None
ref_mode Optional[str]

If provided, highlights this row label in red/bold.

None
show_top bool

Toggle the top marginal bars (annotation group sizes).

True
show_left bool

Toggle the left marginal bars (mode sizes).

True

Returns:

Type Description
(fig, axes)

axes is a dict with keys: "heatmap", "top", "left"

plot_mapping_alignment(*, pair_mapping, ref_K, alt_K, ref_mode, alt_mode, colors, figsize=(5, 2), dpi=150, node_size=150, node_edgecolor='black', node_linewidth=0.5, line_color='k', line_alpha=0.5, line_lw=1.0, ax=None, title=None)

Plot a simple two-row pair-mapping alignment diagram.

Parameters:

Name Type Description Default
pair_mapping Sequence[Tuple[int, int]]

Sequence of (c_ref, c_alt) index pairs. Assumes ref row at y=1, alt row at y=0.

required
ref_K int

Number of clusters in ref/alt spaces for x-limit.

required
alt_K int

Number of clusters in ref/alt spaces for x-limit.

required
ref_mode str

Labels for y-axis and title.

required
alt_mode str

Labels for y-axis and title.

required
colors Union[Sequence, Mapping[int, str]]

Colors indexed by cluster id. Can be a list/tuple or dict.

required
ax Optional[Axes]

If provided, draws into existing axis.

None

Returns:

Type Description
(fig, ax)

plot_mapping_grid(*, ref_Q, alt_Q, pair_mappings, ref_mode, alt_mode, coords, colors, show=('alt',), dpi=150, s=0.5, figsize_scale=(2.0, 2.0), strict_pair_mapping=True, connect_lines=True, connect_color='k', connect_alpha=0.25, connect_lw=0.8)

Plot reference/alt Qs, mapped alt, and per-column abs differences.

Row order (when included):

0) reference (the smaller-K space used for mapping)
1) mapped alt (larger-K mapped into smaller-K space)
2) original alt (the larger-K Q)
3) diff (abs(reference - mapped_alt))

The show argument controls which of: {"alt", "mapped_alt", "diff"} are added in addition to the reference row.

Default: show=("alt",) -> reference + original alt
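As a rough illustration of how show selects rows, the following sketch resolves a show tuple into the ordered list of rows. `rows_for_show` and `ROW_ORDER` are hypothetical names, and rejecting unknown entries with a ValueError is an assumption rather than documented behaviour.

```python
from typing import List, Sequence

# Fixed row order described above; the reference row is always present.
ROW_ORDER = ("reference", "mapped_alt", "alt", "diff")


def rows_for_show(show: Sequence[str] = ("alt",)) -> List[str]:
    """Return the rows to draw, in fixed order, for a given `show` value."""
    optional = set(ROW_ORDER) - {"reference"}
    unknown = set(show) - optional
    if unknown:
        # Assumption: invalid names are rejected rather than ignored.
        raise ValueError(f"unknown show entries: {sorted(unknown)}")
    return [row for row in ROW_ORDER if row == "reference" or row in show]


default_rows = rows_for_show()
all_rows = rows_for_show(("diff", "alt", "mapped_alt"))
```

Note that the order of entries in show does not affect the row order in this sketch.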

plot_membership_reordered(P, cmap, lbs, ax, title='', annot='')

Plot membership with reordered cluster indices.

Parameters:

Name Type Description Default
P np.ndarray, shape (n_samples, n_clusters)

Membership matrix.

required
cmap list of colors

Colors for each cluster.

required
lbs array-like, shape (n_samples,)

Labels for each sample (used to group samples).

required
ax matplotlib.axes.Axes

Axis to plot on.

required
title str

Y-axis label.

''
annot str

Annotation text (shown at top-right).

''

plot_model_diff_heatmap(cross_model_overall_membership_diff, comp_res, models, *, figsize=(9, 8), dpi=150, cmap='Reds', decimals=2, vmin=0.0, vmax=1.0, linewidths=0.5, linecolor='white', ax=None, cbar=True, annot_size=8, tight_layout=True, show=False)

Plot a cross-model overall membership difference heatmap.

Parameters:

Name Type Description Default
cross_model_overall_membership_diff Union[Mapping[Tuple[str, str], float], Series]

Dict-like or pd.Series with keys as (mode_name_model0, mode_name_model1) and values in [0, 1].

required
comp_res Any

An object that contains: comp_res.full_mode_names_by_model[model_name] -> ordered list of full mode names. This ordering is used to reindex rows/cols (no lexical sorting issues like K10 vs K3).

required
models Sequence[str]

Sequence of two model names in the same order used in the diff keys.

required
figsize Tuple[float, float]

Seaborn/Matplotlib styling options.

(9, 8)
dpi int

Seaborn/Matplotlib styling options.

150
cmap str

Seaborn/Matplotlib styling options.

'Reds'
annot Tuple[float, float]

Seaborn/Matplotlib styling options.

(9, 8)
vmin float

Seaborn/Matplotlib styling options.

0.0
vmax float

Seaborn/Matplotlib styling options.

1.0
linewidths float

Seaborn/Matplotlib styling options.

0.5
linecolor str

Seaborn/Matplotlib styling options.

'white'
ax Optional[Axes]

If provided, plot into this axis. Otherwise create a new figure/axis.

None
cbar bool

Whether to show colorbar.

True
tight_layout bool

Whether to call plt.tight_layout().

True
show bool

If True, calls plt.show().

False

Returns:

Type Description
(fig, ax, mat)

The figure, axis, and the reindexed matrix used for the heatmap.
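The reindexing step matters because plain string sorting puts "K10" before "K3". The sketch below is only an illustration, not the library's implementation: it builds a matrix from pair-keyed differences using explicitly ordered mode lists. The mode names, `build_diff_matrix`, and the choice to fill missing pairs with NaN are all assumptions.

```python
from typing import List, Mapping, Sequence, Tuple


def build_diff_matrix(
    diffs: Mapping[Tuple[str, str], float],
    row_modes: Sequence[str],
    col_modes: Sequence[str],
) -> List[List[float]]:
    """Lay out pair-keyed values as rows x cols in the given mode order."""
    nan = float("nan")  # assumption: missing pairs become NaN
    return [[diffs.get((r, c), nan) for c in col_modes] for r in row_modes]


# Hypothetical full mode names, already in the intended (numeric K) order.
row_modes = ["A_K3M1", "A_K10M1"]
col_modes = ["B_K3M1", "B_K10M1"]
diffs = {
    ("A_K3M1", "B_K3M1"): 0.1,
    ("A_K3M1", "B_K10M1"): 0.4,
    ("A_K10M1", "B_K3M1"): 0.5,
    ("A_K10M1", "B_K10M1"): 0.2,
}

mat = build_diff_matrix(diffs, row_modes, col_modes)

# Lexical sorting would reorder the rows, which is what the explicit
# ordering from comp_res.full_mode_names_by_model avoids.
lexical = sorted(row_modes)
```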

plot_model_diff_summary(comp_res, mat_diffs, coords, models_plot_order=None, *, colors=None, figsize_scale=(2.5, 2.5), diff_cmap='RdPu', diff_vmin=0.0, diff_vmax=1.0, point_size=2.0, alpha=1.0, suptitle=None)

For each model, plot:

  • Top row: major mode clustering (largest Size) on 2D coords, colored by discrete cluster labels using colors, analogous to plot_compmodels_Q_grid.
  • Bottom row: per-cell weighted average difference vs reference, aggregated across modes and weighted by mode size.

Parameters:

Name Type Description Default
comp_res

Object containing compModels results. Must have:

  • modes_by_model : Dict[str, List[str]] (short mode names, e.g. "K20M1")
  • mode_stats_by_model : Dict[str, DataFrame] with columns ['Mode', 'Size']
  • get_Q(full_mode_name) -> np.ndarray (n_cells x K)

required
mat_diffs dict

Nested dict of diff matrices, typically from get_diff_matrices: mat_diffs[model_name][short_mode] = diff_Q where diff_Q has shape (n_cells, K_eff).

required
coords array-like or (x, y)

2D coordinates per cell. Either:

  • array of shape (n_cells, 2), or
  • tuple/list (x, y) of 1D arrays.

required
models_plot_order sequence of str

Order of models (columns). Defaults to list(mat_diffs.keys()).

None
colors sequence

Sequence of discrete colors used for clusters in the TOP row, same semantics as in plot_compmodels_Q_grid. If None, defaults to tab20 colors.

None
figsize_scale (float, float)

(width_per_model, height_per_row) used to derive overall figure size.

(2.5, 2.5)
diff_cmap str

Colormap for the weighted difference panel (bottom row).

"RdPu"
diff_vmin float

vmin/vmax for the difference colormap.

0.0
diff_vmax float

vmin/vmax for the difference colormap.

1.0
point_size float

Scatter point size.

2.0
alpha float

Scatter alpha.

1.0
suptitle str or None

Optional figure-level title.

None

Returns:

Type Description
(fig, axes)

Matplotlib Figure and Axes array of shape (2, n_models).
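The bottom-row quantity is a size-weighted average over modes. The sketch below shows that aggregation only. It assumes each mode's diff matrix has already been reduced to one value per cell (how the library performs that reduction is not stated here), and `weighted_cell_diff` plus the mode names are hypothetical.

```python
from typing import Dict, List


def weighted_cell_diff(
    per_cell_diff_by_mode: Dict[str, List[float]],
    mode_sizes: Dict[str, float],
) -> List[float]:
    """Average per-cell differences across modes, weighted by mode size."""
    modes = list(per_cell_diff_by_mode)
    total_size = sum(mode_sizes[m] for m in modes)
    n_cells = len(per_cell_diff_by_mode[modes[0]])
    out = [0.0] * n_cells
    for mode in modes:
        weight = mode_sizes[mode] / total_size
        for i, d in enumerate(per_cell_diff_by_mode[mode]):
            out[i] += weight * d
    return out


# Two modes of one model; the larger mode dominates the average.
per_cell = {"K3M1": [0.0, 0.4], "K3M2": [0.4, 0.0]}
sizes = {"K3M1": 3, "K3M2": 1}
avg = weighted_cell_diff(per_cell, sizes)
```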

plot_spatial_structure_grid(results, coords, grps, *, modes=None, cmap=None, mode_labels=None, grp_seps=None, reorder_cls=True, s=1.0, alpha=1.0, vmin=0.0, vmax=1.0, figsize=None, dpi=150)

Optimized spatial + structure membership grid.

Parameters:

Name Type Description Default
results ClumpplingResults
required
coords (n_cells, 2)
required
grps

Per-cell group labels for ordering the 1D trace.

required
modes

Optional subset of modes (defaults to results.modes).

None
cmap Optional[Any]

  • None -> auto tab20 colors
  • list/tuple of colors length >= K_max
  • matplotlib colormap callable

None
grp_seps Optional[Sequence[float]]

optional separators for group boundaries on the structure plot. If None, computed from sorted grps.

None
reorder_cls bool

if True, place clusters by aligned index.

True

plot_structure_one_level(results, *, modes=None, cmap=None, grp_labels=(), mode_labels=None, reorder_clsind=True, grp_seps_ymin=-0.2, lb_suffix_sep=None, figsize=None, dpi=150, x_rot=0, x_ha='center')

One-level group version of plot_structure_modes.

  • Pulls Q matrices from results.Q_by_mode.
  • Computes grp_info inside using get_uniq_lb_sep.
  • Works for any single-level labels (e.g., sample group / batch / cell type).
  • Optionally reorders samples by grp_labels via plot_membership_reordered.

Parameters:

Name Type Description Default
results

ClumpplingResults-like object with attributes: - Q_by_mode : dict[mode_name -> (n_cells, K) array] - modes : sequence of mode names (if modes is None)

required
modes Optional[Sequence[str]]

Which modes to plot. If None, uses results.modes.

None
cmap

Colormap list passed to plot_membership / plot_membership_reordered.

None
grp_labels Sequence[str]

Group labels per sample (length n_cells).

()
mode_labels Optional[Sequence[str]]

Labels for each mode row (defaults to modes if None or wrong length).

None
reorder_clsind bool

If True, use plot_membership_reordered(Q, cmap, grp_labels, ...); otherwise use plot_membership(Q, cmap, ...).

True
grp_seps_ymin float

How far separator lines extend below axis (in axis fraction).

-0.2
lb_suffix_sep Optional[str]

Optional separator; if provided, only the suffix (after lb_suffix_sep) is used in x tick labels.

None
figsize Optional[Tuple[float, float]]

Figure size (width, height). If None, chosen based on number of modes.

None
dpi int

Figure DPI.

150

Returns:

Name Type Description
fig Figure

plot_structure_two_level(results, *, modes=None, cmap=None, grp_labels=(), supgrp_labels=None, mode_labels=None, reorder_clsind=True, grp_seps_ymin=-0.2, supgrp_seps_ymin=-0.6, lb_suffix_sep=None, figsize=None, dpi=150)

Two-level group version of plot_structure_modes.

  • Pulls Q matrices from results.Q_by_mode.
  • Computes grp_info inside using get_uniq_lb_sep.
  • Works for any two-level labels (grp + optional supgrp).
  • Optionally reorders samples by (supgrp, grp).

Parameters:

Name Type Description Default
results ClumpplingResults

ClumpplingResults.

required
modes Optional[Sequence[str]]

Which modes to plot. If None, uses results.modes.

None
cmap

Colormap list passed to plot_membership.

None
grp_labels Sequence[str]

Lower-level labels per sample (length n_cells).

()
supgrp_labels Optional[Sequence[str]]

Higher-level labels per sample (length n_cells), optional.

None
grp_seps_ymin float

How far separator lines extend below axis.

-0.2
supgrp_seps_ymin float

How far separator lines extend below axis.

-0.6

strip_leading_zero(x, decimals=2)

Format a float to a string with given decimals, stripping leading zero.
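A minimal sketch of this kind of formatting is shown below. It is not the library's source: the name `strip_leading_zero_sketch` is hypothetical, and the handling of negative values (keeping the sign and dropping the zero) is an assumption.

```python
def strip_leading_zero_sketch(x: float, decimals: int = 2) -> str:
    """Format x with fixed decimals and drop a leading zero before the point."""
    s = f"{x:.{decimals}f}"
    if s.startswith("0."):
        return s[1:]
    if s.startswith("-0."):
        # Assumption: negatives keep their sign, e.g. "-0.50" becomes "-.50".
        return "-" + s[2:]
    return s


labels = [strip_leading_zero_sketch(v) for v in (0.25, 1.5, -0.5)]
```

Values with a non-zero integer part are returned unchanged apart from the fixed number of decimals.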

Gene Set Enrichment

plot_enrichment.py

Gene set enrichment visualizations.

Classes

Functions

plot_LFC_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap)

Fill a mode-grid figure with pairwise LFC z-score heatmaps.

Iterates over modes and calls plot_pairwise_heatmap with the LFC z-score matrix into the corresponding axes.

Parameters:

Name Type Description Default
res_by_mode dict

Output of run_gs_enrichment; must contain a "lfc_res" key for each mode.

required
ax_by_mode dict

Mapping mode -> matplotlib.axes.Axes.

required
results ClumpplingResults

Used to look up K and generate cluster labels.

required
cb_cmap list

Per-cluster color list (passed to _mode_cluster_labels).

required

plot_LFC_enrichment_heatmap(res_by_mode, results, value='z', sig_level=0.05, cmap='coolwarm', center_zero=True, figsize=None, dpi=150, title=None, ax=None)

Single heatmap of pairwise LFC enrichment across all modes.

Rows = modes, columns = cluster pairs (i < j) ordered lexicographically up to K_max. The same pair (e.g. C1 vs C2) occupies the same column for every mode, making it easy to compare enrichment of a given pair across modes. Cells are NaN (blank) for pairs that exceed a mode's K. Cells where q < sig_level are annotated with *.

Pairs are grouped visually by their first cluster index with light vertical separators; the secondary x-axis labels each group "C<i>v" (e.g. C1v).
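The shared column layout can be sketched with itertools.combinations, which yields pairs (i < j) in lexicographic order. This is an illustration under assumptions, not the library's code: `pair_columns` and `mode_row` are hypothetical helpers, and 1-based cluster numbering is assumed from the "C1 vs C2" wording.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Pair = Tuple[int, int]


def pair_columns(k_max: int) -> List[Pair]:
    """All cluster pairs (i < j), 1-based, in lexicographic order."""
    return list(combinations(range(1, k_max + 1), 2))


def mode_row(values_by_pair: Dict[Pair, float], k_mode: int, k_max: int) -> List[float]:
    """One heatmap row; pairs that exceed this mode's K stay NaN."""
    nan = float("nan")
    return [
        values_by_pair.get((i, j), nan) if j <= k_mode else nan
        for (i, j) in pair_columns(k_max)
    ]


columns = pair_columns(3)
# A K=2 mode only has the pair (1, 2); the other columns remain blank.
row_k2 = mode_row({(1, 2): 1.5}, k_mode=2, k_max=3)
n_blank = sum(math.isnan(v) for v in row_k2)
```

Because every mode uses the same column list, a given pair lines up vertically across modes.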

Parameters:

Name Type Description Default
res_by_mode dict

Output of run_gs_enrichment: {mode: {lfc_res, ...}}.

required
results

Clumppling results object with .modes and .mode_K attributes.

required
value ('z', 'obs')

Which LFC quantity to colour: z-score or observed LFC.

"z"
sig_level float

Significance threshold for * annotation (applied to q values).

0.05
cmap str
'coolwarm'
center_zero bool

Symmetric colour scale around 0.

True
figsize tuple

Defaults to (max(6, 0.55 * n_pairs), 0.45 * n_modes + 1.5).

None
title str
None

plot_P_enrichment_by_cluster(res_by_mode, results, cb_cmap, kind='pval', ncols=None, figsize_per_panel=(3.0, 2.8), dpi=150, sig_threshold=None)

One subplot per cluster; each panel shows that cluster's P enrichment across all modes that contain it.

Parameters:

Name Type Description Default
res_by_mode dict

Output of run_gs_enrichment; each value contains a "p_res" dict with "p_emp" and "z" arrays of length K for that mode.

required
results ClumpplingResults

Used to look up K per mode (via mode_K and modes).

required
cb_cmap list

Per-cluster colour list; cluster k gets cb_cmap[k].

required
kind ('pval', 'zscore')

"pval" plots -log10(p_emp); "zscore" plots the z-score. Default "pval".

"pval"
ncols int or None

Columns in the subplot grid. Defaults to K_max.

None
figsize_per_panel (float, float)

Width × height for each individual subplot.

(3.0, 2.8)
dpi int
150
sig_threshold float or None

Threshold for the reference line. For "pval", a horizontal line is drawn at -log10(sig_threshold); defaults to 0.05. For "zscore", lines are drawn at ±sig_threshold; defaults to 2. Pass None to use the kind-appropriate default.

None

Returns:

Name Type Description
fig Figure
axes np.ndarray of matplotlib.axes.Axes, shape (nrows, ncols)
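The kind-dependent default for sig_threshold can be sketched as follows. `reference_lines` is a hypothetical helper that only mirrors the rule described above (0.05 for "pval", 2 for "zscore"); it does not draw anything.

```python
import math
from typing import List, Optional


def reference_lines(kind: str = "pval", sig_threshold: Optional[float] = None) -> List[float]:
    """Return the y positions of the reference line(s) for a panel."""
    if kind == "pval":
        thr = 0.05 if sig_threshold is None else sig_threshold
        return [-math.log10(thr)]  # panels show -log10(p_emp)
    if kind == "zscore":
        thr = 2.0 if sig_threshold is None else sig_threshold
        return [-thr, thr]  # symmetric lines at +/- threshold
    raise ValueError(f"unknown kind: {kind!r}")


pval_lines = reference_lines("pval")
z_lines = reference_lines("zscore")
```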

plot_P_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap, kind='pval')

Fill a mode-grid figure with per-cluster P enrichment bars.

Iterates over modes and calls either plot_P_enrichment_pval or plot_P_enrichment_zscore into the corresponding axes.

Parameters:

Name Type Description Default
res_by_mode dict

Output of run_gs_enrichment; must contain a "p_res" key for each mode.

required
ax_by_mode dict

Mapping mode -> matplotlib.axes.Axes, e.g. from make_mode_grid_by_K.

required
results ClumpplingResults

Used to look up K and generate cluster labels via _mode_cluster_labels.

required
cb_cmap list

Per-cluster color list passed to _mode_cluster_labels.

required
kind ('pval', 'zscore')

Whether to plot empirical p-values or z-scores. Default "pval".

"pval"

plot_P_enrichment_heatmap(res_by_mode, results, value='z', sig_level=0.05, cmap='OrRd', center_zero=False, figsize=None, dpi=150, title=None, ax=None)

Single heatmap of per-cluster P enrichment across all modes.

Rows = modes, columns = clusters (C1 … CK_max). Each cell shows the P enrichment z-score (value="z") or empirical p-value (value="p") for that cluster in that mode. Cells exceeding a mode's K are shown as NaN. Cells where p_emp < sig_level are annotated with *.
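The annotation rule can be sketched independently of any plotting. In the snippet below, `annotation_grid` is a hypothetical helper; it marks cells with p_emp < sig_level and uses None for clusters beyond a mode's K (the heatmap itself shows those as NaN). The mode names are made up.

```python
from typing import Dict, List, Optional, Sequence


def annotation_grid(
    p_emp_by_mode: Dict[str, Sequence[float]],
    k_max: int,
    sig_level: float = 0.05,
) -> Dict[str, List[Optional[str]]]:
    """Per mode, return '*', '' or None (cluster absent) for C1..CK_max."""
    grid: Dict[str, List[Optional[str]]] = {}
    for mode, p_emp in p_emp_by_mode.items():
        row: List[Optional[str]] = []
        for k in range(k_max):
            if k >= len(p_emp):
                row.append(None)  # column exceeds this mode's K
            elif p_emp[k] < sig_level:
                row.append("*")
            else:
                row.append("")
        grid[mode] = row
    return grid


grid = annotation_grid({"K2M1": [0.01, 0.2], "K3M1": [0.04, 0.5, 0.05]}, k_max=3)
```

The comparison is strict, so a p-value exactly equal to sig_level is not starred in this sketch.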

Parameters:

Name Type Description Default
res_by_mode dict

Output of run_gs_enrichment: {mode: {p_res, ...}}.

required
results

Clumppling results object with .modes and .mode_K attributes.

required
value ('z', 'p')

Which quantity to colour: z-score or empirical p-value.

"z"
sig_level float

Significance threshold for * annotation (applied to p_emp).

0.05
cmap str

Defaults to "OrRd" for z-score (one-sided enrichment); use "coolwarm" if you expect negative z-scores.

'OrRd'
center_zero bool

Symmetric colour scale around 0. Default False (z-scores are typically positive for enrichment).

False
figsize tuple
None
title str
None

plot_P_enrichment_pval(p_res, cluster_labels, colors, title='', figsize=(4, 4), dpi=150, ax=None)

Bar chart of empirical p-values (-log10) per cluster.

plot_P_enrichment_zscore(p_res, cluster_labels, colors, title='', figsize=(4, 4), dpi=150, ax=None)

Bar chart of z-scores vs null per cluster.

plot_gene_P_bars(P_gs, gene_set, cluster_labels, colors, top_n=None, gene_label_colors=None)

Per-cluster waterfall bars showing each gene's loading within each cluster.

Genes are ranked by loading within each cluster and drawn as horizontal bars colored by cluster.

Parameters:

Name Type Description Default
P_gs (ndarray, shape(n_gs, K))

Gene-set rows of the aligned P matrix.

required
gene_set list of str

Gene names corresponding to rows of P_gs.

required
cluster_labels list of str

Labels for each cluster (columns of P_gs).

required
colors list of str

One color per cluster used to fill the bars.

required
top_n int or None

If set and n_gs > top_n, each cluster panel shows only the top-top_n genes by per-cluster P. Default None (show all).

None

Returns:

Name Type Description
fig Figure
axes np.ndarray of matplotlib.axes.Axes

Array of K axes, one per cluster.

plot_gene_P_stacked(P_gs, gene_set, cluster_labels, gs_title='', log_scale=True, sort_by_sum=False, top_n=None, gene_colors=None, figsize=(6, 4), dpi=150)

Stacked bar chart of per-gene P values across clusters.

Parameters:

Name Type Description Default
P_gs ndarray

Shape (n_gs, K). Gene-set rows of the P matrix.

required
gene_set list of str

Gene names corresponding to rows of P_gs.

required
cluster_labels list of str

Labels for each cluster (columns of P_gs).

required
gs_title str

Title prefix for the plot. Default "".

''
log_scale bool

If True, use a log y-axis. Default True.

True
sort_by_sum bool

If True, sort clusters in descending order of total P sum. Default False.

False
top_n int or None

If set and n_gs > top_n, restrict to the top-top_n genes by total P across clusters (before any cluster sorting). Default None (show all genes).

None
gene_colors list or None

One color per gene (after any top_n subsetting). If None (default), colors are drawn from the tab20 colormap.

None
figsize tuple of (float, float)

Figure size in inches. Default (6, 4).

(6, 4)
dpi int

Figure resolution. Default 150.

150

Returns:

Name Type Description
fig Figure
ax Axes
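The top_n restriction ranks genes by their total P across clusters. A small sketch of that selection, using plain lists instead of an ndarray, is given below. `top_genes_by_total` is a hypothetical helper, and returning the kept genes in descending order of total is an assumption.

```python
from typing import List, Optional, Sequence, Tuple


def top_genes_by_total(
    P_gs: Sequence[Sequence[float]],
    gene_set: Sequence[str],
    top_n: Optional[int] = None,
) -> Tuple[List[List[float]], List[str]]:
    """Keep the top_n rows of P_gs by row sum; no-op if top_n is None or large."""
    if top_n is None or len(gene_set) <= top_n:
        return [list(r) for r in P_gs], list(gene_set)
    totals = [sum(row) for row in P_gs]
    keep = sorted(range(len(gene_set)), key=lambda i: totals[i], reverse=True)[:top_n]
    return [list(P_gs[i]) for i in keep], [gene_set[i] for i in keep]


P_demo = [[0.1, 0.2], [0.5, 0.4], [0.3, 0.3]]
rows, genes = top_genes_by_total(P_demo, ["g1", "g2", "g3"], top_n=2)
```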

plot_gene_lfc(df_gene_lfc, cluster_labels, sepL, sepH, gs_sepLFC, colors, figsize=(5, 3), dpi=150, ax=None, kind='mean', top_n=None, show_labels=None)

Horizontal bar chart of per-gene LFC between high and low cluster groups.

Bars are colored on a diverging coolwarm scale centered on zero and saturated at ±gs_sepLFC.

Parameters:

Name Type Description Default
df_gene_lfc DataFrame

Output of compute_gene_lfc with columns gene and LFC.

required
cluster_labels list of str

Not currently used; retained for API compatibility.

required
sepL list of int

Cluster indices in the low group.

required
sepH list of int

Cluster indices in the high group.

required
gs_sepLFC float

Observed gene-set sepLFC; sets the colorbar saturation limits.

required
colors list of str

Not currently used; retained for API compatibility.

required
figsize tuple of (float, float)

Figure size in inches. Default (5, 3).

(5, 3)
dpi int

Figure resolution. Default 150.

150
ax Axes

Draw into an existing axes if provided.

None
kind ('extreme', 'mean')

Determines the x-axis label: "extreme" labels min/max of group P; "mean" labels mean of group P. Default "mean".

"mean"
top_n int or None

If set and the number of genes exceeds top_n, restrict to the top-top_n genes by absolute LFC. Default None (show all).

None
show_labels bool or None

Whether to draw gene-name tick labels on the y-axis. If None (default), labels are shown when top_n is set or when n_genes <= 30; hidden otherwise.

None

Returns:

Name Type Description
fig Figure
ax Axes
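Two of the rules above, the top_n restriction by absolute LFC and the show_labels default, can be sketched as small pure functions. Both helper names are hypothetical and the snippet works on a plain dict rather than the DataFrame the function expects.

```python
from typing import Dict, List, Optional


def top_by_abs_lfc(lfc_by_gene: Dict[str, float], top_n: Optional[int] = None) -> List[str]:
    """Genes restricted to the top_n largest |LFC| when there are more than top_n."""
    genes = list(lfc_by_gene)
    if top_n is None or len(genes) <= top_n:
        return genes
    return sorted(genes, key=lambda g: abs(lfc_by_gene[g]), reverse=True)[:top_n]


def resolve_show_labels(show_labels: Optional[bool], top_n: Optional[int], n_genes: int) -> bool:
    """Explicit value wins; otherwise show labels if top_n is set or n_genes <= 30."""
    if show_labels is not None:
        return show_labels
    return top_n is not None or n_genes <= 30


kept = top_by_abs_lfc({"a": 0.5, "b": -2.0, "c": 1.0}, top_n=2)
```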

plot_pairwise_heatmap(value_mat, sig_mat=None, labels=None, title=None, upper_only=True, cmap='coolwarm', center_zero=True, sig_level=0.05, figsize=(7, 6), dpi=150, ax=None)

Heatmap of a KxK matrix with optional significance overlay.

plot_pairwise_heatmap_bidir(upper_mat, lower_mat, upper_cmap='coolwarm', lower_cmap='PuOr', upper_sig=None, lower_sig=None, sig_level=0.05, labels=None, upper_label='', lower_label='', title=None, center_zero=True, figsize=(6, 5), dpi=150, ax=None)

Heatmap with two KxK matrices split across upper and lower triangles.

Parameters:

Name Type Description Default
upper_mat (K, K) array

Values for the upper triangle.

required
lower_mat (K, K) array

Values for the lower triangle.

required
upper_cmap colormap names
'coolwarm'
lower_cmap colormap names
'PuOr'
upper_sig (K, K) arrays

p-value (or any criterion) matrices; cells where value < sig_level are annotated with *.

None
lower_sig (K, K) arrays

p-value (or any criterion) matrices; cells where value < sig_level are annotated with *.

None
sig_level float
0.05
labels list of str
None
upper_label colorbar axis labels
''
lower_label colorbar axis labels
''
center_zero bool

If True, color scale is symmetric around 0.

True

plot_per_cluster_P(P_gs, gene_set, cluster_labels, colors, null_mean_P=None, gs_title='', dpi=150)

Super-figure with 1 + K subpanels.

Top row (1 panel spanning all columns): scatter of mean gene-set P per cluster overlaid on boxplots of the null distribution (from sample_null_P), sorted by observed mean P descending. Each cluster is coloured accordingly. Y-axis is log scale. If null_mean_P is None, only the scatter is drawn.

Bottom row (K panels): per-cluster waterfall plots (gene loadings, cumulative rectangles).

Parameters:

Name Type Description Default
P_gs (ndarray, shape(n_genes, K))

Gene-set rows of the aligned P matrix.

required
gene_set list of str

Gene names corresponding to rows of P_gs.

required
cluster_labels list of str

Labels for each cluster (columns of P_gs).

required
colors list of str

One colour per cluster.

required
null_mean_P (ndarray, shape(n_perm, K))

Null mean loading vectors from sample_null_P. If provided, a boxplot of the null distribution is drawn behind the scatter.

None
gs_title str

Optional title prefix for the top panel.

''
dpi int

Figure resolution. Default 150.

150

Returns:

Name Type Description
fig Figure
ax_top Axes
axes_bottom list of matplotlib.axes.Axes, length K
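The ordering used in the top panel (clusters sorted by observed mean gene-set P, descending) can be sketched without any plotting. `clusters_by_mean_P` is a hypothetical helper operating on nested lists; the values are made up.

```python
from typing import List, Sequence, Tuple


def clusters_by_mean_P(
    P_gs: Sequence[Sequence[float]],
    cluster_labels: Sequence[str],
) -> List[Tuple[str, float]]:
    """Return (label, mean P over genes) per cluster, highest mean first."""
    n_genes = len(P_gs)
    means = [
        sum(row[k] for row in P_gs) / n_genes
        for k in range(len(cluster_labels))
    ]
    order = sorted(range(len(means)), key=lambda k: means[k], reverse=True)
    return [(cluster_labels[k], means[k]) for k in order]


ranked = clusters_by_mean_P(
    [[0.25, 0.5, 0.125], [0.75, 0.5, 0.125]],
    ["C1", "C2", "C3"],
)
```

In this sketch, ties keep the original cluster order because Python's sort is stable.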

plot_sepLFC_distribution(df, gs_genes, title='', kind='auto', show_gs_textbox=True, gs_textbox_threshold=None, n_gs_textbox=10, show_non_gs_textbox=False, n_non_gs_textbox=10, textbox_fontsize=9, figsize=(6, 5), dpi=150, ax=None)

Distribution of per-gene sepLFC, with gene-set genes highlighted.

Parameters:

Name Type Description Default
df DataFrame

Gene-indexed DataFrame with columns sepLFC and rank_sepLFC (output of compute_all_feature_metrics filtered to a sepCls). Should be pre-sorted descending by sepLFC.

required
gs_genes set or list of str

Gene names belonging to the gene set of interest.

required
title str

Axes title.

''
kind ('auto', 'bar', 'hist')

Chart type. "auto" (default) uses bar when len(df) < 30, hist otherwise.

"auto"
show_gs_textbox bool

Show a text box listing gene-set genes with high sepLFC. Only rendered in hist mode. Default True.

True
gs_textbox_threshold float or None

Minimum sepLFC for inclusion in the GS text box. When None (default) the top n_gs_textbox gene-set genes by rank are shown.

None
n_gs_textbox int

Maximum number of gene-set genes in the GS text box (used when gs_textbox_threshold is None). Default 10.

10
show_non_gs_textbox bool

Show a text box listing the top non-GS genes by sepLFC. Only rendered in hist mode. Default False.

False
n_non_gs_textbox int

Number of top non-GS genes to list. Default 10.

10
textbox_fontsize int

Font size for gene lines inside text boxes. The box title is rendered one point larger and bold. Default 9.

9
figsize tuple of (float, float)

Figure size in inches. Default (6, 5).

(6, 5)
dpi int

Figure resolution. Default 150.

150
ax Axes

Draw into an existing axes if provided.

None

Returns:

Name Type Description
fig Figure
ax Axes
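The "auto" chart choice and the gene-set text box selection follow simple rules that can be sketched as below. Both helpers are hypothetical. The second assumes the input is already sorted by sepLFC descending, as the df description requires, and treats the threshold as inclusive (an assumption).

```python
from typing import Iterable, List, Optional, Sequence, Tuple


def resolve_kind(kind: str, n_genes: int) -> str:
    """'auto' picks 'bar' for fewer than 30 genes, 'hist' otherwise."""
    if kind == "auto":
        return "bar" if n_genes < 30 else "hist"
    return kind


def gs_textbox_genes(
    sorted_sep_lfc: Sequence[Tuple[str, float]],
    gs_genes: Iterable[str],
    gs_textbox_threshold: Optional[float] = None,
    n_gs_textbox: int = 10,
) -> List[str]:
    """Gene-set genes to list: by threshold if given, else the top n by rank."""
    gs = set(gs_genes)
    in_gs = [(g, v) for g, v in sorted_sep_lfc if g in gs]
    if gs_textbox_threshold is None:
        return [g for g, _ in in_gs[:n_gs_textbox]]
    return [g for g, v in in_gs if v >= gs_textbox_threshold]


ranked = [("g1", 3.0), ("g2", 2.0), ("g3", 1.0), ("g4", 0.5)]
top_two = gs_textbox_genes(ranked, {"g1", "g3", "g4"}, n_gs_textbox=2)
above = gs_textbox_genes(ranked, {"g1", "g3", "g4"}, gs_textbox_threshold=1.0)
```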

plot_sepLFC_distribution_heatmap(res_by_mode, n_bins=60, cmap='Blues', figsize=(10, 0.45), dpi=150, kind='null_sep', pval_threshold=None, annotate_pval=False)

Heatmap summary of the null sepLFC distribution across all modes.

Each row is one mode; colour encodes the density of the null distribution in each histogram bin. The observed gene-set sepLFC is overlaid as a red dot on each row, making enrichment strength and consistency across modes visible at a glance.

Parameters:

Name Type Description Default
res_by_mode dict

Output of run_gs_enrichment; each value must contain a "sep_res" dict.

required
n_bins int

Number of histogram bins shared across all modes. Default 60.

60
cmap str or Colormap

Colormap for the density heatmap. Default "Blues".

'Blues'
figsize (float, float)

(width, height_per_row); total figure height is height_per_row × n_modes.

(10, 0.45)
dpi int
150
kind ('null_sep', 'null_fixed')

Which null distribution to display. "null_sep" (default) uses the best-sepLFC null (null_sepLFC). "null_fixed" uses the fixed-cluster-group null (null_lfc_at_sep).

"null_sep"
pval_threshold float or None

If set to a value > 0, the empirical one-sided p-value (fraction of null ≥ observed) is computed for each mode. When the p-value is below pval_threshold the observed point is shown as a star (*) instead of a dot. Set to <= 0 (or leave as None) to always show a dot regardless of significance.

None
annotate_pval bool

If True, the empirical p-value is printed in scientific notation to the right of each observed dot. The x-axis right limit is expanded automatically so the text stays within the frame. P-values are computed whenever annotate_pval is True; if pval_threshold is None or <= 0, only the star logic is skipped. Default False.

False

Returns:

Name Type Description
fig Figure
ax Axes
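The empirical one-sided p-value described for pval_threshold is the fraction of null values at or above the observed one. A minimal sketch follows. The helper names are hypothetical, and the marker strings "*" and "o" simply stand in for "star" and "dot".

```python
from typing import Optional, Sequence


def empirical_pvalue(null_values: Sequence[float], observed: float) -> float:
    """Fraction of null values >= observed (no pseudo-count added)."""
    return sum(1 for v in null_values if v >= observed) / len(null_values)


def marker_for(p_value: float, pval_threshold: Optional[float]) -> str:
    """Star when a positive threshold is set and p is below it, else dot."""
    if pval_threshold is not None and pval_threshold > 0 and p_value < pval_threshold:
        return "*"
    return "o"


null = [0.1, 0.2, 0.3, 0.4]
p = empirical_pvalue(null, observed=0.35)
```

With a finite number of permutations this estimate can be exactly zero; some analyses add a pseudo-count to avoid that, but the description above does not mention one.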

plot_sepLFC_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap, kind='null_sep')

Fill a mode-grid figure with sepLFC null-distribution plots.

Iterates over modes and calls either plot_sepLFC_null_sep or plot_sepLFC_null_fixed into the corresponding axes.

Parameters:

Name Type Description Default
res_by_mode dict

Output of run_gs_enrichment; must contain a "sep_res" key for each mode.

required
ax_by_mode dict

Mapping mode -> matplotlib.axes.Axes.

required
results ClumpplingResults

Used to look up K and generate cluster labels.

required
cb_cmap list

Per-cluster color list (passed to _mode_cluster_labels).

required
kind ('null_sep', 'null_fixed')

Which null comparison to visualize. "null_sep" compares to each random set's own best sepLFC; "null_fixed" compares to the null evaluated at the observed bipartition. Default "null_sep".

"null_sep"

plot_sepLFC_null_fixed(sep_res, cluster_labels, title='', figsize=(5, 4), dpi=150, ax=None)

Histogram of null LFC at the gene-set's fixed cluster groups, with gene-set value marked.

plot_sepLFC_null_sep(sep_res, title='', figsize=(5, 4), dpi=150, ax=None)

Histogram of null best-sepLFC per random set, with gene-set value marked.

plot_seplfc_bipartite(gene_set, sepH, sepL, cluster_labels, df_mode, top_n_per_pair=5, gs_title='', lw_scale=10.0, min_lw=0.5, seg_gap=0.008, label_mode='auto', label_fontsize=7.0, arrow_fan=0.06, cmap='Spectral', vmin=None, vmax=None, colors=None, figsize=(8, 4), dpi=150, ax=None)

Bipartite diagram where each gene's single segment sits on the one edge that corresponds to its own best cluster separation (sepLFC in df_mode).

Unlike the older bipartite approach that assigns every gene to every H-L edge based on P_gs, this function:

  1. For each gene, finds the boundary pair (A, B) where A is the lowest-P cluster in the gene's upper group and B is the highest-P cluster in the gene's lower group (adjacent clusters across the max gap).
  2. Keeps only genes whose boundary pair has A ∈ sepH and B ∈ sepL.
  3. On each edge (A, B), selects the top top_n_per_pair genes by their df_mode.sepLFC value.
  4. Draws each selected gene as a single segment on its assigned edge.

All arrows point from sepH toward sepL (LFC is always positive by construction).
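Steps 1 to 3 can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function takes its sepLFC values from df_mode, and this page does not show how the boundary pair is derived, so the sketch computes it from a hypothetical per-gene P row using the largest gap on a log2 scale. `boundary_pair` and `select_edge_genes` are hypothetical names.

```python
import math
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def boundary_pair(p_row: Sequence[float]) -> Tuple[int, int, float]:
    """Step 1: clusters (A, B) adjacent across the largest log2 gap.

    Assumes all P values are positive. A is the lowest-P cluster of the
    upper group, B the highest-P cluster of the lower group.
    """
    order = sorted(range(len(p_row)), key=lambda k: p_row[k], reverse=True)
    gaps = [
        math.log2(p_row[order[i]]) - math.log2(p_row[order[i + 1]])
        for i in range(len(order) - 1)
    ]
    i_max = max(range(len(gaps)), key=lambda i: gaps[i])
    return order[i_max], order[i_max + 1], gaps[i_max]


def select_edge_genes(
    pair_by_gene: Dict[str, Edge],
    sep_lfc: Dict[str, float],
    sepH: Sequence[int],
    sepL: Sequence[int],
    top_n_per_pair: int = 5,
) -> Dict[Edge, List[str]]:
    """Steps 2-3: keep genes with A in sepH and B in sepL, top-n per edge."""
    edges: Dict[Edge, List[str]] = {}
    for gene, (a, b) in pair_by_gene.items():
        if a in sepH and b in sepL:
            edges.setdefault((a, b), []).append(gene)
    return {
        edge: sorted(genes, key=lambda g: sep_lfc[g], reverse=True)[:top_n_per_pair]
        for edge, genes in edges.items()
    }


a, b, gap = boundary_pair([0.4, 0.2, 0.01, 0.005])
selected = select_edge_genes(
    {"g1": (1, 2), "g2": (1, 2), "g3": (0, 3), "g4": (2, 1)},
    {"g1": 1.0, "g2": 3.0, "g3": 2.0, "g4": 5.0},
    sepH=[0, 1],
    sepL=[2, 3],
    top_n_per_pair=1,
)
```

In the example, g4 is dropped because its boundary pair runs from a sepL cluster to a sepH cluster, and g1 is dropped because only one gene is kept per edge.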

Parameters:

Name Type Description Default
gene_set list of str

Gene names (must be a subset of df_mode.index).

required
sepH list of int

Cluster indices in the high group (top row nodes).

required
sepL list of int

Cluster indices in the low group (bottom row nodes).

required
cluster_labels list of str

Labels for all K clusters.

required
df_mode DataFrame

Feature-metrics DataFrame (from compute_feature_metrics / compute_all_feature_metrics) indexed by gene name, with columns sepLFC and sepCls.

required
top_n_per_pair int

Maximum number of genes to show per (sepH, sepL) edge. Default 5.

5
gs_title str

Title prefix.

''
lw_scale float

Maximum line width (for the edge with the largest total sepLFC).

10.0
min_lw float

Minimum line width.

0.5
seg_gap float

Gap at each end of every gene segment (in t-space). Default 0.008.

0.008
label_mode str

Controls gene-name labels on segments. One of:

  • "all" – label every segment regardless of size.
  • "auto" – label only segments whose fraction of the edge total exceeds 1 / top_n_per_pair (i.e. roughly the equal-share threshold). Default.
  • "none" – suppress all labels.
'auto'
label_fontsize float

Font size for gene-name labels. Default 7.0.

7.0
arrow_fan float

Half-width (in data units) of the fan applied at each node so arrowheads from different edges spread out. Default 0.06.

0.06
cmap str

Matplotlib colormap name used to color segments by sepLFC. Default "Spectral".

'Spectral'
vmin float or None

Colormap range. None → inferred from the selected genes' sepLFC values.

None
vmax float or None

Colormap range. None → inferred from the selected genes' sepLFC values.

None
colors list or None

Per-cluster colors for node fill (indexed by cluster index).

None
figsize tuple of (float, float)

Figure size in inches. Default (8, 4).

(8, 4)
dpi int

Figure resolution. Default 150.

150
ax Axes

Draw into an existing axes if provided.

None

Returns:

Name Type Description
fig Figure
ax Axes
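The label_mode rule can be sketched for a single edge as follows. `labelled_genes` is a hypothetical helper, and measuring each gene's share of the edge total in terms of its sepLFC value is an assumption.

```python
from typing import Dict, List


def labelled_genes(
    edge_values: Dict[str, float],
    top_n_per_pair: int = 5,
    label_mode: str = "auto",
) -> List[str]:
    """Genes on one edge whose segments get a text label."""
    if label_mode == "none":
        return []
    if label_mode == "all":
        return list(edge_values)
    if label_mode != "auto":
        raise ValueError(f"unknown label_mode: {label_mode!r}")
    total = sum(edge_values.values())
    cutoff = 1.0 / top_n_per_pair  # roughly the equal-share threshold
    return [g for g, v in edge_values.items() if v / total > cutoff]


edge = {"g1": 3.0, "g2": 1.0, "g3": 1.0}
auto_labels = labelled_genes(edge, top_n_per_pair=5)
```

Here g1 holds 60% of the edge total and is labelled, while g2 and g3 sit exactly at the 20% cutoff and are not, since the comparison is strict.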

plot_top_pairwise_df(df, value_col='z', sig_col='q', alpha=0.05, top_n=-1, labels=None, sort_by='q', figsize=(8, 6), dpi=150, ax=None)

Dot plot of top cluster pairs sorted by significance or effect size.