Plotting

Clustering Results

plot.py

Functions for visualizations.

Classes

Functions

`get_kde_outliers(df, x_col, y_col, *, min_x=0.0, levels=8, cut=0, top_n=None, scale='zscore', return_mask=False)`

KDE-based outlier detection, with optional ranking of top_n most extreme points.

Outlier definition:

- Fit a 2D KDE on (x_col, y_col) for eligible points.
- Find points outside the outermost contour.

Ranking (when top_n is not None):

- Compute distance in optionally scaled (x, y) space.
- scale="zscore": standardize by mean and std
- scale="robust": standardize by median and IQR
- scale="none": use raw (x, y)

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Must contain columns `x_col` and `y_col`.	required
`x_col`	`str`	Column names in df to use as x and y axes.	required
`y_col`	`str`	Column names in df to use as x and y axes.	required
`min_x`	`float or None`	Minimum x value for eligibility; points with x <= min_x are ignored. If None, all finite points are eligible.	`0.0`
`levels`	`int`	Number of KDE contour levels.	`8`
`cut`	`float`	KDE cut parameter (see seaborn.kdeplot).	`0`
`top_n`	`int or None`	If not None, return only the top_n most extreme outliers.	`None`
`scale`	`('none', 'zscore', 'robust')`	Scaling method for distance computation when ranking outliers.	`"zscore"`
`return_mask`	`bool`	If True, also return a boolean mask aligned to df.index indicating outlier status.	`False`

Returns:

Type	Description
`outliers_df`
`mask(optional)`
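
The ranking step described for `get_kde_outliers` can be sketched with the standard library alone. This is an illustration, not the code in plot.py: the helper names `scale_values` and `rank_by_distance` are hypothetical, the population standard deviation is an assumed choice for "zscore", and measuring Euclidean distance from the origin of the scaled space is an assumption about what "distance" means here.

```python
import math
import statistics


def scale_values(values, scale="zscore"):
    """Scale a list of floats (hypothetical helper, not part of plot.py)."""
    if scale == "none":
        return list(values)
    if scale == "zscore":
        center = statistics.fmean(values)
        spread = statistics.pstdev(values)  # assumption: population std
    elif scale == "robust":
        center = statistics.median(values)
        q1, _, q3 = statistics.quantiles(values, n=4)
        spread = q3 - q1  # interquartile range
    else:
        raise ValueError(f"unknown scale: {scale!r}")
    if spread == 0:
        spread = 1.0  # avoid division by zero for constant columns
    return [(v - center) / spread for v in values]


def rank_by_distance(xs, ys, top_n, scale="zscore"):
    """Indices of the top_n points farthest from the origin of the scaled space."""
    sx, sy = scale_values(xs, scale), scale_values(ys, scale)
    # Assumption: "distance" is the Euclidean norm in the scaled (x, y) space.
    dist = [math.hypot(a, b) for a, b in zip(sx, sy)]
    order = sorted(range(len(dist)), key=lambda i: dist[i], reverse=True)
    return order[:top_n]
```

With scale="none" the raw values are used unchanged, so columns on very different scales will be dominated by the larger one.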

`in_outer_contour(x, y, paths)`

Return True if (x, y) lies inside ANY of the given matplotlib.path.Path objects.

`make_mode_grid(modes, *, n_cols=4, panel_size=(4.0, 2.5), dpi=150)`

Create a figure + gridspec layout for a list of modes, returning a dict {mode_name: ax}.

- Rows/cols computed from len(modes) and n_cols.
- panel_size gives (width, height) in inches per cell.

Example usage:

fig, ax_by_mode = make_mode_grid(modes, n_cols=4)
for mode in modes:
    plot_mode_P_profile(results, mode, ax=ax_by_mode[mode])
fig.tight_layout()
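
The row/column arithmetic behind `make_mode_grid` can be sketched as follows. The helpers `grid_shape` and `cell_for` are hypothetical names, and both the row-major fill order and the figure size being `panel_size` multiplied by the grid dimensions are assumptions; the source only states that rows and columns are computed from len(modes) and n_cols.

```python
import math


def grid_shape(n_modes, n_cols=4, panel_size=(4.0, 2.5)):
    """Return (n_rows, n_cols, figsize) for n_modes panels (hypothetical helper)."""
    n_rows = max(1, math.ceil(n_modes / n_cols))
    width, height = panel_size
    # Assumption: the figure size scales with the number of cells.
    figsize = (n_cols * width, n_rows * height)
    return n_rows, n_cols, figsize


def cell_for(index, n_cols=4):
    """Row-major (row, col) position of the index-th mode (assumed fill order)."""
    return divmod(index, n_cols)
```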

`make_mode_grid_by_K(results, *, modes=None, panel_size=(3.0, 2.5), dpi=150)`

Create a figure whose axes layout matches plot_Q_grid:

- Rows correspond to distinct K values (sorted).
- Within each row, columns correspond to modes with that K,
  in the order of `modes` (or results.modes if None).
- Returns a mapping {mode_name: ax} for the cells actually used.

Parameters:

Name	Type	Description	Default
`results`	`ClumpplingResults`	Must have Q_by_mode populated.	required
`modes`	`sequence of str`	If provided, only these modes are laid out (in this order). Otherwise use results.modes.	`None`
`panel_size`	`(float, float)`	(width, height) in inches per panel.	`(3.0, 2.5)`
`dpi`	`int`		`150`

Returns:

Name	Type	Description
`fig`	`Figure`
`axes_by_mode`	`dict`	Mapping mode_name -> Axes in the grid.
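
The layout rule (one row per distinct K, columns filled in mode order) can be sketched without matplotlib. `layout_by_K` is a hypothetical helper, and the `mode_K` mapping (mode name to K) is assumed to be available from the results object.

```python
def layout_by_K(modes, mode_K):
    """Map each mode to (row, col): rows follow sorted distinct K values,
    columns follow the order of `modes` within each K."""
    k_values = sorted({mode_K[m] for m in modes})
    row_of = {k: r for r, k in enumerate(k_values)}
    next_col = {k: 0 for k in k_values}
    layout = {}
    for m in modes:
        k = mode_K[m]
        layout[m] = (row_of[k], next_col[k])
        next_col[k] += 1  # the next mode with the same K goes one column right
    return layout
```

Rows with fewer modes than the widest row simply leave their trailing cells unused, which matches the note that the returned mapping covers only the cells actually used.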

`plot_P_profile(P_sorted, LFC_sorted, ax=None, title='', lw=0.2)`

Plot sorted log2(P) along cluster index, coloring each gene's curve by the argmax of its LFC profile.

Parameters:

Name	Type	Description	Default
`P_sorted`	`ndarray`	(M, K) array of sorted P values per gene.	required
`LFC_sorted`	`ndarray`	(M, K) array of log fold change values per gene.	required
`ax`	`Axes`	If given, draw into this Axes.	`None`
`title`	`str`	Title for the plot.	`''`
`lw`	`float`	Line width for each gene's curve.	`0.2`

`plot_Q_grid(results, *, sort_by='max', cmap=None, figsize=None, n_ticks=8)`

Plot Q heatmaps for all modes in a grid, using results.mode_names_list as layout (rows by K, columns by mode within each K), with a single shared colorbar on the right.

`plot_Q_heatmap(results, mode_name, *, sort_by='max', cmap=None, colorbar=True, ax=None)`

Plot a heatmap of Q for a single mode.

Parameters:

Name	Type	Description	Default
`results`	`ClumpplingResults`	Container with aligned Q matrices.	required
`mode_name`	`str`	Mode to plot (must be a key in results.Q_by_mode).	required
`sort_by`	`('max', 'none')`	If "max", sort individuals by their max cluster membership. If "none", keep original row order.	`"max"`
`cmap`	`str or Colormap`	Colormap to use in imshow (e.g. "viridis", "plasma").	`None`
`colorbar`	`bool`	Whether to add a colorbar for this subplot.	`True`
`ax`	`matplotlib Axes`	If provided, draw into this axes; otherwise create a new Figure.	`None`

Returns:

Type	Description
`(fig, ax)`

`plot_cluster_bars(results, mode_name, colors=None, *, ax=None)`

Plot bar chart of total membership per cluster for a given mode.

Parameters:

Name	Type	Description	Default
`results`	`ClumpplingResults`		required
`mode_name`	`str`	Mode to plot.	required
`ax`	`Axes`	If given, draw into this Axes.	`None`

Returns:

Type	Description
`(fig, ax)`

`plot_cluster_in_grid(results, coords, mode_name, cluster_index, *, cmap=None, xlabel='Dim 1', ylabel='Dim 2', base_size=5.0, size_scale=20.0, figsize=None, colorbar=True, **scatter_kwargs)`

Plot membership for a single (mode, cluster) in the full grid layout where rows = modes and columns = clusters (0..K_max-1), using results.mode_sep_coord_dict to place that cluster in the correct cell.

All other cells are left empty / invisible.

Parameters:

Name	Type	Description	Default
`results`	`ClumpplingResults`		required
`coords`	`(array, shape(n_samples, 2))`	2D coordinates (UMAP, t-SNE, etc.).	required
`mode_name`	`str`	Mode name, must be present in results.mode_sep_coord_dict keys.	required
`cluster_index`	`int`	Cluster index (column in Q) for that mode.	required
`cmap`	`str or Colormap`	Colormap for membership intensity.	`None`
`xlabel`	`str`	Axis labels for the occupied cell.	`'Dim 1'`
`ylabel`	`str`	Axis labels for the occupied cell.	`'Dim 2'`
`base_size`	`float`	Base point size.	`5.0`
`size_scale`	`float`	Additional scale times membership value.	`20.0`
`figsize`	`tuple`	Figure size for the full grid.	`None`
`colorbar`	`bool`	Whether to draw a colorbar for the occupied cell.	`True`
`**scatter_kwargs`		Extra kwargs passed to `ax.scatter` for that cell.	`{}`

Returns:

Type	Description
`fig, axes : Figure and 2D axes array for the full grid.`

`plot_cluster_overlay(results, coords, *, cluster_colors=None, val_threshold=0.5, s=0.05, alpha=0.6, vmin=0.0, vmax=1.0, figsize=None, dpi=150, suptitle=None, suptitle_kwargs=None)`

Overlay membership for all clusters within each mode, on a mode-grid:

rows = K values (in results.K_range order)
cols = modes within each K (using results.mode_coord_dict)

Each axis shows all clusters for that mode, with different base colors.

`plot_cluster_panels(results, coords, *, cluster_colors=None, val_threshold=0.0, s=1.0, alpha=1.0, vmin=0.0, vmax=1.0, figsize=None, dpi=150, suptitle=None, suptitle_kwargs=None)`

Plot membership on 2D coords for each (mode, cluster) in a grid:

rows  = modes (in results.modes order)
cols  = cluster index 0..K_max-1

using results.mode_sep_coord_dict to place each (mode, cluster).

Each cell contains ONE cluster's membership (white→cluster_color).

`plot_cluster_scatter(coords, cluster_labels, *, cmap=None, colorbar=True, xlabel='Dim 1', ylabel='Dim 2', title=None, ax=None, max_colorbar_ticks=8, **scatter_kwargs)`

Scatter plot of 2D coordinates colored by (discrete) cluster labels.

`plot_feature_across_modes(df_pvs_modes, modes, selected_feature, custom_color_dict, *, x_col='weighted_Psum', y_col='sepLFC', sep_col='sepCls', xlim=None, ylim=None, figsize=(3.5, 4), dpi=150, legend_loc='upper right', legend_bbox_to_anchor=(0.0, 0.9), style_label=None, ax=None)`

For a focal gene, collect (weighted_Psum, sepLFC, sepCls) across modes and make the scatter-with-labels plot in one shot.

`plot_feature_bar(df, *, mode_name=None, metric='weighted_Psum', top_n=20, ax=None)`

Bar plot of top-N features by a given metric (e.g. weighted_Psum).

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Index = feature names, must contain column `metric`.	required
`mode_name`	`str`	For titling.	`None`
`metric`	`str`		`"weighted_Psum"`
`top_n`	`int`	Number of top features to show.	`20`
`ax`	`Axes`		`None`

Returns:

Type	Description
`(fig, ax)`

`plot_feature_kde(df, x_col, y_col, outlier_mask, *, mode_name=None, label_col=None, levels=8, cmap='viridis_r', bg_point_size=10.0, bg_alpha=0.1, outlier_point_size=30.0, outlier_alpha=0.85, x_pad_frac=0.02, y_pad_frac=0.05, min_x_pad=0.005, min_y_pad=1.0, adjust_text_kwargs=None, ax=None, dpi=150)`

Plot a scatter + filled KDE contour + labeled outlier points for a (x, y) feature pair.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Must contain columns `x_col` and `y_col`.	required
`x_col`	`str`	Column names in df to use as x and y axes.	required
`y_col`	`str`	Column names in df to use as x and y axes.	required
`outlier_mask`	`ndarray(bool)`	Boolean mask aligned to df.index indicating which points to label as outliers.	required
`mode_name`	`str`	For titling; purely cosmetic.	`None`
`label_col`	`(str, optional)`	Column name in df to use for outlier labels; if None, use df.index.	`None`
`levels`	`int`	Number of KDE contour levels.	`8`
`cmap`	`str`	Colormap for filled KDE contours.	`"viridis_r"`
`bg_point_size`	`float`	Size of background scatter points.	`10.0`
`bg_alpha`	`float`	Alpha for background scatter points.	`0.1`
`outlier_point_size`	`float`	Size of outlier scatter points.	`30.0`
`outlier_alpha`	`float`	Alpha for outlier scatter points.	`0.85`
`x_pad_frac`	`float`	Fractional padding added to the x-axis limits.	`0.02`
`y_pad_frac`	`float`	Fractional padding added to the y-axis limits.	`0.05`
`min_x_pad`	`float`	Minimum padding added to the x-axis limits.	`0.005`
`min_y_pad`	`float`	Minimum padding added to the y-axis limits.	`1.0`
`adjust_text_kwargs`	`dict`	Additional keyword arguments to pass to adjust_text.	`None`
`ax`	`Axes`	Matplotlib Axes to plot on; if None, a new figure and axes are created.	`None`
`dpi`	`int`	Resolution of the figure in dots per inch.	`150`

Returns:

Type	Description
`(fig, ax)`
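
One plausible way the fractional and minimum padding parameters of `plot_feature_kde` interact is shown below. `padded_limits` is a hypothetical helper, and combining the two paddings with `max` is an assumption; the source does not say how they are combined.

```python
def padded_limits(values, pad_frac, min_pad):
    """Return (lo, hi) axis limits padded by max(pad_frac * range, min_pad)."""
    lo, hi = min(values), max(values)
    # Assumption: the fractional padding applies unless it falls below min_pad.
    pad = max(pad_frac * (hi - lo), min_pad)
    return lo - pad, hi + pad
```

Under this reading, the minimum padding keeps points from sitting on the axis border when the data range is very narrow.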

`plot_feature_metrics(df_mode, mode_name, x_col='weighted_Psum', y_col='sepLFC', sep_col='sepCls', annot_mask=None, xmax=None, ymax=None, custom_color_dict=None)`

Scatter plot of feature metrics for a given mode, colored by separating class pattern.

Parameters:

Name	Type	Description	Default
`df_mode`	`DataFrame`	DataFrame containing feature metrics for the mode. Must include 'sepCls', x_col, and y_col.	required
`mode_name`	`str`	Name of the mode (for title).	required
`x_col`	`str`	Column name for x-axis metric (default is 'weighted_Psum').	`'weighted_Psum'`
`y_col`	`str`	Column name for y-axis metric (default is 'sepLFC').	`'sepLFC'`
`annot_mask`	`Series or None`	Boolean mask for annotating points (default is None).	`None`
`xmax`	`float or None`	Maximum x-axis limit (default is None, which auto-scales).	`None`
`ymax`	`float or None`	Maximum y-axis limit (default is None, which auto-scales).	`None`
`custom_color_dict`	`dict or None`	Custom color dictionary for 'sepType' categories (default is None).	`None`

Returns:

Type	Description
`None`

`plot_feature_scatter(df, *, mode_name=None, x='weighted_Psum', y='sepLFC', highlight=None, ax=None)`

Scatter plot of feature metrics, e.g. weighted_Psum vs sepLFC.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Must contain columns `x` and `y`, index = feature names.	required
`mode_name`	`str`	For titling; purely cosmetic.	`None`
`x`	`str`	Column name in df to use as the x-axis.	`'weighted_Psum'`
`y`	`str`	Column name in df to use as the y-axis.	`'sepLFC'`
`highlight`	`iterable of str`	Feature names (index values) to annotate.	`None`
`ax`	`Axes`		`None`

Returns:

Type	Description
`(fig, ax)`

`plot_feature_sepLFC_across_modes(res_model, df_pvs_modes, selected_feature, feature_names, colors, *, label_rank=True, dpi=150, ax=None)`

Horizontal bar plot of sepLFC for a focal gene across all modes.

Parameters:

Name	Type	Description	Default
`res_model`		ClumpplingResults-like object, with attributes: - modes: list of mode names - mode_K: dict[mode_name -> K] - P_aligned_by_mode: dict[mode_name -> P matrix] (not used, but available)	required
`df_pvs_modes`	`Mapping[str, 'pd.DataFrame']`	Dict mapping mode_name -> DataFrame with columns ['sepLFC', 'sepCls']. Row order must align with `feature_names`.	required
`selected_feature`	`str`	Feature name to plot.	required
`feature_names`	`Sequence[str]`	Sequence of all feature names; selected_feature must be in this list.	required
`colors`	`Sequence`	Sequence of colors indexed by cluster index (0-based).	required
`label_rank`	`bool`	If True, annotate each bar with the rank of the focal gene by sepLFC.	`True`
`dpi`	`int`	Figure DPI.	`150`
`ax`	`Optional[Axes]`	Optional existing Axes to plot into.	`None`

Returns:

Type	Description
`(fig, ax)`	Matplotlib Figure and Axes.

`plot_mode_P_profile(results, mode_name, ax=None, title=None, lw=0.2)`

For a single mode, compute the clustering profile and plot sorted log P.

`plot_sepLFC_dist(results, mode_name, *, lfc_threshold=10.0, ax=None, title=None)`

For a single mode, plot distribution of sepLFC by 'how many clusters are separated' (index of sorted cluster before the sepLFC gap).

`plot_sepLFC_labels(df_selected, modes, *, sepLFC_threshold=0.0, cmap='Reds', vmin=1e-05, vmax=None, y_max=40.0, hi_sepLFC_threshold=32.0, n_top_hi=15, n_top_lo=8, figsize_scale=0.95, dpi=150)`

For each mode in modes, plot:

a vertical axis of sepLFC values,
a rug plot of all genes with sepLFC > sepLFC_threshold,
labeled horizontal lines for the top sepLFC genes, colored by weighted_Psum,
all panels share a single horizontal colorbar (weighted_Psum) on top.

Parameters:

Name	Type	Description	Default
`df_selected`	`DataFrame`	Wide table with columns like: - weighted_Psum_{mode_name} - sepLFC_{mode_name} and index = gene IDs.	required
`modes`	`sequence of str`	Mode names used to derive the column suffixes.	required
`sepLFC_threshold`	`float`	Only genes with sepLFC > threshold are included per mode.	`0.0`
`cmap`	`str`	Colormap used to encode weighted_Psum.	`"Reds"`
`vmin`	`float`	Lower bound for LogNorm.	`1e-05`
`vmax`	`float`	Upper bound for LogNorm. If None, it is computed from df_selected across all modes and sepLFC > sepLFC_threshold.	`None`
`y_max`	`float`	ymax used for y-axis; also used in label positioning logic.	`40.0`
`hi_sepLFC_threshold`	`float`	If the top sepLFC in a mode exceeds this, up to `n_top_hi` labels per mode are shown; otherwise, up to `n_top_lo`.	`32.0`
`n_top_hi`	`int`	See above.	`15`
`n_top_lo`	`int`	See above.	`8`
`figsize_scale`	`float`	Scale factor for figure width: width = figsize_scale * len(modes).	`0.95`
`dpi`	`int`		`150`

Returns:

Name	Type	Description
`fig`	`Figure`
`axes_by_mode`	`dict`	Mapping mode_name -> Axes for that panel.
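
The label-count rule for `plot_sepLFC_labels` (more labels when the top sepLFC is high) can be sketched for a single mode. `genes_to_label` is a hypothetical helper that takes a plain {gene: sepLFC} dict rather than the wide DataFrame; it illustrates the documented thresholds only, not the plotting.

```python
def genes_to_label(sep_lfc, *, sepLFC_threshold=0.0, hi_sepLFC_threshold=32.0,
                   n_top_hi=15, n_top_lo=8):
    """Pick gene IDs to label for one mode from a {gene: sepLFC} dict."""
    # Only genes with sepLFC strictly above the threshold are considered.
    kept = {g: v for g, v in sep_lfc.items() if v > sepLFC_threshold}
    if not kept:
        return []
    ranked = sorted(kept, key=kept.get, reverse=True)
    # Show up to n_top_hi labels if the top value exceeds the high threshold.
    n_top = n_top_hi if kept[ranked[0]] > hi_sepLFC_threshold else n_top_lo
    return ranked[:n_top]
```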

`plot_spatial_membership(Q, coords, ref_color, *, cls_idx=0, ax=None, val_threshold=0.0, vmin=0.0, vmax=1.0, s=1.0, alpha=1.0, title=None, keep_ticks=False)`

Plot a single colored scatter layer of 2D coordinates weighted by membership.

Parameters:

Name	Type	Description	Default
`Q`	`array-like`	Either an (n_cells, K) membership matrix, or an (n_cells,) vector.	required
`coords`	`array-like`	Either an (n_cells, 2) array of [x, y] coordinates, or a tuple (x, y) of 1D arrays.	required
`ref_color`	`color spec`	Base color for the membership colormap (e.g. cmap(k), 'tab:blue', (r,g,b)).	required
`cls_idx`	`int`	If Q is (n_cells, K), which column to use as membership. Ignored if Q is 1D.	`0`
`ax`	`Axes`	Existing axes to draw on. If None, a new figure and axes are created.	`None`
`val_threshold`	`float`	Only plot points with membership > val_threshold.	`0.0`
`vmin`	`float`	Lower bound of the membership range for colormap normalization.	`0.0`
`vmax`	`float`	Upper bound of the membership range for colormap normalization.	`1.0`
`s`	`float`	Marker size.	`1.0`
`alpha`	`float`	Marker alpha.	`1.0`
`title`	`str`	Title for the axis (only set if not None).	`None`
`keep_ticks`	`bool`	If False (default), remove x/y ticks.	`False`

Returns:

Name	Type	Description
`ax`	`Axes`
`sp`	`PathCollection`	The scatter object.
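
The input handling described for `Q`, `cls_idx`, and `val_threshold` can be sketched with nested lists. `select_membership` and `visible_points` are hypothetical helpers; this sketch only accepts coords as a sequence of (x, y) pairs and does not cover the (x, y) tuple-of-arrays form.

```python
def select_membership(Q, cls_idx=0):
    """Return a flat membership list from a 1D vector or a 2D (n_cells, K) nested list."""
    if Q and isinstance(Q[0], (list, tuple)):
        return [row[cls_idx] for row in Q]
    return list(Q)  # 1D input: cls_idx is ignored


def visible_points(Q, coords, *, cls_idx=0, val_threshold=0.0):
    """Keep (x, y, membership) for cells with membership > val_threshold."""
    membership = select_membership(Q, cls_idx)
    return [(x, y, m) for (x, y), m in zip(coords, membership) if m > val_threshold]
```

Because the comparison is strict, cells with membership exactly equal to the threshold (including 0.0 at the default) are not drawn.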

Model Comparison

plot_comparison.py

Multi-model comparison visualizations.

Classes

Functions

`plot_avg_membership_barh(avg_cls_memberships, *, annot_col='annot', model_order=None, cluster_order=None, cluster_mode='auto', colors=None, figsize_per_cluster=(2.2, 4.0), dpi=150)`

Generalized horizontal grouped bar charts comparing cluster memberships across multiple models.

Parameters:

Name	Type	Description	Default
`avg_cls_memberships`	`Dict[str, DataFrame]`	Dict of {model_name: df}. Each df: rows = annotation groups, columns = clusters, plus an `annot_col` column. Example: {"rna": df_rna, "atac": df_atac, "multiome": df_mo}	required
`annot_col`	`str`	Column name for annotation groups.	`'annot'`
`model_order`	`Optional[Sequence[str]]`	Order of modalities in the legend/hue. If None, uses dict insertion order.	`None`
`cluster_order`	`Optional[Sequence[str]]`	Optional explicit order for clusters (subset will be used).	`None`
`cluster_mode`	`Literal['intersection', 'union', 'auto']`	How to determine clusters across models: - "intersection": use only clusters present in all models - "union": use all clusters across models - "auto": use intersection if non-empty else union	`'auto'`
`colors`	`Optional[Sequence[str]]`	Optional colors used for cluster title text (per cluster index).	`None`
`figsize_per_cluster`	`Tuple[float, float]`	(width_per_cluster, height) for 1 x K layout.	`(2.2, 4.0)`
`dpi`	`int`	Figure DPI.	`150`

Returns:

Type	Description
`(fig, axes)`
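
The three `cluster_mode` options can be sketched on plain lists of cluster names. `resolve_clusters` is a hypothetical helper, and ordering clusters by first appearance is an assumption; the source only defines which clusters are selected, not their order.

```python
def resolve_clusters(clusters_by_model, cluster_mode="auto"):
    """Combine per-model cluster names according to cluster_mode."""
    seen = []  # union, in order of first appearance (assumed ordering)
    for names in clusters_by_model.values():
        for name in names:
            if name not in seen:
                seen.append(name)
    common = [c for c in seen
              if all(c in names for names in clusters_by_model.values())]
    if cluster_mode == "intersection":
        return common
    if cluster_mode == "union":
        return seen
    if cluster_mode == "auto":
        # Use the intersection if it is non-empty, else fall back to the union.
        return common if common else seen
    raise ValueError(f"unknown cluster_mode: {cluster_mode!r}")
```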

`plot_compmodels_Q_grid(comp_res, coords, models=None, models_plot_order=None, val_threshold=0.5, s=0.05, colors=None, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92)`

Plot membership on 2D coords (e.g. UMAP) for all modes in each model.

Layout: columns = models, rows = modes within each model.

Parameters:

Name	Type	Description	Default
`comp_res`	`CompModelsResults`	Loaded comparison results (from io.load_compmodels_results).	required
`coords`	`array-like`	(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode.	required
`models`	`list of str`	Subset of models to include; defaults to all in comp_res.models.	`None`
`models_plot_order`	`list of str`	Order of columns; if None, uses `models`.	`None`
`val_threshold`	`float`	Membership threshold below which points are omitted for each cluster.	`0.5`
`s`	`float`	Marker size passed to plot_spatial_membership.	`0.05`
`colors`	`Sequence`	Sequence of colors used for clusters; default is tab20.	`None`
`figsize_scale`	`(float, float)`	Scale factors for figure size: (width_per_col, height_per_row).	`(2.5, 2.0)`
`suptitle`	`str`	Overall figure title.	`None`
`y_suptitle`	`float`	y position of suptitle.	`0.92`

`plot_compmodels_Q_selected(comp_res, coords, model_mode_list, *, n_rows=None, n_cols=None, val_threshold=0.5, s=0.05, colors=None, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92)`

Plot membership on 2D coords (e.g. UMAP) for a selected set of modes.

Layout: one panel per (model, mode) in model_mode_list. Grid size can be specified by n_rows / n_cols; otherwise defaults to a single row.

Parameters:

Name	Type	Description	Default
`comp_res`	`CompModelsResults`	Loaded comparison results (from io.load_compmodels_results). Must have attributes: - Q_by_mode : dict[full_mode_name -> ndarray (n_cells, K)] - mode_stats_by_model : dict[model_name -> DataFrame] with index 'Mode' (short_mode) and column 'Size'	required
`coords`	`array-like`	(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode.	required
`model_mode_list`	`sequence of (model_name, short_mode)`	List of specific modes to plot, e.g. [("rna.seurat.louvain", "K20M1"), ("rna.seurat.louvain", "K20M2"), ("rna.scanpy.leiden", "K18M1")]	required
`n_rows`	`int`	Number of rows in the grid. If None and n_cols is None, uses 1 row.	`None`
`n_cols`	`int`	Number of columns in the grid. If None and n_rows is None, uses len(model_mode_list) columns (single row).	`None`
`val_threshold`	`float`	Membership threshold below which points are omitted for each cluster.	`0.5`
`s`	`float`	Marker size passed to plot_spatial_membership.	`0.05`
`colors`	`Sequence`	Sequence of colors used for clusters; default is tab20.	`None`
`figsize_scale`	`(float, float)`	Scale factors for figure size: (width_per_col, height_per_row).	`(2.5, 2.0)`
`suptitle`	`str`	Overall figure title.	`None`
`y_suptitle`	`float`	y position of suptitle.	`0.92`

Returns:

Type	Description
`(fig, axes_by_model_mode)`	fig : matplotlib.figure.Figure axes_by_model_mode : dict[(model_name, short_mode) -> Axes]
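
The grid-size defaults for `plot_compmodels_Q_selected` can be sketched as below. `resolve_grid` and `grid_figsize` are hypothetical helpers. The single-row default and the `figsize_scale` meaning come from the parameter table; filling in the missing dimension with a ceiling division when only one of n_rows / n_cols is given is an assumption.

```python
import math


def resolve_grid(n_panels, n_rows=None, n_cols=None):
    """Return (n_rows, n_cols) for n_panels panels."""
    if n_rows is None and n_cols is None:
        return 1, n_panels  # documented default: a single row
    if n_rows is None:
        return math.ceil(n_panels / n_cols), n_cols  # assumed completion
    if n_cols is None:
        return n_rows, math.ceil(n_panels / n_rows)  # assumed completion
    if n_rows * n_cols < n_panels:
        raise ValueError("grid is too small for the requested panels")
    return n_rows, n_cols


def grid_figsize(n_rows, n_cols, figsize_scale=(2.5, 2.0)):
    """Figure size from (width_per_col, height_per_row) scale factors."""
    width_per_col, height_per_row = figsize_scale
    return n_cols * width_per_col, n_rows * height_per_row
```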

`plot_compmodels_alignment_by_model(comp_res, cmap=None, *, models=None, models_plot_order=None, row_by_K=False, wspace_padding=1.3, marker_size=200.0, alt_ls=False, ls_alt=('-', '--'), lw=1.0, connect_identity=False, adjacent_only=True, label_modes=True, figsize_scale=(0.3, 2), dpi=150, pair_mappings=None)`

Plot alignment between multiple models in a single graph.

Modes can be arranged in rows either by:

- mode index within each model (row_by_K=False), or
- grouped by K across models (row_by_K=True), so that modes with the same K value line up on the same row "band" across models.

When row_by_K=True:

- For each K, determine the maximum number of modes with that K across all selected models.
- Allocate that many rows for that K.
- If a model has fewer modes for that K, the corresponding slots are left empty (no markers drawn).

Parameters:

Name	Type	Description	Default
`comp_res`	`CompModelsResults`	Must provide: - models - modes_by_model: dict[model -> list[str]] (short mode names) - full_mode_names: list[str] (e.g. "rna.seurat.K21M1") - all_modes_alignment: dict[full_mode_name -> list[int]] - alignment_across_all: dict["A-B" -> mapping list[int]] - K_max: int - get_Q(full_mode_name) or Q_by_mode[full_mode_name]	required
`cmap`		Either: - a matplotlib colormap (e.g. cm.get_cmap("tab20")) - a sequence of RGB tuples - None (defaults to tab20).	`None`
`models`	`sequence of str`	Subset of models to include. Defaults to all comp_res.models.	`None`
`models_plot_order`	`sequence of str`	Order of columns. Defaults to `models`.	`None`
`row_by_K`	`bool`	If True, modes are grouped by K across models; only modes with the same K appear in the same row band. If False, rows are mode index per model.	`False`
`wspace_padding`	`float`	Horizontal spacing factor between model columns, scaled by K_max.	`1.3`
`marker_size`	`float`	Size of the cluster markers.	`200.0`
`alt_ls`	`bool`	If True, use `ls_alt` to style edges for better visibility.	`False`
`ls_alt`	`sequence of str`	Line styles; ls_alt[0] used for non-identity edges, ls_alt[1] used for identity edges (if connect_identity=True).	`('-', '--')`
`lw`	`float`	Line width for edges.	`1.0`
`connect_identity`	`bool`	If True, also draw thin light-grey (or ls_alt[1]) lines for identity mappings (same aligned column index). If False, only draw non-identity.	`False`
`adjacent_only`	`bool`	If True, draw edges only between modes in adjacent model columns (to reduce clutter). If False, draw edges between any model pair.	`True`
`label_modes`	`bool`	If True, write mode labels near each block; column headers = models.	`True`
`figsize_scale`	`(float, float)`	Scale factors for figure size: (width_per_K, height_per_row). Width = n_models * K_max * width_per_K Height = n_rows * height_per_row	`(0.3, 2)`
`dpi`	`int`	Figure dpi.	`150`
`pair_mappings`	`dict`	Optional within-model pair mappings ("A-B" -> list[(col_idx_A, col_idx_B)]) to draw extra edges between successive modes of the same model.	`None`

Returns:

Name	Type	Description
`fig`	`Figure`	The figure object.
`ax`	`Axes`	The axes.
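
The row allocation for row_by_K=True and the documented figure-size formula can be sketched as follows. `rows_by_K` and `figure_size` are hypothetical helpers, and the input shape (a dict from model name to the list of K values of its modes) is an assumption made to keep the sketch independent of `CompModelsResults`.

```python
from collections import Counter


def rows_by_K(mode_K_by_model):
    """Given {model: [K of each mode]}, return {K: number of rows allocated}."""
    per_model = [Counter(ks) for ks in mode_K_by_model.values()]
    all_K = sorted(set().union(*per_model))
    # For each K, allocate as many rows as the model with the most modes at that K.
    return {k: max(c[k] for c in per_model) for k in all_K}


def figure_size(n_models, K_max, n_rows, figsize_scale=(0.3, 2)):
    """Width = n_models * K_max * width_per_K; Height = n_rows * height_per_row."""
    width_per_K, height_per_row = figsize_scale
    return n_models * K_max * width_per_K, n_rows * height_per_row
```

The total number of rows is the sum of the per-K allocations; models with fewer modes at a given K leave the remaining slots in that band empty.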

`plot_compmodels_alignment_list(comp_res, cmap=None, marker_size=250, figsize=(6, 6))`

Plot the CompModels alignment pattern list via clumppling.plot_alignment_list, using K-grouped ordering to avoid a KeyError.

Requires comp_res to have:

- full_mode_names
- alignment_across_all
- all_modes_alignment
- get_Q(full_mode_name)

`plot_compmodels_diff_grid(comp_res, pair_mappings, coords, ref_mode, models_plot_order=None, val_threshold=0.5, diff_threshold=0.5, *, colors=None, s=0.05, alpha=0.6, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92, strict_pair_mapping=True)`

Plot difference in membership on 2D coords for all modes across models.

Use map_alt_to_ref to compute aligned differences.
For non-ref panels, plot a single overlaid diff scatter (per-cell).
Compute Δ = fraction(per_cell_diff > diff_threshold)

Parameters:

Name	Type	Description	Default
`comp_res`	`CompModelsResults`	Loaded comparison results (from io.load_compmodels_results).	required
`pair_mappings`	`dict`	Dict mapping "ref_mode-alt_mode" -> list of (ref_k, alt_k) tuples.	required
`coords`	`array-like`	(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode.	required
`ref_mode`	`str`	Full mode name (e.g. "model_shortmode") to use as reference.	required
`models_plot_order`	`list of str`	Order of models (columns); if None, uses all models in comp_res.	`None`
`val_threshold`	`float`	Membership threshold below which points are omitted for each cluster.	`0.5`
`diff_threshold`	`float`	Threshold for difference in membership to consider significant.	`0.5`
`colors`	`Sequence`	Sequence of colors used for clusters; default is tab20.	`None`
`s`	`float`	Marker size passed to plot_spatial_membership.	`0.05`
`alpha`	`float`	Alpha value for scatter points.	`0.6`
`figsize_scale`	`(float, float)`	Scale factors for figure size: (width_per_col, height_per_row).	`(2.5, 2.0)`
`suptitle`	`str`	Overall figure title.	`None`
`y_suptitle`	`float`	y position of suptitle.	`0.92`
`strict_pair_mapping`	`bool`	If True, raise an error if a required pair mapping is missing.	`True`
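
The summary statistic Δ = fraction(per_cell_diff > diff_threshold) can be written directly. `delta_fraction` is a hypothetical helper that takes the per-cell differences as given; how those differences are derived from the aligned memberships is not specified here, so it is left out of the sketch.

```python
def delta_fraction(per_cell_diff, diff_threshold=0.5):
    """Fraction of cells whose membership difference exceeds diff_threshold."""
    if not per_cell_diff:
        return 0.0  # assumption: an empty input yields 0 rather than an error
    n_over = sum(d > diff_threshold for d in per_cell_diff)
    return n_over / len(per_cell_diff)
```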

`plot_compmodels_diff_selected(comp_res, pair_mappings, coords, ref_mode, model_mode_list, *, n_rows=None, n_cols=None, val_threshold=0.5, diff_threshold=0.5, colors=None, s=0.05, alpha=0.6, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92, strict_pair_mapping=True)`

Plot difference in membership on 2D coords for a selected set of modes.

Layout: one panel per (model, mode) in model_mode_list. Grid size can be specified by n_rows / n_cols; otherwise defaults to a single row.

Parameters follow the same pattern as `plot_compmodels_diff_grid`.

`plot_discrete_colorbar(colors, K_max=None, *, labels=None, ax=None, figsize=None, dpi=150, facecolor='white')`

Plot a simple discrete colorbar-like strip for cluster colors.

Parameters:

Name	Type	Description	Default
`colors`	`Sequence[ColorSpec]`	A sequence of color specs. Can be: - list of RGB tuples (0-1 range) - hex strings - named matplotlib colors	required
`K_max`	`Optional[int]`	Number of clusters. If None, inferred as len(colors).	`None`
`labels`	`Optional[Sequence[str]]`	X tick labels. If None, defaults to ["Cls.1", ..., "Cls.K"].	`None`
`ax`	`Optional[Axes]`	Existing axes to draw on. If None, a new figure/axes is created.	`None`
`figsize`	`Optional[Tuple[float, float]]`	Figure size (only used if ax is None). Default scales with K.	`None`
`dpi`	`int`	Figure dpi (only used if ax is None).	`150`
`facecolor`	`str`	Figure/axes facecolor.	`'white'`

Returns:

Type	Description
`(fig, ax, im)`	im is the AxesImage returned by imshow.
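
The defaults for `K_max` and `labels` in `plot_discrete_colorbar` can be sketched without plotting. `colorbar_spec` is a hypothetical helper, and truncating `colors` to the first K_max entries when K_max is smaller than len(colors) is an assumption.

```python
def colorbar_spec(colors, K_max=None, labels=None):
    """Return (colors_used, tick_labels) for a discrete colour strip."""
    k = len(colors) if K_max is None else K_max  # K_max inferred as len(colors)
    if labels is None:
        labels = [f"Cls.{i}" for i in range(1, k + 1)]  # "Cls.1" ... "Cls.K"
    # Assumption: only the first k colours are shown.
    return list(colors)[:k], list(labels)
```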

`plot_feature_cluster_panels(results, coords, df_pvs_modes, selected_feature, *, modes=None, colors=None, plot_both_sides=False, val_threshold=0.0, w_scale=1.2, h_scale=1.4, dpi=150, suptitle=None)`

Plot spatial membership for separated clusters for a single focal gene across multiple modes.

Parameters:

Name	Type	Description	Default
`results`		Object with `Q_by_mode[mode] -> Q (n_cells, K)`.	required
`coords`	`array-like`	(n_cells, 2) or (x, y) tuple for spatial / UMAP coordinates.	required
`df_pvs_modes`	`dict[str, DataFrame]`	Mapping: mode_name -> DataFrame with index including `selected_feature` and a column 'sepCls' that stores (group0, group1) lists of 0-based cluster indices.	required
`selected_feature`	`str`	Feature name / index key used in df_pvs_modes[mode].loc[selected_feature].	required
`modes`	`sequence of str`	Subset / order of modes to plot. Defaults to all keys in df_pvs_modes that contain `selected_feature`.	`None`
`colors`		Either a sequence of colors indexable by cluster index, or a colormap. If None, defaults to tab20.	`None`
`plot_both_sides`	`bool`	If False: plot only the “fewer” side clusters across modes in a big [modes × all_sepCls] grid. If True: for each mode, left = sepCls[0], right = sepCls[1], separated by a vertical dashed line.	`False`
`val_threshold`	`float`	Membership threshold passed to plot_spatial_membership.	`0.0`
`w_scale`	`float`	Width scaling factor for figure size.	`1.2`
`h_scale`	`float`	Height scaling factor for figure size.	`1.4`
`dpi`	`int`	Figure DPI.	`150`
`suptitle`	`str or None`	Optional figure-level title.	`None`

Returns:

Name	Type	Description
`fig`	`Figure`
`axes`	`dict[(mode_name, col_idx) -> Axes]`

`plot_feature_count(feature_counts, coords, *, feature_name='', log_transformed=True, vmax=6, vmin=None, size=5, cmap='RdYlBu_r', cbar_loc='bottom', cbar_label=None, ax=None)`

Plot a single gene's expression over 2D coordinates.

Parameters:

Name	Type	Description	Default
`feature_counts`	`array-like or sparse`	Per-cell values for one feature. Shape (n_cells,) or (n_cells, 1). If sparse, will be densified.	required
`coords`	`ndarray`	2D coordinates of shape (n_cells, 2).	required
`feature_name`	`str or None`	Title annotation.	`''`
`log_transformed`	`bool`	If False, apply log1p to feature_counts. If True, assume feature_counts already on log scale.	`True`
`vmax`	`float or None`	Color max. If None or 0, inferred from data.	`6`
`vmin`	`float or None`	Color min. If None, inferred by matplotlib.	`None`
`size`	`float`	Point size.	`5`
`cmap`	`str`	Colormap name.	`'RdYlBu_r'`
`cbar_loc`	`('bottom', 'top', 'left', 'right')`	Colorbar location.	`"bottom"`
`cbar_label`	`str or None`	Overrides default colorbar label.	`None`
`ax`	`Axes or None`	Existing axis to draw on.	`None`

Returns:

Type	Description
`Figure`
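
The value preparation implied by `log_transformed` and `vmax` can be sketched on a plain list. `prepare_values` is a hypothetical helper, and taking the data maximum when vmax is None or 0 is an assumption about what "inferred from data" means.

```python
import math


def prepare_values(feature_counts, log_transformed=True, vmax=6):
    """Return (values, vmax) ready for colouring a scatter plot."""
    values = [float(v) for v in feature_counts]
    if not log_transformed:
        values = [math.log1p(v) for v in values]  # raw counts: apply log1p
    if vmax is None or vmax == 0:
        vmax = max(values)  # assumption: infer the colour max from the data
    return values, vmax
```

Note the direction of the flag: log_transformed=True means the input is already on a log scale and is left unchanged.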

`plot_group_diff(df_mode_group_diff, *, mode_sizes=None, annotation_group_sizes=None, ref_mode=None, show_top=True, show_left=True, annot=None, cmap='Reds', cbar_label='Fraction of different cells', top_ylabel='#cells in the group', left_xlabel='Mode size', x_label='Annotation groups', y_label='Modes', figsize=(10, 8), dpi=300, height_ratios=(1, 6), width_ratios=(1.5, 6), wspace=0.01, hspace=0.01, vmin=0.0, vmax=1.0, cbar_fraction=0.6, xtick_rotation=45, xtick_fontsize=8, ytick_fontsize=8, label_fontsize=7, border_width=0.5, border_color='lightgray', zero_label_eps=0.001, top_round_to=500, show_mode_size_labels=True, add_model_separators=True, model_sep_kwargs=None)`

Plot a heatmap of mode-by-annotation-group differences. Optionally add marginal bar plots:

- Top: annotation group sizes
- Left: mode sizes

Parameters:

Name	Type	Description	Default
`df_mode_group_diff`	`DataFrame`	DataFrame with index=modes and columns=annotation groups.	required
`mode_sizes`	`Optional[Series]`	Series of mode sizes indexed by FULL mode names. Required if show_left=True.	`None`
`annotation_group_sizes`	`Optional[Series]`	Series of group sizes indexed by group names. Required if show_top=True.	`None`
`ref_mode`	`Optional[str]`	If provided, highlights this row label in red/bold.	`None`
`show_top`	`bool`	Whether to draw the top marginal bars (annotation group sizes).	`True`
`show_left`	`bool`	Whether to draw the left marginal bars (mode sizes).	`True`

Returns:

Type	Description
`(fig, axes)`	axes is a dict with keys: "heatmap", "top", "left"

`plot_mapping_alignment(*, pair_mapping, ref_K, alt_K, ref_mode, alt_mode, colors, figsize=(5, 2), dpi=150, node_size=150, node_edgecolor='black', node_linewidth=0.5, line_color='k', line_alpha=0.5, line_lw=1.0, ax=None, title=None)`

Plot a simple two-row pair-mapping alignment diagram.

Parameters:

Name	Type	Description	Default
`pair_mapping`	`Sequence[Tuple[int, int]]`	Sequence of (c_ref, c_alt) index pairs. Assumes ref row at y=1, alt row at y=0.	required
`ref_K`	`int`	Number of clusters in the ref space; used for the x-limit.	required
`alt_K`	`int`	Number of clusters in the alt space; used for the x-limit.	required
`ref_mode`	`str`	Label for the ref row, used on the y-axis and in the title.	required
`alt_mode`	`str`	Label for the alt row, used on the y-axis and in the title.	required
`colors`	`Union[Sequence, Mapping[int, str]]`	Colors indexed by cluster id. Can be a list/tuple or dict.	required
`ax`	`Optional[Axes]`	If provided, draws into existing axis.	`None`

Returns:

Type	Description
`(fig, ax)`

`plot_mapping_grid(*, ref_Q, alt_Q, pair_mappings, ref_mode, alt_mode, coords, colors, show=('alt',), dpi=150, s=0.5, figsize_scale=(2.0, 2.0), strict_pair_mapping=True, connect_lines=True, connect_color='k', connect_alpha=0.25, connect_lw=0.8)`

Plot reference/alt Qs, mapped alt, and per-column abs differences.

Row order (when included):

0) reference (the smaller-K space used for mapping)
1) mapped alt (larger-K mapped into smaller-K space)
2) original alt (the larger-K Q)
3) diff (abs(reference - mapped_alt))

The show argument controls which of: {"alt", "mapped_alt", "diff"} are added in addition to the reference row.

Default: show=("alt",) -> reference + original alt
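
A minimal NumPy sketch of how the row list and the diff row could be derived. The helper names `rows_to_plot` and `map_alt_to_ref` are hypothetical, and summing alt columns into the reference column they map to is an assumption about what "mapped alt" means here, not confirmed behavior of `plot_mapping_grid`.

```python
import numpy as np

# Documented order of the optional rows that follow the reference row.
ROW_ORDER = ("mapped_alt", "alt", "diff")


def rows_to_plot(show=("alt",)):
    """Reference row first, then the requested extras in the documented order."""
    return ["reference"] + [name for name in ROW_ORDER if name in show]


def map_alt_to_ref(alt_Q, pair_mapping, ref_K):
    """Collapse the larger-K alt Q into the smaller-K reference space.

    Assumption: each alt column is added to the reference column it maps to.
    """
    mapped = np.zeros((alt_Q.shape[0], ref_K))
    for c_ref, c_alt in pair_mapping:
        mapped[:, c_ref] += alt_Q[:, c_alt]
    return mapped


# Toy data: 2 cells, reference K=2, alt K=3.
ref_Q = np.array([[0.7, 0.3], [0.2, 0.8]])
alt_Q = np.array([[0.6, 0.1, 0.3], [0.1, 0.2, 0.7]])
pair_mapping = [(0, 0), (0, 1), (1, 2)]  # (c_ref, c_alt) index pairs

mapped_alt = map_alt_to_ref(alt_Q, pair_mapping, ref_K=2)
diff = np.abs(ref_Q - mapped_alt)  # per-column absolute difference
```

With the default `show=("alt",)`, `rows_to_plot` yields only the reference and original alt rows, matching the default described above.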

`plot_membership_reordered(P, cmap, lbs, ax, title='', annot='')`

Plot membership with reordered cluster indices.

Parameters:

Name	Type	Description	Default
`P`	`np.ndarray, shape (n_samples, n_clusters)`	Membership matrix.	required
`cmap`	`list of colors`	Colors for each cluster.	required
`lbs`	`array-like, shape (n_samples,)`	Labels for each sample (used to group samples).	required
`ax`	`matplotlib.axes.Axes`	Axis to plot on.	required
`title`	`str`	Y-axis label.	`''`
`annot`	`str`	Annotation text (shown at top-right).	`''`

`plot_model_diff_heatmap(cross_model_overall_membership_diff, comp_res, models, *, figsize=(9, 8), dpi=150, cmap='Reds', decimals=2, vmin=0.0, vmax=1.0, linewidths=0.5, linecolor='white', ax=None, cbar=True, annot_size=8, tight_layout=True, show=False)`

Plot a cross-model overall membership difference heatmap.

Parameters:

Name	Type	Description	Default
`cross_model_overall_membership_diff`	`Union[Mapping[Tuple[str, str], float], Series]`	Dict-like or pd.Series with keys as (mode_name_model0, mode_name_model1) and values in [0, 1].	required
`comp_res`	`Any`	An object that contains: comp_res.full_mode_names_by_model[model_name] -> ordered list of full mode names. This ordering is used to reindex rows/cols (no lexical sorting issues like K10 vs K3).	required
`models`	`Sequence[str]`	Sequence of two model names in the same order used in the diff keys.	required
`figsize`	`Tuple[float, float]`	Seaborn/Matplotlib styling options.	`(9, 8)`
`dpi`	`int`	Seaborn/Matplotlib styling options.	`150`
`cmap`	`str`	Seaborn/Matplotlib styling options.	`'Reds'`
`annot`		Seaborn/Matplotlib styling options.	
`vmin`	`float`	Seaborn/Matplotlib styling options.	`0.0`
`vmax`	`float`	Seaborn/Matplotlib styling options.	`1.0`
`linewidths`	`float`	Seaborn/Matplotlib styling options.	`0.5`
`linecolor`	`str`	Seaborn/Matplotlib styling options.	`'white'`
`ax`	`Optional[Axes]`	If provided, plot into this axis. Otherwise create a new figure/axis.	`None`
`cbar`	`bool`	Whether to show colorbar.	`True`
`tight_layout`	`bool`	Whether to call plt.tight_layout().	`True`
`show`	`bool`	If True, calls plt.show().	`False`

Returns:

Type	Description
`(fig, ax, mat)`	The figure, axis, and the reindexed matrix used for the heatmap.
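
The reindexing step can be illustrated with a small pure-Python sketch. The function `build_diff_matrix` and the mode names are hypothetical; the point is only that rows and columns follow the supplied ordering rather than a lexical sort, which would place K10 before K3.

```python
def build_diff_matrix(diffs, row_modes, col_modes):
    """Arrange pairwise diffs into a row_modes x col_modes nested list.

    Pairs missing from `diffs` become NaN.
    """
    nan = float("nan")
    return [[diffs.get((r, c), nan) for c in col_modes] for r in row_modes]


# Hypothetical full mode names, listed in the intended (by-K) order.
row_modes = ["A_K3M1", "A_K10M1"]
col_modes = ["B_K3M1", "B_K10M1"]

diffs = {
    ("A_K3M1", "B_K3M1"): 0.1,
    ("A_K3M1", "B_K10M1"): 0.4,
    ("A_K10M1", "B_K3M1"): 0.5,
    ("A_K10M1", "B_K10M1"): 0.2,
}

mat = build_diff_matrix(diffs, row_modes, col_modes)

# A plain lexical sort would reorder the rows, which is what the
# explicit ordering avoids.
lexical = sorted(row_modes)
```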

`plot_model_diff_summary(comp_res, mat_diffs, coords, models_plot_order=None, *, colors=None, figsize_scale=(2.5, 2.5), diff_cmap='RdPu', diff_vmin=0.0, diff_vmax=1.0, point_size=2.0, alpha=1.0, suptitle=None)`

For each model, plot:

- Top row: major mode clustering (largest Size) on 2D coords, colored by discrete cluster labels using colors, analogous to plot_compmodels_Q_grid.
- Bottom row: per-cell weighted average difference vs reference, aggregated across modes and weighted by mode size.

Parameters:

Name	Type	Description	Default
`comp_res`		Object containing compModels results. Must have: `modes_by_model` (Dict[str, List[str]]; short mode names, e.g. "K20M1"), `mode_stats_by_model` (Dict[str, DataFrame] with columns ['Mode', 'Size']), and `get_Q(full_mode_name)` returning an np.ndarray (n_cells x K).	required
`mat_diffs`	`dict`	Nested dict of diff matrices, typically from get_diff_matrices: `mat_diffs[model_name][short_mode] = diff_Q` where diff_Q has shape (n_cells, K_eff).	required
`coords`	`array-like or (x, y)`	2D coordinates per cell. Either an array of shape (n_cells, 2), or a tuple/list (x, y) of 1D arrays.	required
`models_plot_order`	`sequence of str`	Order of models (columns). Defaults to list(mat_diffs.keys()).	`None`
`colors`	`sequence`	Sequence of discrete colors used for clusters in the TOP row, same semantics as in plot_compmodels_Q_grid. If None, defaults to tab20 colors.	`None`
`figsize_scale`	`(float, float)`	(width_per_model, height_per_row) used to derive overall figure size.	`(2.5, 2.5)`
`diff_cmap`	`str`	Colormap for the weighted difference panel (bottom row).	`"RdPu"`
`diff_vmin`	`float`	Lower limit (vmin) of the difference colormap.	`0.0`
`diff_vmax`	`float`	Upper limit (vmax) of the difference colormap.	`1.0`
`point_size`	`float`	Scatter point size.	`2.0`
`alpha`	`float`	Scatter alpha.	`1.0`
`suptitle`	`str or None`	Optional figure-level title.	`None`

Returns:

Type	Description
`(fig, axes)`	Matplotlib Figure and Axes array of shape (2, n_models).
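
A rough sketch of the size-weighted aggregation behind the bottom row. It assumes each mode's diff matrix has already been reduced to one value per cell; how `plot_model_diff_summary` performs that reduction is not stated here. The helper names and toy values are hypothetical.

```python
import numpy as np


def weighted_cell_diff(per_mode_diff, mode_sizes):
    """Average per-cell diffs across modes, weighting each mode by its size."""
    modes = list(per_mode_diff)
    stacked = np.stack([per_mode_diff[m] for m in modes])  # (n_modes, n_cells)
    weights = np.array([mode_sizes[m] for m in modes], dtype=float)
    return np.average(stacked, axis=0, weights=weights)


def major_mode(mode_sizes):
    """The mode with the largest Size, as used for the top row."""
    return max(mode_sizes, key=mode_sizes.get)


# Toy example: two modes, two cells.
per_mode_diff = {
    "K3M1": np.array([0.0, 0.4]),
    "K3M2": np.array([1.0, 0.0]),
}
mode_sizes = {"K3M1": 3, "K3M2": 1}

cell_diff = weighted_cell_diff(per_mode_diff, mode_sizes)
top_mode = major_mode(mode_sizes)
```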

`plot_spatial_structure_grid(results, coords, grps, *, modes=None, cmap=None, mode_labels=None, grp_seps=None, reorder_cls=True, s=1.0, alpha=1.0, vmin=0.0, vmax=1.0, figsize=None, dpi=150)`

Optimized spatial + structure membership grid.

Parameters:

Name	Type	Description	Default
`results`	`ClumpplingResults`		required
`coords`	`(n_cells, 2)`	2D coordinates per cell.	required
`grps`		Per-cell group labels for ordering the 1D trace.	required
`modes`		Optional subset of modes (defaults to `results.modes`).	`None`
`cmap`	`Optional[Any]`	One of: `None` (auto tab20 colors); a list/tuple of colors of length >= K_max; or a matplotlib colormap callable.	`None`
`grp_seps`	`Optional[Sequence[float]]`	Optional separators for group boundaries on the structure plot. If None, computed from sorted grps.	`None`
`reorder_cls`	`bool`	If True, place clusters by aligned index.	`True`

`plot_structure_one_level(results, *, modes=None, cmap=None, grp_labels=(), mode_labels=None, reorder_clsind=True, grp_seps_ymin=-0.2, lb_suffix_sep=None, figsize=None, dpi=150, x_rot=0, x_ha='center')`

One-level group version of plot_structure_modes.

- Pulls Q matrices from results.Q_by_mode.
- Computes grp_info inside using get_uniq_lb_sep.
- Works for any single-level labels (e.g., sample group / batch / cell type).
- Optionally reorders samples by grp_labels via plot_membership_reordered.

Parameters:

Name	Type	Description	Default
`results`		ClumpplingResults-like object with attributes `Q_by_mode` (dict[mode_name -> (n_cells, K) array]) and `modes` (sequence of mode names, used if `modes` is None).	required
`modes`	`Optional[Sequence[str]]`	Which modes to plot. If None, uses `results.modes`.	`None`
`cmap`		Colormap list passed to plot_membership / plot_membership_reordered.	`None`
`grp_labels`	`Sequence[str]`	Group labels per sample (length n_cells).	`()`
`mode_labels`	`Optional[Sequence[str]]`	Labels for each mode row (defaults to `modes` if None or wrong length).	`None`
`reorder_clsind`	`bool`	If True, use plot_membership_reordered(Q, cmap, grp_labels, ...); otherwise use plot_membership(Q, cmap, ...).	`True`
`grp_seps_ymin`	`float`	How far separator lines extend below axis (in axis fraction).	`-0.2`
`lb_suffix_sep`	`Optional[str]`	Optional separator; if provided, only the suffix (after lb_suffix_sep) is used in x tick labels.	`None`
`figsize`	`Optional[Tuple[float, float]]`	Figure size (width, height). If None, chosen based on number of modes.	`None`
`dpi`	`int`	Figure DPI.	`150`

Returns:

Name	Type	Description
`fig`	`Figure`

`plot_structure_two_level(results, *, modes=None, cmap=None, grp_labels=(), supgrp_labels=None, mode_labels=None, reorder_clsind=True, grp_seps_ymin=-0.2, supgrp_seps_ymin=-0.6, lb_suffix_sep=None, figsize=None, dpi=150)`

Two-level group version of plot_structure_modes.

- Pulls Q matrices from results.Q_by_mode.
- Computes grp_info inside using get_uniq_lb_sep.
- Works for any two-level labels (grp + optional supgrp).
- Optionally reorders samples by (supgrp, grp).

Parameters:

Name	Type	Description	Default
`results`	`ClumpplingResults`	ClumpplingResults.	required
`modes`	`Optional[Sequence[str]]`	Which modes to plot. If None, uses `results.modes`.	`None`
`cmap`		Colormap list passed to plot_membership.	`None`
`grp_labels`	`Sequence[str]`	Lower-level labels per sample (length n_cells).	`()`
`supgrp_labels`	`Optional[Sequence[str]]`	Higher-level labels per sample (length n_cells), optional.	`None`
`grp_seps_ymin`	`float`	How far group separator lines extend below axis.	`-0.2`
`supgrp_seps_ymin`	`float`	How far supergroup separator lines extend below axis.	`-0.6`

`strip_leading_zero(x, decimals=2)`

Format a float to a string with given decimals, stripping leading zero.
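
A minimal sketch of the formatting this describes. The name `strip_leading_zero_sketch` is hypothetical, and the handling of negative values and of values with a non-zero integer part is an assumption rather than documented behavior.

```python
def strip_leading_zero_sketch(x, decimals=2):
    """Format x with fixed decimals and drop the zero before the decimal point."""
    s = f"{x:.{decimals}f}"
    if s.startswith("0."):
        return s[1:]  # "0.25" -> ".25"
    if s.startswith("-0."):
        return "-" + s[2:]  # "-0.25" -> "-.25" (assumed behavior)
    return s  # e.g. "1.50" has no leading zero to strip


labels = [strip_leading_zero_sketch(v) for v in (0.25, 0.5, 1.5)]
```

Compact labels of this kind are commonly used for heatmap annotations where every value lies in [0, 1].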

Gene Set Enrichment

plot_enrichment.py

Gene set enrichment visualizations.

Classes

Functions

`plot_LFC_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap)`

Fill a mode-grid figure with pairwise LFC z-score heatmaps.

Iterates over modes and calls plot_pairwise_heatmap with the LFC z-score matrix into the corresponding axes.

Parameters:

Name	Type	Description	Default
`res_by_mode`	`dict`	Output of `run_gs_enrichment`; must contain a `"lfc_res"` key for each mode.	required
`ax_by_mode`	`dict`	Mapping `mode -> matplotlib.axes.Axes`.	required
`results`	`ClumpplingResults`	Used to look up K and generate cluster labels.	required
`cb_cmap`	`list`	Per-cluster color list (passed to `_mode_cluster_labels`).	required

`plot_LFC_enrichment_heatmap(res_by_mode, results, value='z', sig_level=0.05, cmap='coolwarm', center_zero=True, figsize=None, dpi=150, title=None, ax=None)`

Single heatmap of pairwise LFC enrichment across all modes.

Rows = modes, columns = cluster pairs (i < j) ordered lexicographically up to K_max. The same pair (e.g. C1 vs C2) occupies the same column for every mode, making it easy to compare enrichment of a given pair across modes. Cells are NaN (blank) for pairs that exceed a mode's K. Cells where q < sig_level are annotated with *.

Pairs are grouped visually by their first cluster index with light vertical separators; the secondary x-axis labels each group with its first cluster followed by "v" (e.g. C1v).
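
The column layout described above can be sketched in a few lines. The helper names are hypothetical and cluster indices are taken as 1-based to match the "C1 vs C2" labels; the figure-size formula is the documented default for `figsize`.

```python
import math
from itertools import combinations


def pair_columns(k_max):
    """All cluster pairs (i, j) with i < j, in lexicographic order."""
    return list(combinations(range(1, k_max + 1), 2))


def mode_row(values_by_pair, k_max):
    """One heatmap row; pairs the mode does not have become NaN (blank)."""
    return [values_by_pair.get(pair, math.nan) for pair in pair_columns(k_max)]


def default_figsize(n_pairs, n_modes):
    """Default figure size when figsize is None."""
    return (max(6, 0.55 * n_pairs), 0.45 * n_modes + 1.5)


columns = pair_columns(3)
# A K=2 mode only has the pair (1, 2); the other columns stay blank.
row_k2 = mode_row({(1, 2): 1.5}, k_max=3)
```

Because every mode uses the same column list, a given pair always lands in the same column regardless of the mode's K.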

Parameters:

Name	Type	Description	Default
`res_by_mode`	`dict`	Output of `run_gs_enrichment`: `{mode: {lfc_res, ...}}`.	required
`results`		Clumppling results object with `.modes` and `.mode_K` attributes.	required
`value`	`('z', 'obs')`	Which LFC quantity to colour: z-score or observed LFC.	`"z"`
`sig_level`	`float`	Significance threshold for * annotation (applied to q values).	`0.05`
`cmap`	`str`		`'coolwarm'`
`center_zero`	`bool`	Symmetric colour scale around 0.	`True`
`figsize`	`tuple`	Defaults to `(max(6, 0.55 * n_pairs), 0.45 * n_modes + 1.5)`.	`None`
`title`	`str`		`None`

`plot_P_enrichment_by_cluster(res_by_mode, results, cb_cmap, kind='pval', ncols=None, figsize_per_panel=(3.0, 2.8), dpi=150, sig_threshold=None)`

One subplot per cluster; each panel shows that cluster's P enrichment across all modes that contain it.

Parameters:

Name	Type	Description	Default
`res_by_mode`	`dict`	Output of `run_gs_enrichment`; each value contains a `"p_res"` dict with `"p_emp"` and `"z"` arrays of length K for that mode.	required
`results`	`ClumpplingResults`	Used to look up K per mode (via `mode_K` and `modes`).	required
`cb_cmap`	`list`	Per-cluster colour list; cluster k gets `cb_cmap[k]`.	required
`kind`	`('pval', 'zscore')`	`"pval"` plots -log10(p_emp); `"zscore"` plots the z-score. Default `"pval"`.	`"pval"`
`ncols`	`int or None`	Columns in the subplot grid. Defaults to K_max.	`None`
`figsize_per_panel`	`(float, float)`	Width × height for each individual subplot.	`(3.0, 2.8)`
`dpi`	`int`		`150`
`sig_threshold`	`float or None`	Threshold for the reference line. For `"pval"`, a horizontal line is drawn at -log10(sig_threshold); defaults to `0.05`. For `"zscore"`, lines are drawn at ±sig_threshold; defaults to `2`. Pass `None` to use the kind-appropriate default.	`None`

Returns:

Name	Type	Description
`fig`	`Figure`
`axes`	`np.ndarray of matplotlib.axes.Axes, shape (nrows, ncols)`
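
The `kind` and `sig_threshold` defaults can be summarized in a short sketch. The helper names are hypothetical, and `p_res` is assumed to be a dict with `"p_emp"` and `"z"` sequences as described for `res_by_mode`; empirical p-values are assumed to be strictly positive so that the logarithm is defined.

```python
import math


def reference_levels(kind="pval", sig_threshold=None):
    """Positions of the reference line(s) for the chosen kind."""
    if kind == "pval":
        thr = 0.05 if sig_threshold is None else sig_threshold
        return (-math.log10(thr),)  # single horizontal line
    if kind == "zscore":
        thr = 2.0 if sig_threshold is None else sig_threshold
        return (-thr, thr)  # lines at +/- threshold
    raise ValueError(f"unknown kind: {kind!r}")


def bar_heights(p_res, kind="pval"):
    """Values plotted per cluster: -log10(p_emp) or the z-score."""
    if kind == "pval":
        return [-math.log10(p) for p in p_res["p_emp"]]
    return list(p_res["z"])


p_res = {"p_emp": [0.1, 0.01], "z": [1.2, 2.8]}
heights = bar_heights(p_res, kind="pval")
```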

`plot_P_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap, kind='pval')`

Fill a mode-grid figure with per-cluster P enrichment bars.

Iterates over modes and calls either plot_P_enrichment_pval or plot_P_enrichment_zscore into the corresponding axes.

Parameters:

Name	Type	Description	Default
`res_by_mode`	`dict`	Output of `run_gs_enrichment`; must contain a `"p_res"` key for each mode.	required
`ax_by_mode`	`dict`	Mapping `mode -> matplotlib.axes.Axes`, e.g. from `make_mode_grid_by_K`.	required
`results`	`ClumpplingResults`	Used to look up K and generate cluster labels via `_mode_cluster_labels`.	required
`cb_cmap`	`list`	Per-cluster color list passed to `_mode_cluster_labels`.	required
`kind`	`('pval', 'zscore')`	Whether to plot empirical p-values or z-scores. Default `"pval"`.	`"pval"`

`plot_P_enrichment_heatmap(res_by_mode, results, value='z', sig_level=0.05, cmap='OrRd', center_zero=False, figsize=None, dpi=150, title=None, ax=None)`

Single heatmap of per-cluster P enrichment across all modes.

Rows = modes, columns = clusters (C1 … CK_max). Each cell shows the P enrichment z-score (value="z") or empirical p-value (value="p") for that cluster in that mode. Cells exceeding a mode's K are shown as NaN. Cells where p_emp < sig_level are annotated with *.

Parameters:

Name	Type	Description	Default
`res_by_mode`	`dict`	Output of `run_gs_enrichment`: `{mode: {p_res, ...}}`.	required
`results`		Clumppling results object with `.modes` and `.mode_K` attributes.	required
`value`	`('z', 'p')`	Which quantity to colour: z-score or empirical p-value.	`"z"`
`sig_level`	`float`	Significance threshold for * annotation (applied to p_emp).	`0.05`
`cmap`	`str`	Defaults to `"OrRd"` for z-score (one-sided enrichment); use `"coolwarm"` if you expect negative z-scores.	`'OrRd'`
`center_zero`	`bool`	Symmetric colour scale around 0. Default False (z-scores are typically positive for enrichment).	`False`
`figsize`	`tuple`		`None`
`title`	`str`		`None`

`plot_P_enrichment_pval(p_res, cluster_labels, colors, title='', figsize=(4, 4), dpi=150, ax=None)`

Bar chart of empirical p-values (-log10) per cluster.

`plot_P_enrichment_zscore(p_res, cluster_labels, colors, title='', figsize=(4, 4), dpi=150, ax=None)`

Bar chart of z-scores vs null per cluster.

`plot_gene_P_bars(P_gs, gene_set, cluster_labels, colors, top_n=None, gene_label_colors=None)`

Per-cluster waterfall bars showing each gene's loading within each cluster.

Genes are ranked by loading within each cluster and drawn as horizontal bars colored by cluster.

Parameters:

Name	Type	Description	Default
`P_gs`	`(ndarray, shape(n_gs, K))`	Gene-set rows of the aligned P matrix.	required
`gene_set`	`list of str`	Gene names corresponding to rows of `P_gs`.	required
`cluster_labels`	`list of str`	Labels for each cluster (columns of `P_gs`).	required
`colors`	`list of str`	One color per cluster used to fill the bars.	required
`top_n`	`int or None`	If set and `n_gs > top_n`, each cluster panel shows only the top-`top_n` genes by per-cluster P. Default `None` (show all).	`None`

Returns:

Name	Type	Description
`fig`	`Figure`
`axes`	`np.ndarray of matplotlib.axes.Axes`	Array of K axes, one per cluster.

`plot_gene_P_stacked(P_gs, gene_set, cluster_labels, gs_title='', log_scale=True, sort_by_sum=False, top_n=None, gene_colors=None, figsize=(6, 4), dpi=150)`

Stacked bar chart of per-gene P values across clusters.

Parameters:

Name	Type	Description	Default
`P_gs`	`ndarray`	Shape (n_gs, K). Gene-set rows of the P matrix.	required
`gene_set`	`list of str`	Gene names corresponding to rows of P_gs.	required
`cluster_labels`	`list of str`	Labels for each cluster (columns of P_gs).	required
`gs_title`	`str`	Title prefix for the plot. Default `""`.	`''`
`log_scale`	`bool`	If True, use a log y-axis. Default True.	`True`
`sort_by_sum`	`bool`	If True, sort clusters in descending order of total P sum. Default False.	`False`
`top_n`	`int or None`	If set and `n_gs > top_n`, restrict to the top-`top_n` genes by total P across clusters (before any cluster sorting). Default `None` (show all genes).	`None`
`gene_colors`	`list or None`	One color per gene (after any `top_n` subsetting). If `None` (default), colors are drawn from the `tab20` colormap.	`None`
`figsize`	`tuple of (float, float)`	Figure size in inches. Default `(6, 4)`.	`(6, 4)`
`dpi`	`int`	Figure resolution. Default 150.	`150`

Returns:

Name	Type	Description
`fig`	`Figure`
`ax`	`Axes`
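
A NumPy sketch of the `top_n` and `sort_by_sum` steps, applied in the order the parameter table describes (gene subsetting before cluster sorting). The function name is hypothetical, and returning the kept genes in descending order of total P is an assumption.

```python
import numpy as np


def subset_and_sort(P_gs, gene_set, cluster_labels, top_n=None, sort_by_sum=False):
    """Apply top_n gene subsetting, then optional cluster sorting."""
    P = np.asarray(P_gs, dtype=float)
    genes = list(gene_set)
    labels = list(cluster_labels)

    # Keep the top_n genes by total P across clusters.
    if top_n is not None and P.shape[0] > top_n:
        keep = np.argsort(-P.sum(axis=1), kind="stable")[:top_n]
        P = P[keep]
        genes = [genes[i] for i in keep]

    # Sort clusters in descending order of their total P.
    if sort_by_sum:
        order = np.argsort(-P.sum(axis=0), kind="stable")
        P = P[:, order]
        labels = [labels[j] for j in order]

    return P, genes, labels


P_gs = [[0.10, 0.20], [0.50, 0.40], [0.05, 0.30]]
P_sub, genes_sub, labels_sub = subset_and_sort(
    P_gs, ["g1", "g2", "g3"], ["C1", "C2"], top_n=2, sort_by_sum=True
)
```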

`plot_gene_lfc(df_gene_lfc, cluster_labels, sepL, sepH, gs_sepLFC, colors, figsize=(5, 3), dpi=150, ax=None, kind='mean', top_n=None, show_labels=None)`

Horizontal bar chart of per-gene LFC between high and low cluster groups.

Bars are colored on a diverging coolwarm scale centered on zero and saturated at ±gs_sepLFC.

Parameters:

Name	Type	Description	Default
`df_gene_lfc`	`DataFrame`	Output of `compute_gene_lfc` with columns `gene` and `LFC`.	required
`cluster_labels`	`list of str`	Not currently used; retained for API compatibility.	required
`sepL`	`list of int`	Cluster indices in the low group.	required
`sepH`	`list of int`	Cluster indices in the high group.	required
`gs_sepLFC`	`float`	Observed gene-set sepLFC; sets the colorbar saturation limits.	required
`colors`	`list of str`	Not currently used; retained for API compatibility.	required
`figsize`	`tuple of (float, float)`	Figure size in inches. Default `(5, 3)`.	`(5, 3)`
`dpi`	`int`	Figure resolution. Default 150.	`150`
`ax`	`Axes`	Draw into an existing axes if provided.	`None`
`kind`	`('extreme', 'mean')`	Determines the x-axis label: `"extreme"` labels min/max of group P; `"mean"` labels mean of group P. Default `"mean"`.	`"mean"`
`top_n`	`int or None`	If set and the number of genes exceeds `top_n`, restrict to the top-`top_n` genes by absolute LFC. Default `None` (show all).	`None`
`show_labels`	`bool or None`	Whether to draw gene-name tick labels on the y-axis. If `None` (default), labels are shown when `top_n` is set or when `n_genes <= 30`; hidden otherwise.	`None`

Returns:

Name	Type	Description
`fig`	`Figure`
`ax`	`Axes`
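
The `top_n` and `show_labels` rules can be written out as a small sketch. The helper names are hypothetical, and `n_genes` is passed explicitly because it is not stated whether the count is taken before or after `top_n` subsetting.

```python
def select_genes(lfc_by_gene, top_n=None):
    """Restrict to the top_n genes by absolute LFC when there are more than top_n."""
    genes = list(lfc_by_gene)
    if top_n is not None and len(genes) > top_n:
        genes = sorted(genes, key=lambda g: abs(lfc_by_gene[g]), reverse=True)[:top_n]
    return genes


def resolve_show_labels(show_labels, top_n, n_genes):
    """None means: show labels when top_n is set or there are at most 30 genes."""
    if show_labels is None:
        return top_n is not None or n_genes <= 30
    return show_labels


lfc_by_gene = {"a": 0.5, "b": -2.0, "c": 1.0}
top_two = select_genes(lfc_by_gene, top_n=2)
```

Ranking by absolute value keeps strongly negative genes such as `"b"` in the example alongside strongly positive ones.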

`plot_pairwise_heatmap(value_mat, sig_mat=None, labels=None, title=None, upper_only=True, cmap='coolwarm', center_zero=True, sig_level=0.05, figsize=(7, 6), dpi=150, ax=None)`

Heatmap of a KxK matrix with optional significance overlay.

`plot_pairwise_heatmap_bidir(upper_mat, lower_mat, upper_cmap='coolwarm', lower_cmap='PuOr', upper_sig=None, lower_sig=None, sig_level=0.05, labels=None, upper_label='', lower_label='', title=None, center_zero=True, figsize=(6, 5), dpi=150, ax=None)`

Heatmap with two KxK matrices split across upper and lower triangles.

Parameters:

Name	Type	Description	Default
`upper_mat`	`(K, K) array`	Values for the upper triangle.	required
`lower_mat`	`(K, K) array`	Values for the lower triangle.	required
`upper_cmap`	`str`	Colormap name for the upper triangle.	`'coolwarm'`
`lower_cmap`	`str`	Colormap name for the lower triangle.	`'PuOr'`
`upper_sig`	`(K, K) array`	p-value (or any criterion) matrix for the upper triangle; cells where value < sig_level are annotated with *.	`None`
`lower_sig`	`(K, K) array`	p-value (or any criterion) matrix for the lower triangle; cells where value < sig_level are annotated with *.	`None`
`sig_level`	`float`		`0.05`
`labels`	`list of str`		`None`
`upper_label`	`str`	Colorbar axis label for the upper triangle.	`''`
`lower_label`	`str`	Colorbar axis label for the lower triangle.	`''`
`center_zero`	`bool`	If True, color scale is symmetric around 0.	`True`
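
A NumPy sketch of the triangle split and the symmetric color limits implied by `center_zero=True`. The helper names are hypothetical, and leaving the diagonal blank is an assumption about the layout.

```python
import numpy as np


def split_triangles(upper_mat, lower_mat):
    """Mask each matrix so only its own triangle remains (diagonal excluded)."""
    upper = np.asarray(upper_mat, dtype=float)
    lower = np.asarray(lower_mat, dtype=float)
    K = upper.shape[0]
    upper_mask = np.triu(np.ones((K, K), dtype=bool), k=1)
    lower_mask = np.tril(np.ones((K, K), dtype=bool), k=-1)
    return np.where(upper_mask, upper, np.nan), np.where(lower_mask, lower, np.nan)


def symmetric_limits(values):
    """Color limits symmetric around 0, ignoring NaN cells."""
    vmax = float(np.nanmax(np.abs(values)))
    return -vmax, vmax


def star_mask(sig, sig_level=0.05):
    """Cells whose criterion value falls below sig_level get a * annotation."""
    return np.asarray(sig, dtype=float) < sig_level


upper_mat = [[0.0, 1.0, -3.0], [9.0, 0.0, 2.0], [9.0, 9.0, 0.0]]
lower_mat = [[0.0, 7.0, 7.0], [0.5, 0.0, 7.0], [-0.2, 0.1, 0.0]]
upper, lower = split_triangles(upper_mat, lower_mat)
upper_limits = symmetric_limits(upper)
```

Each masked matrix can then be drawn with its own colormap on the same axes, since the NaN cells of one are the visible cells of the other.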

`plot_per_cluster_P(P_gs, gene_set, cluster_labels, colors, null_mean_P=None, gs_title='', dpi=150)`

Super-figure with 1 + K subpanels.

Top row (1 panel spanning all columns): Scatter of mean gene-set P per cluster overlaid on boxplots of the null distribution (from sample_null_P), sorted by observed mean P descending. Each cluster is coloured accordingly. Y-axis is log scale. If null_mean_P is None, only the scatter is drawn.

Bottom row (K panels): Per-cluster waterfall plots (gene loadings, cumulative rectangles).

Parameters:

Name	Type	Description	Default
`P_gs`	`(ndarray, shape(n_genes, K))`	Gene-set rows of the aligned P matrix.	required
`gene_set`	`list of str`	Gene names corresponding to rows of P_gs.	required
`cluster_labels`	`list of str`	Labels for each cluster (columns of P_gs).	required
`colors`	`list of str`	One colour per cluster.	required
`null_mean_P`	`(ndarray, shape(n_perm, K))`	Null mean loading vectors from `sample_null_P`. If provided, a boxplot of the null distribution is drawn behind the scatter.	`None`
`gs_title`	`str`	Optional title prefix for the top panel.	`''`
`dpi`	`int`	Figure resolution. Default 150.	`150`

Returns:

Name	Type	Description
`fig`	`Figure`
`ax_top`	`Axes`
`axes_bottom`	`list of matplotlib.axes.Axes, length K`

`plot_sepLFC_distribution(df, gs_genes, title='', kind='auto', show_gs_textbox=True, gs_textbox_threshold=None, n_gs_textbox=10, show_non_gs_textbox=False, n_non_gs_textbox=10, textbox_fontsize=9, figsize=(6, 5), dpi=150, ax=None)`

Distribution of per-gene sepLFC, with gene-set genes highlighted.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Gene-indexed DataFrame with columns `sepLFC` and `rank_sepLFC` (output of `compute_all_feature_metrics` filtered to a sepCls). Should be pre-sorted descending by `sepLFC`.	required
`gs_genes`	`set or list of str`	Gene names belonging to the gene set of interest.	required
`title`	`str`	Axes title.	`''`
`kind`	`('auto', 'bar', 'hist')`	Chart type. `"auto"` (default) uses bar when `len(df) < 30`, hist otherwise.	`"auto"`
`show_gs_textbox`	`bool`	Show a text box listing gene-set genes with high sepLFC. Only rendered in hist mode. Default `True`.	`True`
`gs_textbox_threshold`	`float or None`	Minimum sepLFC for inclusion in the GS text box. When `None` (default) the top `n_gs_textbox` gene-set genes by rank are shown.	`None`
`n_gs_textbox`	`int`	Maximum number of gene-set genes in the GS text box (used when `gs_textbox_threshold` is `None`). Default 10.	`10`
`show_non_gs_textbox`	`bool`	Show a text box listing the top non-GS genes by sepLFC. Only rendered in hist mode. Default `False`.	`False`
`n_non_gs_textbox`	`int`	Number of top non-GS genes to list. Default 10.	`10`
`textbox_fontsize`	`int`	Font size for gene lines inside text boxes. The box title is rendered one point larger and bold. Default 9.	`9`
`figsize`	`tuple of (float, float)`	Figure size in inches. Default `(6, 5)`.	`(6, 5)`
`dpi`	`int`	Figure resolution. Default 150.	`150`
`ax`	`Axes`	Draw into an existing axes if provided.	`None`

Returns:

Name	Type	Description
`fig`	`Figure`
`ax`	`Axes`
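
The `kind="auto"` rule and the gene-set text box selection can be sketched as follows. The helper names are hypothetical; the input is simplified to a list of `(gene, sepLFC)` pairs already sorted descending, and treating the threshold as inclusive (`>=`) is an assumption.

```python
def resolve_kind(kind, n_genes):
    """"auto" picks bar for fewer than 30 genes, hist otherwise."""
    if kind == "auto":
        return "bar" if n_genes < 30 else "hist"
    return kind


def gs_textbox_genes(ranked, gs_genes, gs_textbox_threshold=None, n_gs_textbox=10):
    """Gene-set genes to list in the text box.

    ranked: list of (gene, sepLFC) pairs, pre-sorted descending by sepLFC.
    """
    gs = set(gs_genes)
    in_gs = [(g, v) for g, v in ranked if g in gs]
    if gs_textbox_threshold is not None:
        # Threshold mode: every gene-set gene at or above the threshold.
        return [g for g, v in in_gs if v >= gs_textbox_threshold]
    # Rank mode: the first n_gs_textbox gene-set genes.
    return [g for g, _ in in_gs[:n_gs_textbox]]


ranked = [("g1", 3.0), ("x1", 2.5), ("g2", 2.0), ("g3", 0.5), ("x2", 0.1)]
by_rank = gs_textbox_genes(ranked, {"g1", "g2", "g3"}, n_gs_textbox=2)
by_threshold = gs_textbox_genes(ranked, {"g1", "g2", "g3"}, gs_textbox_threshold=1.0)
```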

`plot_sepLFC_distribution_heatmap(res_by_mode, n_bins=60, cmap='Blues', figsize=(10, 0.45), dpi=150, kind='null_sep', pval_threshold=None, annotate_pval=False)`

Heatmap summary of the null sepLFC distribution across all modes.

Each row is one mode; colour encodes the density of the null distribution in each histogram bin. The observed gene-set sepLFC is overlaid as a red dot on each row, making enrichment strength and consistency across modes visible at a glance.

Parameters:

Name	Type	Description	Default
`res_by_mode`	`dict`	Output of `run_gs_enrichment`; each value must contain a `"sep_res"` dict.	required
`n_bins`	`int`	Number of histogram bins shared across all modes. Default `60`.	`60`
`cmap`	`str or Colormap`	Colormap for the density heatmap. Default `"Blues"`.	`'Blues'`
`figsize`	`(float, float)`	`(width, height_per_row)`; total figure height is `height_per_row × n_modes`.	`(10, 0.45)`
`dpi`	`int`		`150`
`kind`	`('null_sep', 'null_fixed')`	Which null distribution to display. `"null_sep"` (default) uses the best-sepLFC null (`null_sepLFC`). `"null_fixed"` uses the fixed-cluster-group null (`null_lfc_at_sep`).	`"null_sep"`
`pval_threshold`	`float or None`	If set to a value `> 0`, the empirical one-sided p-value (fraction of null ≥ observed) is computed for each mode. When the p-value is below `pval_threshold` the observed point is shown as a star (`*`) instead of a dot. Set to `<= 0` (or leave as `None`) to always show a dot regardless of significance.	`None`
`annotate_pval`	`bool`	If `True`, the empirical p-value is printed in scientific notation to the right of each observed dot. The x-axis right limit is expanded automatically so the text stays within the frame. P-values are computed whenever `annotate_pval` is `True`, regardless of `pval_threshold`; if `pval_threshold` is `None` or `<= 0`, only the star logic is skipped. Default `False`.	`False`

Returns:

Name	Type	Description
`fig`	`Figure`
`ax`	`Axes`
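
The empirical p-value and marker choice described for `pval_threshold` can be sketched in plain Python. The helper names are hypothetical, and the p-value is taken literally as the fraction of null values at or above the observed one, with no pseudo-count; whether the function applies one is not stated.

```python
def empirical_pvalue(null_values, observed):
    """One-sided empirical p-value: fraction of null values >= observed."""
    null_values = list(null_values)
    return sum(v >= observed for v in null_values) / len(null_values)


def observed_marker(p_value, pval_threshold=None):
    """Star when significant under a positive threshold, dot otherwise."""
    if pval_threshold is not None and pval_threshold > 0 and p_value < pval_threshold:
        return "*"
    return "o"


null_sepLFC = [0.1, 0.2, 0.3, 0.9]
p = empirical_pvalue(null_sepLFC, observed=0.25)
marker = observed_marker(p, pval_threshold=0.05)
```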

`plot_sepLFC_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap, kind='null_sep')`

Fill a mode-grid figure with sepLFC null-distribution plots.

Iterates over modes and calls either plot_sepLFC_null_sep or plot_sepLFC_null_fixed into the corresponding axes.

Parameters:

Name	Type	Description	Default
`res_by_mode`	`dict`	Output of `run_gs_enrichment`; must contain a `"sep_res"` key for each mode.	required
`ax_by_mode`	`dict`	Mapping `mode -> matplotlib.axes.Axes`.	required
`results`	`ClumpplingResults`	Used to look up K and generate cluster labels.	required
`cb_cmap`	`list`	Per-cluster color list (passed to `_mode_cluster_labels`).	required
`kind`	`('null_sep', 'null_fixed')`	Which null comparison to visualize. `"null_sep"` compares to each random set's own best sepLFC; `"null_fixed"` compares to the null evaluated at the observed bipartition. Default `"null_sep"`.	`"null_sep"`

`plot_sepLFC_null_fixed(sep_res, cluster_labels, title='', figsize=(5, 4), dpi=150, ax=None)`

Histogram of null LFC at the gene-set's fixed cluster groups, with gene-set value marked.

`plot_sepLFC_null_sep(sep_res, title='', figsize=(5, 4), dpi=150, ax=None)`

Histogram of null best-sepLFC per random set, with gene-set value marked.

`plot_seplfc_bipartite(gene_set, sepH, sepL, cluster_labels, df_mode, top_n_per_pair=5, gs_title='', lw_scale=10.0, min_lw=0.5, seg_gap=0.008, label_mode='auto', label_fontsize=7.0, arrow_fan=0.06, cmap='Spectral', vmin=None, vmax=None, colors=None, figsize=(8, 4), dpi=150, ax=None)`

Bipartite diagram where each gene's single segment sits on the one edge that corresponds to its own best cluster separation (sepLFC in df_mode).

Unlike the older bipartite approach that assigns every gene to every H-L edge based on P_gs, this function:

- For each gene, finds the boundary pair (A, B) where A is the lowest-P cluster in the gene's upper group and B is the highest-P cluster in the gene's lower group (adjacent clusters across the max gap).
- Keeps only genes whose boundary pair has A ∈ sepH and B ∈ sepL.
- On each edge (A, B), selects the top top_n_per_pair genes by their df_mode.sepLFC value.
- Draws each selected gene as a single segment on its assigned edge.

All arrows point from sepH toward sepL (LFC is always positive by construction).
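
The selection steps above can be sketched under some loud assumptions: per-gene P rows are supplied directly (the real function works from `df_mode`), the "max gap" is measured on log2 P between clusters sorted by P, and all P values are positive. The helper names are hypothetical.

```python
import math
from collections import defaultdict


def boundary_pair(p_row):
    """Return (A, B): the clusters on either side of the largest log2-P gap.

    A is the lowest-P cluster of the upper group, B the highest-P cluster
    of the lower group. Requires at least two clusters and positive P.
    """
    order = sorted(range(len(p_row)), key=lambda k: p_row[k], reverse=True)
    logs = [math.log2(p_row[k]) for k in order]
    gaps = [logs[i] - logs[i + 1] for i in range(len(logs) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    return order[cut], order[cut + 1]


def select_edge_genes(p_by_gene, sep_lfc, sepH, sepL, top_n_per_pair=5):
    """Assign each gene to its boundary edge, then keep the top genes per edge."""
    edges = defaultdict(list)
    for gene, p_row in p_by_gene.items():
        a, b = boundary_pair(p_row)
        if a in sepH and b in sepL:  # keep only H -> L edges
            edges[(a, b)].append(gene)
    return {
        edge: sorted(genes, key=lambda g: sep_lfc[g], reverse=True)[:top_n_per_pair]
        for edge, genes in edges.items()
    }


p_by_gene = {
    "g1": [0.8, 0.4, 0.05],
    "g2": [0.9, 0.05, 0.04],
    "g3": [0.6, 0.3, 0.01],
}
sep_lfc = {"g1": 3.0, "g2": 4.2, "g3": 4.9}
selected = select_edge_genes(p_by_gene, sep_lfc, sepH=[0, 1], sepL=[2], top_n_per_pair=1)
```

In this toy input `g2` is dropped because its boundary pair is (0, 1) and cluster 1 is not in `sepL`, while `g1` and `g3` compete for the single slot on edge (1, 2).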

Parameters:

Name	Type	Description	Default
`gene_set`	`list of str`	Gene names (must be a subset of `df_mode.index`).	required
`sepH`	`list of int`	Cluster indices in the high group (top row nodes).	required
`sepL`	`list of int`	Cluster indices in the low group (bottom row nodes).	required
`cluster_labels`	`list of str`	Labels for all K clusters.	required
`df_mode`	`DataFrame`	Feature-metrics DataFrame (from `compute_feature_metrics` / `compute_all_feature_metrics`) indexed by gene name, with columns `sepLFC` and `sepCls`.	required
`top_n_per_pair`	`int`	Maximum number of genes to show per (sepH, sepL) edge. Default 5.	`5`
`gs_title`	`str`	Title prefix.	`''`
`lw_scale`	`float`	Maximum line width (for the edge with the largest total sepLFC).	`10.0`
`min_lw`	`float`	Minimum line width.	`0.5`
`seg_gap`	`float`	Gap at each end of every gene segment (in t-space). Default 0.008.	`0.008`
`label_mode`	`str`	Controls gene-name labels on segments. One of: `"all"` – label every segment regardless of size. `"auto"` – label only segments whose fraction of the edge total exceeds `1 / top_n_per_pair` (i.e. roughly the equal-share threshold). Default. `"none"` – suppress all labels.	`'auto'`
`label_fontsize`	`float`	Font size for gene-name labels. Default 7.0.	`7.0`
`arrow_fan`	`float`	Half-width (in data units) of the fan applied at each node so arrowheads from different edges spread out. Default 0.06.	`0.06`
`cmap`	`str`	Matplotlib colormap name used to color segments by `sepLFC`. Default `"Spectral"`.	`'Spectral'`
`vmin`	`float or None`	Colormap range. `None` → inferred from the selected genes' sepLFC values.	`None`
`vmax`	`float or None`	Colormap range. `None` → inferred from the selected genes' sepLFC values.	`None`
`colors`	`list or None`	Per-cluster colors for node fill (indexed by cluster index).	`None`
`figsize`	`tuple of (float, float)`	Figure size in inches. Default `(8, 4)`.	`(8, 4)`
`dpi`	`int`	Figure resolution. Default 150.	`150`
`ax`	`Axes`	Draw into an existing axes if provided.	`None`

Returns:

Name	Type	Description
`fig`	`Figure`
`ax`	`Axes`

`plot_top_pairwise_df(df, value_col='z', sig_col='q', alpha=0.05, top_n=-1, labels=None, sort_by='q', figsize=(8, 6), dpi=150, ax=None)`

Dot plot of top cluster pairs sorted by significance or effect size.