Plotting
Clustering Results
plot.py
Functions for visualizations.
Functions
get_kde_outliers(df, x_col, y_col, *, min_x=0.0, levels=8, cut=0, top_n=None, scale='zscore', return_mask=False)
KDE-based outlier detection, with optional ranking of top_n most extreme points.
Outlier definition:
- Fit a 2D KDE on (x_col, y_col) for eligible points.
- Flag points that fall outside the outermost contour.
Ranking (when top_n is not None):
- Compute distance in optionally scaled (x, y) space.
- scale="zscore": standardize by mean and standard deviation.
- scale="robust": standardize by median and IQR.
- scale="none": use raw (x, y).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
Must contain columns |
required |
x_col
|
str
|
Column names in df to use as x and y axes. |
required |
y_col
|
str
|
Column names in df to use as x and y axes. |
required |
min_x
|
float or None
|
Minimum x value for eligibility; points with x <= min_x are ignored. If None, all finite points are eligible. |
0.0
|
levels
|
int
|
Number of KDE contour levels. |
8
|
cut
|
float
|
KDE cut parameter (see seaborn.kdeplot). |
0
|
top_n
|
int or None
|
If not None, return only the top_n most extreme outliers. |
None
|
scale
|
('none', 'zscore', 'robust')
|
Scaling method for distance computation when ranking outliers. |
'zscore'
|
return_mask
|
bool
|
If True, also return a boolean mask aligned to df.index indicating outlier status. |
False
|
Returns:
| Type | Description |
|---|---|
outliers_df
|
|
mask(optional)
|
|
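The ranking step can be sketched roughly as follows. This is an illustration, not the function's source: `scale_xy` and `rank_extreme` are hypothetical names, and measuring Euclidean distance from the origin of the scaled space (the mean or median after scaling, the raw origin for "none") is an assumption, since the docstring only says that distance is computed in the scaled (x, y) space.

```python
import numpy as np

def scale_xy(xy, scale="zscore"):
    """Scale an (n, 2) array column-wise (hypothetical helper)."""
    xy = np.asarray(xy, dtype=float)
    if scale == "none":
        return xy.copy()
    if scale == "zscore":
        center = xy.mean(axis=0)
        spread = xy.std(axis=0)
    elif scale == "robust":
        center = np.median(xy, axis=0)
        q75, q25 = np.percentile(xy, [75, 25], axis=0)
        spread = q75 - q25  # interquartile range per column
    else:
        raise ValueError(f"unknown scale: {scale!r}")
    spread = np.where(spread == 0, 1.0, spread)  # guard against constant columns
    return (xy - center) / spread

def rank_extreme(xy, top_n, scale="zscore"):
    """Row positions of the top_n points farthest from the scaled origin (assumed metric)."""
    dist = np.linalg.norm(scale_xy(xy, scale), axis=1)
    return np.argsort(-dist)[:top_n]

points = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 50.0]])
top = rank_extreme(points, top_n=1, scale="robust")
```

The robust variant is less affected by the extreme points it is trying to rank, which may be why it is offered alongside the z-score.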
in_outer_contour(x, y, paths)
Return True if (x, y) lies inside ANY of the given matplotlib.path.Path objects.
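A check of this kind can be written with `Path.contains_point`. The sketch below is a stand-in with a hypothetical name (`inside_any`), not the library's implementation; the two square paths are made-up test data.

```python
from matplotlib.path import Path

def inside_any(x, y, paths):
    """True if (x, y) falls inside at least one Path (illustrative stand-in)."""
    return any(p.contains_point((x, y)) for p in paths)

# Two closed squares used only for demonstration.
square = Path([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)])
far_square = Path([(5, 5), (6, 5), (6, 6), (5, 6), (5, 5)])
```

With these paths, `inside_any(0.5, 0.5, [square, far_square])` is true and `inside_any(3, 3, [square, far_square])` is false; an empty `paths` list always yields false.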
make_mode_grid(modes, *, n_cols=4, panel_size=(4.0, 2.5), dpi=150)
Create a figure + gridspec layout for a list of modes, returning a dict {mode_name: ax}.
- Rows/cols computed from len(modes) and n_cols.
- panel_size gives (width, height) in inches per cell.
Example usage:
fig, ax_by_mode = make_mode_grid(modes, n_cols=4)
for mode in modes:
    plot_mode_P_profile(results, mode, ax=ax_by_mode[mode])
fig.tight_layout()
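The row and column arithmetic behind such a grid can be sketched as below. `grid_shape` and `cell_of` are hypothetical helpers; row-major filling and shrinking the column count when there are fewer modes than n_cols are assumptions not stated in the docstring.

```python
import math

def grid_shape(n_modes, n_cols=4, panel_size=(4.0, 2.5)):
    """Return (n_rows, n_cols_used, figsize) for a row-major grid of panels."""
    if n_modes < 1:
        raise ValueError("need at least one mode")
    n_cols_used = min(n_cols, n_modes)       # assumption: no empty columns
    n_rows = math.ceil(n_modes / n_cols)
    figsize = (panel_size[0] * n_cols_used, panel_size[1] * n_rows)
    return n_rows, n_cols_used, figsize

def cell_of(i, n_cols=4):
    """(row, col) of the i-th mode when cells are filled row by row."""
    return divmod(i, n_cols)
```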
make_mode_grid_by_K(results, *, modes=None, panel_size=(3.0, 2.5), dpi=150)
Create a figure whose axes layout matches plot_Q_grid:
- Rows correspond to distinct K values (sorted).
- Within each row, columns correspond to modes with that K,
in the order of `modes` (or results.modes if None).
- Returns a mapping {mode_name: ax} for the cells actually used.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
results
|
ClumpplingResults
|
Must have Q_by_mode populated. |
required |
modes
|
sequence of str
|
If provided, only these modes are laid out (in this order). Otherwise use results.modes. |
None
|
panel_size
|
(float, float)
|
(width, height) in inches per panel. |
(3.0, 2.5)
|
dpi
|
int
|
|
150
|
Returns:
| Name | Type | Description |
|---|---|---|
fig |
Figure
|
|
axes_by_mode |
dict
|
Mapping mode_name -> Axes in the grid. |
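The rows-by-K layout can be sketched with plain dictionaries. `layout_by_K` is a hypothetical helper and the `mode_K` mapping (mode name to K) is assumed input; the real function works on a ClumpplingResults object and returns Axes rather than cell coordinates.

```python
from collections import defaultdict

def layout_by_K(mode_K, modes=None):
    """Map each mode name to a (row, col) cell, one row per distinct K."""
    order = list(mode_K) if modes is None else list(modes)
    by_K = defaultdict(list)
    for mode in order:
        by_K[mode_K[mode]].append(mode)   # keeps the given mode order within a row
    cells = {}
    for row, K in enumerate(sorted(by_K)):
        for col, mode in enumerate(by_K[K]):
            cells[mode] = (row, col)
    return cells

example = layout_by_K({"K3M1": 3, "K2M1": 2, "K3M2": 3})
```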
plot_P_profile(P_sorted, LFC_sorted, ax=None, title='', lw=0.2)
Plot sorted log2(P) along cluster index, coloring each gene's curve by the argmax of its LFC profile.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
P_sorted
|
ndarray
|
(M, K) array of sorted P values per gene. |
required |
LFC_sorted
|
ndarray
|
(M, K) array of log fold change values per gene. |
required |
ax
|
Axes
|
If given, draw into this Axes. |
None
|
title
|
str
|
Title for the plot. |
''
|
lw
|
float
|
Line width for each gene's curve. |
0.2
|
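The data preparation implied by the description can be sketched as follows. `profile_curves` is a hypothetical name, and clipping P at a small `eps` before taking the logarithm is an assumption added to keep the example finite.

```python
import numpy as np

def profile_curves(P_sorted, LFC_sorted, eps=1e-12):
    """Return (log2 P curves, per-gene colour index) for (M, K) inputs."""
    P_sorted = np.asarray(P_sorted, dtype=float)
    LFC_sorted = np.asarray(LFC_sorted, dtype=float)
    log_p = np.log2(np.clip(P_sorted, eps, None))  # eps avoids log2(0); an assumption
    color_idx = LFC_sorted.argmax(axis=1)          # one cluster index per gene
    return log_p, color_idx
```

Each row of `log_p` would then be drawn as one thin line (width `lw`) against the cluster index, using the colour selected by `color_idx`.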
plot_Q_grid(results, *, sort_by='max', cmap=None, figsize=None, n_ticks=8)
Plot Q heatmaps for all modes in a grid, using results.mode_names_list as layout (rows by K, columns by mode within each K), with a single shared colorbar on the right.
plot_Q_heatmap(results, mode_name, *, sort_by='max', cmap=None, colorbar=True, ax=None)
Plot a heatmap of Q for a single mode.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
results
|
ClumpplingResults
|
Container with aligned Q matrices. |
required |
mode_name
|
str
|
Mode to plot (must be a key in results.Q_by_mode). |
required |
sort_by
|
('max', 'none')
|
If "max", sort individuals by their max cluster membership. If "none", keep original row order. |
"max"
|
cmap
|
str or Colormap
|
Colormap to use in imshow (e.g. "viridis", "plasma"). |
None
|
colorbar
|
bool
|
Whether to add a colorbar for this subplot. |
True
|
ax
|
matplotlib Axes
|
If provided, draw into this axes; otherwise create a new Figure. |
None
|
Returns:
| Type | Description |
|---|---|
(fig, ax)
|
|
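One plausible reading of sort_by="max" is sketched below: rows are grouped by their dominant cluster and ordered by decreasing membership within each group. This interpretation and the helper name `sort_rows_by_max` are assumptions; the library may order rows differently.

```python
import numpy as np

def sort_rows_by_max(Q):
    """Row order grouping individuals by dominant cluster, strongest first."""
    Q = np.asarray(Q, dtype=float)
    dominant = Q.argmax(axis=1)
    strength = Q.max(axis=1)
    # np.lexsort treats the LAST key as the primary sort key.
    return np.lexsort((-strength, dominant))

Q = np.array([[0.2, 0.8], [0.9, 0.1], [0.4, 0.6], [0.6, 0.4]])
order = sort_rows_by_max(Q)
```

`Q[order]` is the matrix that would then be passed to imshow.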
plot_cluster_bars(results, mode_name, colors=None, *, ax=None)
Plot bar chart of total membership per cluster for a given mode.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
results
|
ClumpplingResults
|
|
required |
mode_name
|
str
|
Mode to plot. |
required |
ax
|
Axes
|
If given, draw into this Axes. |
None
|
Returns:
| Type | Description |
|---|---|
(fig, ax)
|
|
plot_cluster_in_grid(results, coords, mode_name, cluster_index, *, cmap=None, xlabel='Dim 1', ylabel='Dim 2', base_size=5.0, size_scale=20.0, figsize=None, colorbar=True, **scatter_kwargs)
Plot membership for a single (mode, cluster) in the full grid layout
where rows = modes and columns = clusters (0..K_max-1), using
results.mode_sep_coord_dict to place that cluster in the correct cell.
All other cells are left empty / invisible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
results
|
ClumpplingResults
|
|
required |
coords
|
(array, shape(n_samples, 2))
|
2D coordinates (UMAP, t-SNE, etc.). |
required |
mode_name
|
str
|
Mode name, must be present in results.mode_sep_coord_dict keys. |
required |
cluster_index
|
int
|
Cluster index (column in Q) for that mode. |
required |
cmap
|
str or Colormap
|
Colormap for membership intensity. |
None
|
xlabel
|
str
|
Axis labels for the occupied cell. |
'Dim 1'
|
ylabel
|
str
|
Axis labels for the occupied cell. |
'Dim 2'
|
base_size
|
float
|
Base point size. |
5.0
|
size_scale
|
float
|
Additional scale times membership value. |
20.0
|
figsize
|
tuple
|
Figure size for the full grid. |
None
|
colorbar
|
bool
|
Whether to draw a colorbar for the occupied cell. |
True
|
**scatter_kwargs
|
Extra kwargs passed to |
{}
|
Returns:
| Type | Description |
|---|---|
(fig, axes)
|
Figure and 2D axes array for the full grid. |
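The base_size and size_scale descriptions suggest that marker size grows linearly with membership. A minimal sketch under that assumption (`marker_sizes` is a hypothetical name):

```python
import numpy as np

def marker_sizes(Q, cluster_index, base_size=5.0, size_scale=20.0):
    """Per-point marker size: base size plus a term proportional to membership."""
    membership = np.asarray(Q, dtype=float)[:, cluster_index]
    return base_size + size_scale * membership
```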
plot_cluster_overlay(results, coords, *, cluster_colors=None, val_threshold=0.5, s=0.05, alpha=0.6, vmin=0.0, vmax=1.0, figsize=None, dpi=150, suptitle=None, suptitle_kwargs=None)
Overlay membership for all clusters within each mode, on a mode-grid:
rows = K values (in results.K_range order)
cols = modes within each K (using results.mode_coord_dict)
Each axis shows all clusters for that mode, with different base colors.
plot_cluster_panels(results, coords, *, cluster_colors=None, val_threshold=0.0, s=1.0, alpha=1.0, vmin=0.0, vmax=1.0, figsize=None, dpi=150, suptitle=None, suptitle_kwargs=None)
Plot membership on 2D coords for each (mode, cluster) in a grid:
rows = modes (in results.modes order)
cols = cluster index 0..K_max-1
using results.mode_sep_coord_dict to place each (mode, cluster).
Each cell contains ONE cluster's membership (white→cluster_color).
plot_cluster_scatter(coords, cluster_labels, *, cmap=None, colorbar=True, xlabel='Dim 1', ylabel='Dim 2', title=None, ax=None, max_colorbar_ticks=8, **scatter_kwargs)
Scatter plot of 2D coordinates colored by (discrete) cluster labels.
plot_feature_across_modes(df_pvs_modes, modes, selected_feature, custom_color_dict, *, x_col='weighted_Psum', y_col='sepLFC', sep_col='sepCls', xlim=None, ylim=None, figsize=(3.5, 4), dpi=150, legend_loc='upper right', legend_bbox_to_anchor=(0.0, 0.9), style_label=None, ax=None)
For a focal gene, collect (weighted_Psum, sepLFC, sepCls) across modes and make the scatter-with-labels plot in one shot.
plot_feature_bar(df, *, mode_name=None, metric='weighted_Psum', top_n=20, ax=None)
Bar plot of top-N features by a given metric (e.g. weighted_Psum).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
Index = feature names, must contain column |
required |
mode_name
|
str
|
For titling. |
None
|
metric
|
str
|
|
"weighted_Psum"
|
top_n
|
int
|
Number of top features to show. |
20
|
ax
|
Axes
|
|
None
|
Returns:
| Type | Description |
|---|---|
(fig, ax)
|
|
plot_feature_kde(df, x_col, y_col, outlier_mask, *, mode_name=None, label_col=None, levels=8, cmap='viridis_r', bg_point_size=10.0, bg_alpha=0.1, outlier_point_size=30.0, outlier_alpha=0.85, x_pad_frac=0.02, y_pad_frac=0.05, min_x_pad=0.005, min_y_pad=1.0, adjust_text_kwargs=None, ax=None, dpi=150)
Plot a scatter + filled KDE contour + labeled outlier points for a (x, y) feature pair.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
Must contain columns |
required |
x_col
|
str
|
Column names in df to use as x and y axes. |
required |
y_col
|
str
|
Column names in df to use as x and y axes. |
required |
outlier_mask
|
ndarray(bool)
|
Boolean mask aligned to df.index indicating which points to label as outliers. |
required |
mode_name
|
str
|
For titling; purely cosmetic. |
None
|
label_col
|
(str, optional)
|
Column name in df to use for outlier labels; if None, use df.index. |
None
|
levels
|
int
|
Number of KDE contour levels. |
8
|
cmap
|
str
|
Colormap for filled KDE contours. |
"viridis_r"
|
bg_point_size
|
float
|
Size of background scatter points. |
10.0
|
bg_alpha
|
float
|
Alpha for background scatter points. |
0.1
|
outlier_point_size
|
float
|
Size of outlier scatter points. |
30.0
|
outlier_alpha
|
float
|
Alpha for outlier scatter points. |
0.85
|
x_pad_frac
|
float
|
Fractional padding added to the x-axis limits. |
0.02
|
y_pad_frac
|
float
|
Fractional padding added to the y-axis limits. |
0.05
|
min_x_pad
|
float
|
Minimum padding added to the x-axis limits. |
0.005
|
min_y_pad
|
float
|
Minimum padding added to the y-axis limits. |
1.0
|
adjust_text_kwargs
|
dict
|
Additional keyword arguments to pass to adjust_text. |
None
|
ax
|
Axes
|
Matplotlib Axes to plot on; if None, a new figure and axes are created. |
None
|
dpi
|
int
|
Resolution of the figure in dots per inch. |
150
|
Returns:
| Type | Description |
|---|---|
(fig, ax)
|
|
plot_feature_metrics(df_mode, mode_name, x_col='weighted_Psum', y_col='sepLFC', sep_col='sepCls', annot_mask=None, xmax=None, ymax=None, custom_color_dict=None)
Scatter plot of feature metrics for a given mode, colored by separating class pattern.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df_mode
|
DataFrame
|
DataFrame containing feature metrics for the mode. Must include 'sepCls', x_col, and y_col. |
required |
mode_name
|
str
|
Name of the mode (for title). |
required |
x_col
|
str
|
Column name for x-axis metric (default is 'weighted_Psum'). |
'weighted_Psum'
|
y_col
|
str
|
Column name for y-axis metric (default is 'sepLFC'). |
'sepLFC'
|
annot_mask
|
Series or None
|
Boolean mask for annotating points (default is None). |
None
|
xmax
|
float or None
|
Maximum x-axis limit (default is None, which auto-scales). |
None
|
ymax
|
float or None
|
Maximum y-axis limit (default is None, which auto-scales). |
None
|
custom_color_dict
|
dict or None
|
Custom color dictionary for 'sepCls' categories (default is None). |
None
|
Returns:
| Type | Description |
|---|---|
None
|
|
plot_feature_scatter(df, *, mode_name=None, x='weighted_Psum', y='sepLFC', highlight=None, ax=None)
Scatter plot of feature metrics, e.g. weighted_Psum vs sepLFC.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
Must contain columns |
required |
mode_name
|
str
|
For titling; purely cosmetic. |
None
|
x
|
str
|
Column names in df to use as axes. |
'weighted_Psum'
|
y
|
str
|
Column names in df to use as axes. |
'sepLFC'
|
highlight
|
iterable of str
|
Feature names (index values) to annotate. |
None
|
ax
|
Axes
|
|
None
|
Returns:
| Type | Description |
|---|---|
(fig, ax)
|
|
plot_feature_sepLFC_across_modes(res_model, df_pvs_modes, selected_feature, feature_names, colors, *, label_rank=True, dpi=150, ax=None)
Horizontal bar plot of sepLFC for a focal gene across all modes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
res_model
|
ClumpplingResults-like
|
Object with attributes: modes (list of mode names); mode_K (dict[mode_name -> K]); P_aligned_by_mode (dict[mode_name -> P matrix]; not used, but available). |
required |
df_pvs_modes
|
Mapping[str, 'pd.DataFrame']
|
Dict mapping mode_name -> DataFrame with columns ['sepLFC', 'sepCls'].
Row order must align with |
required |
selected_feature
|
str
|
Feature name to plot. |
required |
feature_names
|
Sequence[str]
|
Sequence of all feature names; selected_feature must be in this list. |
required |
colors
|
Sequence
|
Sequence of colors indexed by cluster index (0-based). |
required |
label_rank
|
bool
|
If True, annotate each bar with the rank of the focal gene by sepLFC. |
True
|
dpi
|
int
|
Figure DPI. |
150
|
ax
|
Optional[Axes]
|
Optional existing Axes to plot into. |
None
|
Returns:
| Type | Description |
|---|---|
(fig, ax)
|
Matplotlib Figure and Axes. |
plot_mode_P_profile(results, mode_name, ax=None, title=None, lw=0.2)
For a single mode, compute the clustering profile and plot sorted log P.
plot_sepLFC_dist(results, mode_name, *, lfc_threshold=10.0, ax=None, title=None)
For a single mode, plot distribution of sepLFC by 'how many clusters are separated' (index of sorted cluster before the sepLFC gap).
plot_sepLFC_labels(df_selected, modes, *, sepLFC_threshold=0.0, cmap='Reds', vmin=1e-05, vmax=None, y_max=40.0, hi_sepLFC_threshold=32.0, n_top_hi=15, n_top_lo=8, figsize_scale=0.95, dpi=150)
For each mode in modes, plot:
- a vertical axis of sepLFC values,
- a rug plot of all genes with sepLFC > sepLFC_threshold,
- labeled horizontal lines for the top sepLFC genes, colored by weighted_Psum,
- all panels share a single horizontal colorbar (weighted_Psum) on top.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df_selected
|
DataFrame
|
Wide table with columns like: - weighted_Psum_{mode_name} - sepLFC_{mode_name} and index = gene IDs. |
required |
modes
|
sequence of str
|
Mode names used to derive the column suffixes. |
required |
sepLFC_threshold
|
float
|
Only genes with sepLFC > threshold are included per mode. |
0.0
|
cmap
|
str
|
Colormap used to encode weighted_Psum. |
"Reds"
|
vmin
|
float
|
For LogNorm. If vmax is None, it's computed from df_selected across all modes and sepLFC > sepLFC_threshold. |
1e-05
|
vmax
|
float
|
For LogNorm. If vmax is None, it's computed from df_selected across all modes and sepLFC > sepLFC_threshold. |
None
|
y_max
|
float
|
Upper limit of the y-axis; also used in label positioning logic. |
40.0
|
hi_sepLFC_threshold
|
float
|
If the top sepLFC in a mode exceeds this, up to |
32.0
|
n_top_hi
|
int
|
See above. |
15
|
n_top_lo
|
int
|
See above. |
8
|
figsize_scale
|
float
|
Scale factor for figure width: width = figsize_scale * len(modes). |
0.95
|
dpi
|
int
|
|
150
|
Returns:
| Name | Type | Description |
|---|---|---|
fig |
Figure
|
|
axes_by_mode |
dict
|
Mapping mode_name -> Axes for that panel. |
plot_spatial_membership(Q, coords, ref_color, *, cls_idx=0, ax=None, val_threshold=0.0, vmin=0.0, vmax=1.0, s=1.0, alpha=1.0, title=None, keep_ticks=False)
Plot a single colored scatter layer of 2D coordinates weighted by membership.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
Q
|
array-like
|
Either an (n_cells, K) membership matrix, or an (n_cells,) vector. |
required |
coords
|
array-like
|
Either: - (n_cells, 2) array of [x, y] coordinates, or - tuple (x, y) of 1D arrays. |
required |
ref_color
|
color spec
|
Base color for the membership colormap (e.g. cmap(k), 'tab:blue', (r,g,b)). |
required |
cls_idx
|
int
|
If Q is (n_cells, K), which column to use as membership. Ignored if Q is 1D. |
0
|
ax
|
Axes
|
Existing axes to draw on. If None, a new figure and axes are created. |
None
|
val_threshold
|
float
|
Only plot points with membership > val_threshold. |
0.0
|
vmin
|
float
|
Range of membership values for colormap normalization. |
0.0
|
vmax
|
float
|
Range of membership values for colormap normalization. |
1.0
|
s
|
float
|
Marker size. |
1.0
|
alpha
|
float
|
Marker alpha. |
1.0
|
title
|
str
|
Title for the axis (only set if not None). |
None
|
keep_ticks
|
bool
|
If False (default), remove x/y ticks. |
False
|
Returns:
| Name | Type | Description |
|---|---|---|
ax |
Axes
|
|
sp |
PathCollection
|
The scatter object. |
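The input handling described in the table (2D or 1D Q, array or tuple coords, threshold filter) can be sketched as below. `select_membership` is a hypothetical helper that stops short of plotting; the returned values would feed a scatter call with a white-to-ref_color colormap.

```python
import numpy as np

def select_membership(Q, coords, cls_idx=0, val_threshold=0.0):
    """Return (x, y, values) for points whose membership exceeds the threshold."""
    Q = np.asarray(Q, dtype=float)
    values = Q[:, cls_idx] if Q.ndim == 2 else Q   # cls_idx is ignored for 1D input
    if isinstance(coords, tuple):
        x, y = (np.asarray(c, dtype=float) for c in coords)
    else:
        xy = np.asarray(coords, dtype=float)
        x, y = xy[:, 0], xy[:, 1]
    keep = values > val_threshold                  # strict inequality, as documented
    return x[keep], y[keep], values[keep]
```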
Model Comparison
plot_comparison.py
Multi-model comparison visualizations.
Functions
plot_avg_membership_barh(avg_cls_memberships, *, annot_col='annot', model_order=None, cluster_order=None, cluster_mode='auto', colors=None, figsize_per_cluster=(2.2, 4.0), dpi=150)
Generalized horizontal grouped bar charts comparing cluster memberships across multiple models.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
avg_cls_memberships
|
Dict[str, DataFrame]
|
Dict of {model_name: df}.
Each df: rows = annotation groups, columns = clusters,
plus an |
required |
annot_col
|
str
|
Column name for annotation groups. |
'annot'
|
model_order
|
Optional[Sequence[str]]
|
Order of models in the legend/hue. If None, uses dict insertion order. |
None
|
cluster_order
|
Optional[Sequence[str]]
|
Optional explicit order for clusters (subset will be used). |
None
|
cluster_mode
|
Literal['intersection', 'union', 'auto']
|
How to determine clusters across models: - "intersection": use only clusters present in all models - "union": use all clusters across models - "auto": use intersection if non-empty else union |
'auto'
|
colors
|
Optional[Sequence[str]]
|
Optional colors used for cluster title text (per cluster index). |
None
|
figsize_per_cluster
|
Tuple[float, float]
|
(width_per_cluster, height) for 1 x K layout. |
(2.2, 4.0)
|
dpi
|
int
|
Figure DPI. |
150
|
Returns:
| Type | Description |
|---|---|
(fig, axes)
|
|
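The three cluster_mode options can be sketched with set operations. `resolve_clusters` is a hypothetical helper; sorting the result when no cluster_order is given is an assumption.

```python
def resolve_clusters(cluster_sets, cluster_mode="auto", cluster_order=None):
    """Pick the cluster labels to plot from per-model collections of labels."""
    sets = [set(s) for s in cluster_sets]
    common = set.intersection(*sets) if sets else set()
    everything = set.union(*sets) if sets else set()
    if cluster_mode == "intersection":
        chosen = common
    elif cluster_mode == "union":
        chosen = everything
    elif cluster_mode == "auto":
        chosen = common if common else everything
    else:
        raise ValueError(f"unknown cluster_mode: {cluster_mode!r}")
    if cluster_order is not None:
        # Only the requested labels that are actually available are kept.
        return [c for c in cluster_order if c in chosen]
    return sorted(chosen)
```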
plot_compmodels_Q_grid(comp_res, coords, models=None, models_plot_order=None, val_threshold=0.5, s=0.05, colors=None, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92)
Plot membership on 2D coords (e.g. UMAP) for all modes in each model.
Layout: columns = models, rows = modes within each model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
comp_res
|
CompModelsResults
|
Loaded comparison results (from io.load_compmodels_results). |
required |
coords
|
array-like
|
(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode. |
required |
models
|
list of str
|
Subset of models to include; defaults to all in comp_res.models. |
None
|
models_plot_order
|
list of str
|
Order of columns; if None, uses |
None
|
val_threshold
|
float
|
Membership threshold below which points are omitted for each cluster. |
0.5
|
s
|
float
|
Marker size passed to plot_spatial_membership. |
0.05
|
colors
|
Sequence
|
Sequence of colors used for clusters; default is tab20. |
None
|
figsize_scale
|
(float, float)
|
Scale factors for figure size: (width_per_col, height_per_row). |
(2.5, 2.0)
|
suptitle
|
str
|
Overall figure title. |
None
|
y_suptitle
|
float
|
y position of suptitle. |
0.92
|
plot_compmodels_Q_selected(comp_res, coords, model_mode_list, *, n_rows=None, n_cols=None, val_threshold=0.5, s=0.05, colors=None, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92)
Plot membership on 2D coords (e.g. UMAP) for a selected set of modes.
Layout: one panel per (model, mode) in model_mode_list. Grid size can be specified by n_rows / n_cols; otherwise defaults to a single row.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
comp_res
|
CompModelsResults
|
Loaded comparison results (from io.load_compmodels_results). Must have attributes: - Q_by_mode : dict[full_mode_name -> ndarray (n_cells, K)] - mode_stats_by_model : dict[model_name -> DataFrame] with index 'Mode' (short_mode) and column 'Size' |
required |
coords
|
array-like
|
(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode. |
required |
model_mode_list
|
sequence of (model_name, short_mode)
|
List of specific modes to plot, e.g. [("rna.seurat.louvain", "K20M1"), ("rna.seurat.louvain", "K20M2"), ("rna.scanpy.leiden", "K18M1")] |
required |
n_rows
|
int
|
Number of rows in the grid. If None and n_cols is None, uses 1 row. |
None
|
n_cols
|
int
|
Number of columns in the grid. If None and n_rows is None, uses len(model_mode_list) columns (single row). |
None
|
val_threshold
|
float
|
Membership threshold below which points are omitted for each cluster. |
0.5
|
s
|
float
|
Marker size passed to plot_spatial_membership. |
0.05
|
colors
|
Sequence
|
Sequence of colors used for clusters; default is tab20. |
None
|
figsize_scale
|
(float, float)
|
Scale factors for figure size: (width_per_col, height_per_row). |
(2.5, 2.0)
|
suptitle
|
str
|
Overall figure title. |
None
|
y_suptitle
|
float
|
y position of suptitle. |
0.92
|
Returns:
| Type | Description |
|---|---|
(fig, axes_by_model_mode)
|
fig : matplotlib.figure.Figure axes_by_model_mode : dict[(model_name, short_mode) -> Axes] |
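The grid-size defaults can be sketched as follows. The single-row default comes from the parameter descriptions; deriving the missing dimension by ceiling division when only one of n_rows / n_cols is given is an assumption, and `resolve_grid` is a hypothetical name.

```python
import math

def resolve_grid(n_panels, n_rows=None, n_cols=None):
    """Return (n_rows, n_cols) large enough to hold n_panels panels."""
    if n_rows is None and n_cols is None:
        return 1, n_panels                      # documented default: single row
    if n_rows is None:
        return math.ceil(n_panels / n_cols), n_cols
    if n_cols is None:
        return n_rows, math.ceil(n_panels / n_rows)
    if n_rows * n_cols < n_panels:
        raise ValueError("grid too small for the requested panels")
    return n_rows, n_cols
```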
plot_compmodels_alignment_by_model(comp_res, cmap=None, *, models=None, models_plot_order=None, row_by_K=False, wspace_padding=1.3, marker_size=200.0, alt_ls=False, ls_alt=('-', '--'), lw=1.0, connect_identity=False, adjacent_only=True, label_modes=True, figsize_scale=(0.3, 2), dpi=150, pair_mappings=None)
Plot alignment between multiple models in a single graph.
Modes can be arranged in rows either by:
- mode index within each model (row_by_K=False), or
- K across models (row_by_K=True), so that modes with the same K value line up on the same row "band" across models.
When row_by_K=True:
- For each K, determine the maximum number of modes with that K across all selected models.
- Allocate that many rows for that K.
- If a model has fewer modes for that K, the corresponding slots are left empty (no markers drawn).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
comp_res
|
CompModelsResults
|
Must provide: - models - modes_by_model: dict[model -> list[str]] (short mode names) - full_mode_names: list[str] (e.g. "rna.seurat.K21M1") - all_modes_alignment: dict[full_mode_name -> list[int]] - alignment_across_all: dict["A-B" -> mapping list[int]] - K_max: int - get_Q(full_mode_name) or Q_by_mode[full_mode_name] |
required |
cmap
|
Colormap, sequence of RGB tuples, or None
|
One of: a matplotlib colormap (e.g. cm.get_cmap("tab20")), a sequence of RGB tuples, or None (defaults to tab20). |
None
|
models
|
sequence of str
|
Subset of models to include. Defaults to all comp_res.models. |
None
|
models_plot_order
|
sequence of str
|
Order of columns. Defaults to |
None
|
row_by_K
|
bool
|
If True, modes are grouped by K across models; only modes with the same K appear in the same row band. If False, rows are mode index per model. |
False
|
wspace_padding
|
float
|
Horizontal spacing factor between model columns, scaled by K_max. |
1.3
|
marker_size
|
float
|
Size of the cluster markers. |
200.0
|
alt_ls
|
bool
|
If True, use |
False
|
ls_alt
|
sequence of str
|
Line styles; ls_alt[0] used for non-identity edges, ls_alt[1] used for identity edges (if connect_identity=True). |
('-', '--')
|
lw
|
float
|
Line width for edges. |
1.0
|
connect_identity
|
bool
|
If True, also draw thin light-grey (or ls_alt[1]) lines for identity mappings (same aligned column index). If False, only draw non-identity. |
False
|
adjacent_only
|
bool
|
If True, draw edges only between modes in adjacent model columns (to reduce clutter). If False, draw edges between any model pair. |
True
|
label_modes
|
bool
|
If True, write mode labels near each block; column headers = models. |
True
|
figsize_scale
|
(float, float)
|
Scale factors for figure size: (width_per_K, height_per_row). Width = n_models * K_max * width_per_K Height = n_rows * height_per_row |
(0.3, 2)
|
dpi
|
int
|
Figure dpi. |
150
|
pair_mappings
|
dict
|
Optional within-model pair mappings ("A-B" -> list[(col_idx_A, col_idx_B)]) to draw extra edges between successive modes of the same model. |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
fig |
Figure
|
The figure object. |
ax |
Axes
|
The axes. |
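The row allocation for row_by_K=True and the figure-size formula from figsize_scale can be sketched as below. Both helpers are hypothetical, and the input is simplified to a mapping from model name to the list of K values of its modes.

```python
from collections import Counter

def rows_per_K(mode_K_by_model):
    """For each K, the largest number of modes with that K in any single model."""
    rows = {}
    for Ks in mode_K_by_model.values():
        for K, n in Counter(Ks).items():
            rows[K] = max(rows.get(K, 0), n)
    return dict(sorted(rows.items()))

def figure_size(n_models, K_max, n_rows, figsize_scale=(0.3, 2)):
    """Width = n_models * K_max * width_per_K; height = n_rows * height_per_row."""
    width_per_K, height_per_row = figsize_scale
    return n_models * K_max * width_per_K, n_rows * height_per_row

# Model "A" has modes with K = 2, 3, 3; model "B" has modes with K = 3, 4.
bands = rows_per_K({"A": [2, 3, 3], "B": [3, 4]})
```

Here K=3 gets two rows because model "A" has two modes with K=3, and model "B" leaves one of those slots empty.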
plot_compmodels_alignment_list(comp_res, cmap=None, marker_size=250, figsize=(6, 6))
Plot the CompModels alignment pattern list via clumppling.plot_alignment_list, ordering modes by K group to avoid a KeyError.
Requires comp_res to have:
- full_mode_names
- alignment_across_all
- all_modes_alignment
- get_Q(full_mode_name)
plot_compmodels_diff_grid(comp_res, pair_mappings, coords, ref_mode, models_plot_order=None, val_threshold=0.5, diff_threshold=0.5, *, colors=None, s=0.05, alpha=0.6, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92, strict_pair_mapping=True)
Plot difference in membership on 2D coords for all modes across models.
- Use map_alt_to_ref to compute aligned differences.
- For non-ref panels, plot a single overlaid diff scatter (per-cell).
- Compute Δ = fraction(per_cell_diff > diff_threshold)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
comp_res
|
CompModelsResults
|
Loaded comparison results (from io.load_compmodels_results). |
required |
pair_mappings
|
dict
|
Dict mapping "ref_mode-alt_mode" -> list of (ref_k, alt_k) tuples. |
required |
coords
|
array-like
|
(n_cells, 2) or (x, y) tuple; same individuals as in Q_by_mode. |
required |
ref_mode
|
str
|
Full mode name (e.g. "model_shortmode") to use as reference. |
required |
models_plot_order
|
list of str
|
Order of models (columns); if None, uses all models in comp_res. |
None
|
val_threshold
|
float
|
Membership threshold below which points are omitted for each cluster. |
0.5
|
diff_threshold
|
float
|
Threshold for difference in membership to consider significant. |
0.5
|
colors
|
Sequence
|
Sequence of colors used for clusters; default is tab20. |
None
|
s
|
float
|
Marker size passed to plot_spatial_membership. |
0.05
|
alpha
|
float
|
Alpha value for scatter points. |
0.6
|
figsize_scale
|
(float, float)
|
Scale factors for figure size: (width_per_col, height_per_row). |
(2.5, 2.0)
|
suptitle
|
str
|
Overall figure title. |
None
|
y_suptitle
|
float
|
y position of suptitle. |
0.92
|
strict_pair_mapping
|
bool
|
If True, raise an error if a required pair mapping is missing. |
True
|
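The Δ statistic is the share of cells whose difference exceeds diff_threshold. A minimal sketch, assuming the per-cell differences have already been computed (the alignment step via map_alt_to_ref is not reproduced here, and `diff_fraction` is a hypothetical name):

```python
import numpy as np

def diff_fraction(per_cell_diff, diff_threshold=0.5):
    """Fraction of cells with per-cell difference strictly above the threshold."""
    d = np.asarray(per_cell_diff, dtype=float)
    return float(np.mean(d > diff_threshold))
```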
plot_compmodels_diff_selected(comp_res, pair_mappings, coords, ref_mode, model_mode_list, *, n_rows=None, n_cols=None, val_threshold=0.5, diff_threshold=0.5, colors=None, s=0.05, alpha=0.6, figsize_scale=(2.5, 2.0), suptitle=None, y_suptitle=0.92, strict_pair_mapping=True)
Plot difference in membership on 2D coords for a selected set of modes.
Layout: one panel per (model, mode) in model_mode_list. Grid size can be specified by n_rows / n_cols; otherwise defaults to a single row.
Parameters follow the same pattern as plot_compmodels_diff_grid.
plot_discrete_colorbar(colors, K_max=None, *, labels=None, ax=None, figsize=None, dpi=150, facecolor='white')
Plot a simple discrete colorbar-like strip for cluster colors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
colors
|
Sequence[ColorSpec]
|
A sequence of color specs. Can be: - list of RGB tuples (0-1 range) - hex strings - named matplotlib colors |
required |
K_max
|
Optional[int]
|
Number of clusters. If None, inferred as len(colors). |
None
|
labels
|
Optional[Sequence[str]]
|
X tick labels. If None, defaults to ["Cls.1", ..., "Cls.K"]. |
None
|
ax
|
Optional[Axes]
|
Existing axes to draw on. If None, a new figure/axes is created. |
None
|
figsize
|
Optional[Tuple[float, float]]
|
Figure size (only used if ax is None). Default scales with K. |
None
|
dpi
|
int
|
Figure dpi (only used if ax is None). |
150
|
facecolor
|
str
|
Figure/axes facecolor. |
'white'
|
Returns:
| Type | Description |
|---|---|
(fig, ax, im)
|
im is the AxesImage returned by imshow. |
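The K_max and labels defaults can be sketched as follows; `default_labels` is a hypothetical helper that only resolves the defaults and does not draw anything.

```python
def default_labels(colors, K_max=None, labels=None):
    """Resolve the number of clusters and the tick labels."""
    K = len(colors) if K_max is None else K_max
    if labels is None:
        labels = [f"Cls.{k}" for k in range(1, K + 1)]  # 1-based labels
    return K, list(labels)
```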
plot_feature_cluster_panels(results, coords, df_pvs_modes, selected_feature, *, modes=None, colors=None, plot_both_sides=False, val_threshold=0.0, w_scale=1.2, h_scale=1.4, dpi=150, suptitle=None)
Plot spatial membership for separated clusters for a single focal gene across multiple modes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
results
|
Object with |
required | |
coords
|
array-like
|
(n_cells, 2) or (x, y) tuple for spatial / UMAP coordinates. |
required |
df_pvs_modes
|
dict[str, DataFrame]
|
Mapping: mode_name -> DataFrame with index including |
required |
selected_feature
|
str
|
Feature name / index key used in df_pvs_modes[mode].loc[selected_feature]. |
required |
modes
|
sequence of str
|
Subset / order of modes to plot. Defaults to all keys in df_pvs_modes
that contain |
None
|
colors
|
sequence or Colormap
|
Either a sequence of colors indexable by cluster index, or a colormap. If None, defaults to tab20. |
None
|
plot_both_sides
|
bool
|
If False: plot only the “fewer” side clusters across modes in a big [modes × all_sepCls] grid. If True: for each mode, left = sepCls[0], right = sepCls[1], separated by a vertical dashed line. |
False
|
val_threshold
|
float
|
Membership threshold passed to plot_spatial_membership. |
0.0
|
w_scale
|
float
|
Width/height scaling factors for figure size. |
1.2
|
h_scale
|
float
|
Width/height scaling factors for figure size. |
1.4
|
dpi
|
int
|
Figure DPI. |
150
|
suptitle
|
str or None
|
Optional figure-level title. |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
fig |
Figure
|
|
axes |
dict[(mode_name, col_idx) -> Axes]
|
|
plot_feature_count(feature_counts, coords, *, feature_name='', log_transformed=True, vmax=6, vmin=None, size=5, cmap='RdYlBu_r', cbar_loc='bottom', cbar_label=None, ax=None)
Plot a single gene's expression over 2D coordinates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
feature_counts
|
array-like or sparse
|
Per-cell values for one feature. Shape (n_cells,) or (n_cells, 1). If sparse, will be densified. |
required |
coords
|
ndarray
|
2D coordinates of shape (n_cells, 2). |
required |
feature_name
|
str or None
|
Title annotation. |
''
|
log_transformed
|
bool
|
If False, apply log1p to feature_counts. If True, assume feature_counts already on log scale. |
True
|
vmax
|
float or None
|
Color max. If None or 0, inferred from data. |
6
|
vmin
|
float or None
|
Color min. If None, inferred by matplotlib. |
None
|
size
|
float
|
Point size. |
5
|
cmap
|
str
|
Colormap name. |
'RdYlBu_r'
|
cbar_loc
|
('bottom', 'top', 'left', 'right')
|
Colorbar location. |
"bottom"
|
cbar_label
|
str or None
|
Overrides default colorbar label. |
None
|
ax
|
Axes or None
|
Existing axis to draw on. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
|
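The value preparation implied by log_transformed and vmax can be sketched as below. `prepare_values` is a hypothetical helper for dense input only (sparse densification is omitted), and using the data maximum when vmax is None or 0 is an assumption about what "inferred from data" means.

```python
import numpy as np

def prepare_values(feature_counts, log_transformed=True, vmax=6):
    """Flatten per-cell values, optionally apply log1p, and resolve the colour max."""
    values = np.asarray(feature_counts, dtype=float).reshape(-1)  # accepts (n,) or (n, 1)
    if not log_transformed:
        values = np.log1p(values)
    if vmax is None or vmax == 0:
        vmax = float(values.max())   # assumed inference rule
    return values, vmax
```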
plot_group_diff(df_mode_group_diff, *, mode_sizes=None, annotation_group_sizes=None, ref_mode=None, show_top=True, show_left=True, annot=None, cmap='Reds', cbar_label='Fraction of different cells', top_ylabel='#cells in the group', left_xlabel='Mode size', x_label='Annotation groups', y_label='Modes', figsize=(10, 8), dpi=300, height_ratios=(1, 6), width_ratios=(1.5, 6), wspace=0.01, hspace=0.01, vmin=0.0, vmax=1.0, cbar_fraction=0.6, xtick_rotation=45, xtick_fontsize=8, ytick_fontsize=8, label_fontsize=7, border_width=0.5, border_color='lightgray', zero_label_eps=0.001, top_round_to=500, show_mode_size_labels=True, add_model_separators=True, model_sep_kwargs=None)
Plot a heatmap of mode-by-annotation-group differences. Optionally add marginal bar plots:
- Top: annotation group sizes
- Left: mode sizes
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df_mode_group_diff | DataFrame | DataFrame with index=modes and columns=annotation groups. | required |
| mode_sizes | Optional[Series] | Series of mode sizes indexed by FULL mode names. Required if show_left=True. | None |
| annotation_group_sizes | Optional[Series] | Series of group sizes indexed by group names. Required if show_top=True. | None |
| ref_mode | Optional[str] | If provided, highlights this row label in red/bold. | None |
| show_top | bool | Show the top marginal bars (annotation group sizes). | True |
| show_left | bool | Show the left marginal bars (mode sizes). | True |
Returns:
| Type | Description |
|---|---|
| (fig, axes) | axes is a dict with keys: "heatmap", "top", "left". |
plot_mapping_alignment(*, pair_mapping, ref_K, alt_K, ref_mode, alt_mode, colors, figsize=(5, 2), dpi=150, node_size=150, node_edgecolor='black', node_linewidth=0.5, line_color='k', line_alpha=0.5, line_lw=1.0, ax=None, title=None)
Plot a simple two-row pair-mapping alignment diagram.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pair_mapping | Sequence[Tuple[int, int]] | Sequence of (c_ref, c_alt) index pairs. Assumes ref row at y=1, alt row at y=0. | required |
| ref_K | int | Number of clusters in the ref space; used for the x-limit. | required |
| alt_K | int | Number of clusters in the alt space; used for the x-limit. | required |
| ref_mode | str | Label of the ref mode for the y-axis and title. | required |
| alt_mode | str | Label of the alt mode for the y-axis and title. | required |
| colors | Union[Sequence, Mapping[int, str]] | Colors indexed by cluster id. Can be a list/tuple or dict. | required |
| ax | Optional[Axes] | If provided, draws into existing axis. | None |
Returns:
| Type | Description |
|---|---|
| (fig, ax) | |
plot_mapping_grid(*, ref_Q, alt_Q, pair_mappings, ref_mode, alt_mode, coords, colors, show=('alt',), dpi=150, s=0.5, figsize_scale=(2.0, 2.0), strict_pair_mapping=True, connect_lines=True, connect_color='k', connect_alpha=0.25, connect_lw=0.8)
Plot reference/alt Qs, mapped alt, and per-column abs differences.
Row order (when included):
0) reference (the smaller-K space used for mapping)
1) mapped alt (larger-K mapped into smaller-K space)
2) original alt (the larger-K Q)
3) diff (abs(reference - mapped_alt))
The show argument controls which of {"alt", "mapped_alt", "diff"} are added in addition to the reference row.
Default: show=("alt",) -> reference + original alt
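The mapped-alt and diff rows can be sketched as follows. This is an illustration only: it assumes the mapping sends each alt cluster to exactly one reference cluster and that memberships are summed when clusters merge. The helpers map_alt_to_ref and abs_diff are hypothetical names, not part of this module, and the actual structure of pair_mappings may differ.

```python
def map_alt_to_ref(alt_Q, pair_mapping, ref_K):
    """Collapse a larger-K membership matrix into the smaller-K reference space.

    alt_Q: list of rows, each a list of K_alt membership values.
    pair_mapping: iterable of (c_ref, c_alt) index pairs (assumed many-to-one).
    """
    mapped = [[0.0] * ref_K for _ in alt_Q]
    for c_ref, c_alt in pair_mapping:
        for i, row in enumerate(alt_Q):
            mapped[i][c_ref] += row[c_alt]
    return mapped


def abs_diff(ref_Q, mapped_Q):
    """Element-wise |reference - mapped_alt|, one column per reference cluster."""
    return [[abs(r - m) for r, m in zip(r_row, m_row)]
            for r_row, m_row in zip(ref_Q, mapped_Q)]


ref_Q = [[0.7, 0.3], [0.2, 0.8]]
alt_Q = [[0.5, 0.25, 0.25], [0.125, 0.125, 0.75]]
# Assumed mapping: alt clusters 0 and 1 merge into reference cluster 0.
pair_mapping = [(0, 0), (0, 1), (1, 2)]

mapped = map_alt_to_ref(alt_Q, pair_mapping, ref_K=2)
diff = abs_diff(ref_Q, mapped)
```

Both derived matrices have the reference's K columns, which is why the diff row can be shown per reference cluster.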
plot_membership_reordered(P, cmap, lbs, ax, title='', annot='')
Plot membership with reordered cluster indices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| P | np.ndarray, shape (n_samples, n_clusters) | Membership matrix. | required |
| cmap | list of colors | Colors for each cluster. | required |
| lbs | array-like, shape (n_samples,) | Labels for each sample (used to group samples). | required |
| ax | matplotlib.axes.Axes | Axis to plot on. | required |
| title | str | Y-axis label. | '' |
| annot | str | Annotation text (shown at top-right). | '' |
plot_model_diff_heatmap(cross_model_overall_membership_diff, comp_res, models, *, figsize=(9, 8), dpi=150, cmap='Reds', decimals=2, vmin=0.0, vmax=1.0, linewidths=0.5, linecolor='white', ax=None, cbar=True, annot_size=8, tight_layout=True, show=False)
Plot a cross-model overall membership difference heatmap.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cross_model_overall_membership_diff | Union[Mapping[Tuple[str, str], float], Series] | Dict-like or pd.Series with keys as (mode_name_model0, mode_name_model1) and values in [0, 1]. | required |
| comp_res | Any | An object that contains: comp_res.full_mode_names_by_model[model_name] -> ordered list of full mode names. This ordering is used to reindex rows/cols (no lexical sorting issues like K10 vs K3). | required |
| models | Sequence[str] | Sequence of two model names in the same order used in the diff keys. | required |
| figsize | Tuple[float, float] | Seaborn/Matplotlib styling options. | (9, 8) |
| dpi | int | Seaborn/Matplotlib styling options. | 150 |
| cmap | str | Seaborn/Matplotlib styling options. | 'Reds' |
| annot | | Seaborn/Matplotlib styling options. Not listed in the signature above, which has decimals=2 and annot_size=8. | |
| vmin | float | Seaborn/Matplotlib styling options. | 0.0 |
| vmax | float | Seaborn/Matplotlib styling options. | 1.0 |
| linewidths | float | Seaborn/Matplotlib styling options. | 0.5 |
| linecolor | str | Seaborn/Matplotlib styling options. | 'white' |
| ax | Optional[Axes] | If provided, plot into this axis. Otherwise create a new figure/axis. | None |
| cbar | bool | Whether to show colorbar. | True |
| tight_layout | bool | Whether to call plt.tight_layout(). | True |
| show | bool | If True, calls plt.show(). | False |
Returns:
| Type | Description |
|---|---|
| (fig, ax, mat) | The figure, axis, and the reindexed matrix used for the heatmap. |
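The reindexing step matters because alphabetical sorting puts K10 before K3. A minimal sketch of building the matrix from pair-keyed values with an explicit order; build_diff_matrix and the mode names are hypothetical, and filling missing pairs with NaN is an assumption rather than documented behaviour.

```python
def build_diff_matrix(diffs, row_names, col_names):
    """Arrange {(row_mode, col_mode): value} into a row-major matrix.

    Rows and columns follow the given name order; missing pairs become NaN.
    """
    nan = float("nan")
    return [[diffs.get((r, c), nan) for c in col_names] for r in row_names]


# Hypothetical mode names; the explicit order keeps K3 ahead of K10.
rows = ["K3M1", "K10M1"]
cols = ["K3M1", "K10M1"]
diffs = {
    ("K3M1", "K3M1"): 0.0,
    ("K3M1", "K10M1"): 0.4,
    ("K10M1", "K3M1"): 0.35,
    ("K10M1", "K10M1"): 0.1,
}
mat = build_diff_matrix(diffs, rows, cols)

# For comparison: the order an alphabetical sort would produce.
lexical = sorted(rows)
```

Here lexical places "K10M1" first, which is the ordering problem the comp_res description refers to.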
plot_model_diff_summary(comp_res, mat_diffs, coords, models_plot_order=None, *, colors=None, figsize_scale=(2.5, 2.5), diff_cmap='RdPu', diff_vmin=0.0, diff_vmax=1.0, point_size=2.0, alpha=1.0, suptitle=None)
For each model, plot:
- Top row: major mode clustering (largest Size) on 2D coords, colored by discrete cluster labels using colors, analogous to plot_compmodels_Q_grid.
- Bottom row: per-cell weighted average difference vs reference, aggregated across modes and weighted by mode size.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| comp_res | | Object containing compModels results. Must have: modes_by_model : Dict[str, List[str]] (short mode names, e.g. "K20M1"); mode_stats_by_model : Dict[str, DataFrame] with columns ['Mode', 'Size']; get_Q(full_mode_name) -> np.ndarray (n_cells x K). | required |
| mat_diffs | dict | Nested dict of diff matrices, typically from get_diff_matrices. | required |
| coords | array-like or (x, y) | 2D coordinates per cell. Either an array of shape (n_cells, 2), or a tuple/list (x, y) of 1D arrays. | required |
| models_plot_order | sequence of str | Order of models (columns). Defaults to list(mat_diffs.keys()). | None |
| colors | sequence | Sequence of discrete colors used for clusters in the TOP row, same semantics as in plot_compmodels_Q_grid. If None, defaults to tab20 colors. | None |
| figsize_scale | (float, float) | (width_per_model, height_per_row) used to derive overall figure size. | (2.5, 2.5) |
| diff_cmap | str | Colormap for the weighted difference panel (bottom row). | "RdPu" |
| diff_vmin | float | Lower limit (vmin) of the difference colormap. | 0.0 |
| diff_vmax | float | Upper limit (vmax) of the difference colormap. | 1.0 |
| point_size | float | Scatter point size. | 2.0 |
| alpha | float | Scatter alpha. | 1.0 |
| suptitle | str or None | Optional figure-level title. | None |
Returns:
| Type | Description |
|---|---|
| (fig, axes) | Matplotlib Figure and Axes array of shape (2, n_models). |
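The bottom-row quantity is a size-weighted average over modes. The sketch below shows that aggregation for plain Python lists; weighted_mean_diff and the input layout are hypothetical and do not reflect the actual structure of mat_diffs.

```python
def weighted_mean_diff(diff_by_mode, size_by_mode):
    """Per-cell average of mode-level differences, weighted by mode size.

    diff_by_mode: {mode: [per-cell difference, ...]} with equal-length lists.
    size_by_mode: {mode: size}, used as the (unnormalised) weight of each mode.
    """
    total = sum(size_by_mode[m] for m in diff_by_mode)
    n_cells = len(next(iter(diff_by_mode.values())))
    out = [0.0] * n_cells
    for mode, diffs in diff_by_mode.items():
        w = size_by_mode[mode] / total
        for i, d in enumerate(diffs):
            out[i] += w * d
    return out


# Two hypothetical modes of one model, with sizes 3 and 1.
per_cell = weighted_mean_diff(
    {"K20M1": [0.0, 1.0], "K20M2": [1.0, 1.0]},
    {"K20M1": 3, "K20M2": 1},
)
```

With these inputs the first cell is dominated by the larger mode (0.25), while the second cell differs in both modes (1.0).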
plot_spatial_structure_grid(results, coords, grps, *, modes=None, cmap=None, mode_labels=None, grp_seps=None, reorder_cls=True, s=1.0, alpha=1.0, vmin=0.0, vmax=1.0, figsize=None, dpi=150)
Optimized spatial + structure membership grid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results | ClumpplingResults | | required |
| coords | (n_cells, 2) | | required |
| grps | | Per-cell group labels for ordering the 1D trace. | required |
| modes | | Optional subset of modes (defaults to results.modes). | None |
| cmap | Optional[Any] | | None |
| grp_seps | Optional[Sequence[float]] | Optional separators for group boundaries on the structure plot. If None, computed from sorted grps. | None |
| reorder_cls | bool | If True, place clusters by aligned index. | True |
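When grp_seps is None the separators are derived from the sorted group labels. One way this could work is sketched below; group_separators is a hypothetical helper, and whether the library also includes the outer boundaries (0 and n_cells) is not stated here.

```python
def group_separators(grps):
    """Return (order, seps): a stable sort order and the interior group boundaries.

    order: cell indices sorted by group label (ties keep input order).
    seps: positions in the sorted sequence where the label changes.
    """
    order = sorted(range(len(grps)), key=lambda i: grps[i])
    seps = []
    for pos in range(1, len(order)):
        if grps[order[pos]] != grps[order[pos - 1]]:
            seps.append(pos)
    return order, seps


grps = ["B", "A", "B", "C", "A"]
order, seps = group_separators(grps)
```

The same order would be applied to the rows of each Q matrix so that cells of one group sit next to each other in the structure plot.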
plot_structure_one_level(results, *, modes=None, cmap=None, grp_labels=(), mode_labels=None, reorder_clsind=True, grp_seps_ymin=-0.2, lb_suffix_sep=None, figsize=None, dpi=150, x_rot=0, x_ha='center')
One-level group version of plot_structure_modes.
- Pulls Q matrices from results.Q_by_mode.
- Computes grp_info inside using get_uniq_lb_sep.
- Works for any single-level labels (e.g., sample group / batch / cell type).
- Optionally reorders samples by grp_labels via plot_membership_reordered.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results | | ClumpplingResults-like object with attributes: Q_by_mode : dict[mode_name -> (n_cells, K) array]; modes : sequence of mode names (if modes is None). | required |
| modes | Optional[Sequence[str]] | Which modes to plot. If None, uses | None |
| cmap | | Colormap list passed to plot_membership / plot_membership_reordered. | None |
| grp_labels | Sequence[str] | Group labels per sample (length n_cells). | () |
| mode_labels | Optional[Sequence[str]] | Labels for each mode row (defaults to | None |
| reorder_clsind | bool | If True, use plot_membership_reordered(Q, cmap, grp_labels, ...); otherwise use plot_membership(Q, cmap, ...). | True |
| grp_seps_ymin | float | How far separator lines extend below axis (in axis fraction). | -0.2 |
| lb_suffix_sep | Optional[str] | Optional separator; if provided, only the suffix (after lb_suffix_sep) is used in x tick labels. | None |
| figsize | Optional[Tuple[float, float]] | Figure size (width, height). If None, chosen based on number of modes. | None |
| dpi | int | Figure DPI. | 150 |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
plot_structure_two_level(results, *, modes=None, cmap=None, grp_labels=(), supgrp_labels=None, mode_labels=None, reorder_clsind=True, grp_seps_ymin=-0.2, supgrp_seps_ymin=-0.6, lb_suffix_sep=None, figsize=None, dpi=150)
Two-level group version of plot_structure_modes.
- Pulls Q matrices from results.Q_by_mode.
- Computes grp_info inside using get_uniq_lb_sep.
- Works for any two-level labels (grp + optional supgrp).
- Optionally reorders samples by (supgrp, grp).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| results | ClumpplingResults | ClumpplingResults. | required |
| modes | Optional[Sequence[str]] | Which modes to plot. If None, uses | None |
| cmap | | Colormap list passed to plot_membership. | None |
| grp_labels | Sequence[str] | Lower-level labels per sample (length n_cells). | () |
| supgrp_labels | Optional[Sequence[str]] | Higher-level labels per sample (length n_cells), optional. | None |
| grp_seps_ymin | float | How far group separator lines extend below axis. | -0.2 |
| supgrp_seps_ymin | float | How far super-group separator lines extend below axis. | -0.6 |
strip_leading_zero(x, decimals=2)
Format a float to a string with given decimals, stripping leading zero.
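A minimal sketch of this kind of formatting, useful for compact heatmap annotations. strip_leading_zero_sketch is a stand-in name; the handling of negative values is an assumption, not a description of the library function.

```python
def strip_leading_zero_sketch(x, decimals=2):
    """Format x with a fixed number of decimals and drop a leading '0'."""
    s = f"{x:.{decimals}f}"
    if s.startswith("0."):
        return s[1:]            # "0.50" -> ".50"
    if s.startswith("-0."):
        return "-" + s[2:]      # "-0.50" -> "-.50" (assumed behaviour)
    return s                    # values with |x| >= 1 are left unchanged


labels = [strip_leading_zero_sketch(v) for v in (0.5, 1.25, -0.5)]
```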
Gene Set Enrichment
plot_enrichment.py
Gene set enrichment visualizations.
Classes
Functions
plot_LFC_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap)
Fill a mode-grid figure with pairwise LFC z-score heatmaps.
Iterates over modes and calls plot_pairwise_heatmap with the LFC
z-score matrix into the corresponding axes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_by_mode | dict | Output of | required |
| ax_by_mode | dict | Mapping | required |
| results | ClumpplingResults | Used to look up K and generate cluster labels. | required |
| cb_cmap | list | Per-cluster color list (passed to | required |
plot_LFC_enrichment_heatmap(res_by_mode, results, value='z', sig_level=0.05, cmap='coolwarm', center_zero=True, figsize=None, dpi=150, title=None, ax=None)
Single heatmap of pairwise LFC enrichment across all modes.
Rows = modes, columns = cluster pairs (i < j) ordered lexicographically up to K_max. The same pair (e.g. C1 vs C2) occupies the same column for every mode, making it easy to compare enrichment of a given pair across modes. Cells are NaN (blank) for pairs that exceed a mode's K. Cells where q < sig_level are annotated with *.
Pairs are grouped visually by their first cluster index with light vertical separators; the secondary x-axis labels each group "Cv" (e.g. C1v).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_by_mode | dict | Output of | required |
| results | | Clumppling results object with | required |
| value | ('z', 'obs') | Which LFC quantity to colour: z-score or observed LFC. | "z" |
| sig_level | float | Significance threshold for * annotation (applied to q values). | 0.05 |
| cmap | str | | 'coolwarm' |
| center_zero | bool | Symmetric colour scale around 0. | True |
| figsize | tuple | Defaults to | None |
| title | str | | None |
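The shared column layout can be sketched with itertools.combinations, which yields pairs (i < j) in lexicographic order. pair_columns and layout_row are hypothetical helpers, and the dict-of-pairs input is an assumed stand-in for whatever res_by_mode actually holds.

```python
from itertools import combinations


def pair_columns(k_max):
    """All cluster pairs (i, j) with i < j, 1-based, in lexicographic order."""
    return list(combinations(range(1, k_max + 1), 2))


def layout_row(values_by_pair, k, columns):
    """Place one mode's values on the shared columns; pairs beyond K become NaN."""
    nan = float("nan")
    return [values_by_pair.get(p, nan) if p[1] <= k else nan for p in columns]


columns = pair_columns(3)
labels = [f"C{i} vs C{j}" for i, j in columns]

# A hypothetical K=2 mode only has the pair (1, 2); the other columns stay blank.
row_k2 = layout_row({(1, 2): 1.5}, k=2, columns=columns)
```

Because every mode uses the same columns list, a given pair always lands in the same column regardless of the mode's K.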
plot_P_enrichment_by_cluster(res_by_mode, results, cb_cmap, kind='pval', ncols=None, figsize_per_panel=(3.0, 2.8), dpi=150, sig_threshold=None)
One subplot per cluster; each panel shows that cluster's P enrichment across all modes that contain it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_by_mode | dict | Output of | required |
| results | ClumpplingResults | Used to look up K per mode (via | required |
| cb_cmap | list | Per-cluster colour list; cluster k gets | required |
| kind | ('pval', 'zscore') | | "pval" |
| ncols | int or None | Columns in the subplot grid. Defaults to K_max. | None |
| figsize_per_panel | (float, float) | Width × height for each individual subplot. | (3.0, 2.8) |
| dpi | int | | 150 |
| sig_threshold | float or None | Threshold for the reference line. For | None |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| axes | np.ndarray of matplotlib.axes.Axes, shape (nrows, ncols) | |
plot_P_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap, kind='pval')
Fill a mode-grid figure with per-cluster P enrichment bars.
Iterates over modes and calls either plot_P_enrichment_pval or
plot_P_enrichment_zscore into the corresponding axes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_by_mode | dict | Output of | required |
| ax_by_mode | dict | Mapping | required |
| results | ClumpplingResults | Used to look up K and generate cluster labels via | required |
| cb_cmap | list | Per-cluster color list passed to | required |
| kind | ('pval', 'zscore') | Whether to plot empirical p-values or z-scores. Default | "pval" |
plot_P_enrichment_heatmap(res_by_mode, results, value='z', sig_level=0.05, cmap='OrRd', center_zero=False, figsize=None, dpi=150, title=None, ax=None)
Single heatmap of per-cluster P enrichment across all modes.
Rows = modes, columns = clusters (C1 … CK_max). Each cell shows the
P enrichment z-score (value="z") or empirical p-value
(value="p") for that cluster in that mode. Cells exceeding a mode's
K are shown as NaN. Cells where p_emp < sig_level are annotated with *.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_by_mode | dict | Output of | required |
| results | | Clumppling results object with | required |
| value | ('z', 'p') | Which quantity to colour: z-score or empirical p-value. | "z" |
| sig_level | float | Significance threshold for * annotation (applied to p_emp). | 0.05 |
| cmap | str | Defaults to | 'OrRd' |
| center_zero | bool | Symmetric colour scale around 0. Default False (z-scores are typically positive for enrichment). | False |
| figsize | tuple | | None |
| title | str | | None |
plot_P_enrichment_pval(p_res, cluster_labels, colors, title='', figsize=(4, 4), dpi=150, ax=None)
Bar chart of empirical p-values (-log10) per cluster.
plot_P_enrichment_zscore(p_res, cluster_labels, colors, title='', figsize=(4, 4), dpi=150, ax=None)
Bar chart of z-scores vs null per cluster.
plot_gene_P_bars(P_gs, gene_set, cluster_labels, colors, top_n=None, gene_label_colors=None)
Per-cluster waterfall bars showing each gene's loading within each cluster.
Genes are ranked by loading within each cluster and drawn as horizontal bars colored by cluster.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| P_gs | ndarray, shape (n_gs, K) | Gene-set rows of the aligned P matrix. | required |
| gene_set | list of str | Gene names corresponding to rows of | required |
| cluster_labels | list of str | Labels for each cluster (columns of | required |
| colors | list of str | One color per cluster used to fill the bars. | required |
| top_n | int or None | If set and | None |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| axes | np.ndarray of matplotlib.axes.Axes | Array of K axes, one per cluster. |
plot_gene_P_stacked(P_gs, gene_set, cluster_labels, gs_title='', log_scale=True, sort_by_sum=False, top_n=None, gene_colors=None, figsize=(6, 4), dpi=150)
Stacked bar chart of per-gene P values across clusters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| P_gs | ndarray | Shape (n_gs, K). Gene-set rows of the P matrix. | required |
| gene_set | list of str | Gene names corresponding to rows of P_gs. | required |
| cluster_labels | list of str | Labels for each cluster (columns of P_gs). | required |
| gs_title | str | Title prefix for the plot. Default | '' |
| log_scale | bool | If True, use a log y-axis. Default True. | True |
| sort_by_sum | bool | If True, sort clusters in descending order of total P sum. Default False. | False |
| top_n | int or None | If set and | None |
| gene_colors | list or None | One color per gene (after any | None |
| figsize | tuple of (float, float) | Figure size in inches. Default | (6, 4) |
| dpi | int | Figure resolution. Default 150. | 150 |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| ax | Axes | |
plot_gene_lfc(df_gene_lfc, cluster_labels, sepL, sepH, gs_sepLFC, colors, figsize=(5, 3), dpi=150, ax=None, kind='mean', top_n=None, show_labels=None)
Horizontal bar chart of per-gene LFC between high and low cluster groups.
Bars are colored on a diverging coolwarm scale centered on zero and
saturated at ±gs_sepLFC.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df_gene_lfc | DataFrame | Output of | required |
| cluster_labels | list of str | Not currently used; retained for API compatibility. | required |
| sepL | list of int | Cluster indices in the low group. | required |
| sepH | list of int | Cluster indices in the high group. | required |
| gs_sepLFC | float | Observed gene-set sepLFC; sets the colorbar saturation limits. | required |
| colors | list of str | Not currently used; retained for API compatibility. | required |
| figsize | tuple of (float, float) | Figure size in inches. Default | (5, 3) |
| dpi | int | Figure resolution. Default 150. | 150 |
| ax | Axes | Draw into an existing axes if provided. | None |
| kind | ('extreme', 'mean') | Determines the x-axis label: | "mean" |
| top_n | int or None | If set and the number of genes exceeds | None |
| show_labels | bool or None | Whether to draw gene-name tick labels on the y-axis. If | None |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| ax | Axes | |
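The colouring rule (diverging scale centred on zero, saturated at ±gs_sepLFC) amounts to clipping and rescaling each LFC to [0, 1] before looking it up in a colormap. A sketch under that reading; diverging_position is a hypothetical helper, not the function's internals.

```python
def diverging_position(lfc, limit):
    """Map an LFC to [0, 1] on a scale centred on 0 and clipped at +/- limit.

    0.5 corresponds to LFC = 0; 0.0 and 1.0 are the saturated ends.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    clipped = max(-limit, min(limit, lfc))
    return 0.5 + clipped / (2.0 * limit)


# With a saturation limit of 2, values beyond +/-2 share the end colours.
positions = [diverging_position(v, 2.0) for v in (-9.0, -1.0, 0.0, 2.0, 5.0)]
```

The resulting position would then be passed to a diverging colormap such as coolwarm to obtain the bar colour.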
plot_pairwise_heatmap(value_mat, sig_mat=None, labels=None, title=None, upper_only=True, cmap='coolwarm', center_zero=True, sig_level=0.05, figsize=(7, 6), dpi=150, ax=None)
Heatmap of a KxK matrix with optional significance overlay.
plot_pairwise_heatmap_bidir(upper_mat, lower_mat, upper_cmap='coolwarm', lower_cmap='PuOr', upper_sig=None, lower_sig=None, sig_level=0.05, labels=None, upper_label='', lower_label='', title=None, center_zero=True, figsize=(6, 5), dpi=150, ax=None)
Heatmap with two KxK matrices split across upper and lower triangles.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| upper_mat | (K, K) array | Values for the upper triangle. | required |
| lower_mat | (K, K) array | Values for the lower triangle. | required |
| upper_cmap | str | Colormap name for the upper triangle. | 'coolwarm' |
| lower_cmap | str | Colormap name for the lower triangle. | 'PuOr' |
| upper_sig | (K, K) array | p-value (or any criterion) matrix for the upper triangle; cells where value < sig_level are annotated with *. | None |
| lower_sig | (K, K) array | p-value (or any criterion) matrix for the lower triangle; cells where value < sig_level are annotated with *. | None |
| sig_level | float | | 0.05 |
| labels | list of str | | None |
| upper_label | str | Colorbar axis label for the upper triangle. | '' |
| lower_label | str | Colorbar axis label for the lower triangle. | '' |
| center_zero | bool | If True, color scale is symmetric around 0. | True |
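Which matrix contributes which cells can be sketched as below. This shows only the triangle split, not the drawing with two colormaps; combine_triangles is a hypothetical helper, and leaving the diagonal blank is an assumption.

```python
def combine_triangles(upper_mat, lower_mat):
    """Take cells above the diagonal from upper_mat and below it from lower_mat."""
    k = len(upper_mat)
    nan = float("nan")
    out = [[nan] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i < j:
                out[i][j] = upper_mat[i][j]
            elif i > j:
                out[i][j] = lower_mat[i][j]
    return out


upper = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]
lower = [[0, 0, 0], [-1, 0, 0], [-2, -3, 0]]
combined = combine_triangles(upper, lower)
```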
plot_per_cluster_P(P_gs, gene_set, cluster_labels, colors, null_mean_P=None, gs_title='', dpi=150)
Super-figure with 1 + K subpanels.
Top row (1 panel spanning all columns): Scatter of mean gene-set P per cluster overlaid on boxplots of the null distribution (from sample_null_P), sorted by observed mean P descending. Each cluster is coloured accordingly. Y-axis is log scale. If null_mean_P is None, only the scatter is drawn.
Bottom row (K panels): Per-cluster waterfall plots (gene loadings, cumulative rectangles).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| P_gs | ndarray, shape (n_genes, K) | Gene-set rows of the aligned P matrix. | required |
| gene_set | list of str | Gene names corresponding to rows of P_gs. | required |
| cluster_labels | list of str | Labels for each cluster (columns of P_gs). | required |
| colors | list of str | One colour per cluster. | required |
| null_mean_P | ndarray, shape (n_perm, K) | Null mean loading vectors from | None |
| gs_title | str | Optional title prefix for the top panel. | '' |
| dpi | int | Figure resolution. Default 150. | 150 |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| ax_top | Axes | |
| axes_bottom | list of matplotlib.axes.Axes, length K | |
plot_sepLFC_distribution(df, gs_genes, title='', kind='auto', show_gs_textbox=True, gs_textbox_threshold=None, n_gs_textbox=10, show_non_gs_textbox=False, n_non_gs_textbox=10, textbox_fontsize=9, figsize=(6, 5), dpi=150, ax=None)
Distribution of per-gene sepLFC, with gene-set genes highlighted.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Gene-indexed DataFrame with columns | required |
| gs_genes | set or list of str | Gene names belonging to the gene set of interest. | required |
| title | str | Axes title. | '' |
| kind | ('auto', 'bar', 'hist') | Chart type. | "auto" |
| show_gs_textbox | bool | Show a text box listing gene-set genes with high sepLFC. Only rendered in hist mode. Default | True |
| gs_textbox_threshold | float or None | Minimum sepLFC for inclusion in the GS text box. When | None |
| n_gs_textbox | int | Maximum number of gene-set genes in the GS text box (used when | 10 |
| show_non_gs_textbox | bool | Show a text box listing the top non-GS genes by sepLFC. Only rendered in hist mode. Default | False |
| n_non_gs_textbox | int | Number of top non-GS genes to list. Default 10. | 10 |
| textbox_fontsize | int | Font size for gene lines inside text boxes. The box title is rendered one point larger and bold. Default 9. | 9 |
| figsize | tuple of (float, float) | Figure size in inches. Default | (6, 5) |
| dpi | int | Figure resolution. Default 150. | 150 |
| ax | Axes | Draw into an existing axes if provided. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| ax | Axes | |
plot_sepLFC_distribution_heatmap(res_by_mode, n_bins=60, cmap='Blues', figsize=(10, 0.45), dpi=150, kind='null_sep', pval_threshold=None, annotate_pval=False)
Heatmap summary of the null sepLFC distribution across all modes.
Each row is one mode; colour encodes the density of the null distribution in each histogram bin. The observed gene-set sepLFC is overlaid as a red dot on each row, making enrichment strength and consistency across modes visible at a glance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_by_mode | dict | Output of | required |
| n_bins | int | Number of histogram bins shared across all modes. Default | 60 |
| cmap | str or Colormap | Colormap for the density heatmap. Default | 'Blues' |
| figsize | (float, float) | | (10, 0.45) |
| dpi | int | | 150 |
| kind | ('null_sep', 'null_fixed') | Which null distribution to display. | "null_sep" |
| pval_threshold | float or None | If set to a value | None |
| annotate_pval | bool | If | False |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| ax | Axes | |
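Sharing one set of bin edges across modes is what makes the heatmap rows comparable. A sketch of that binning in plain Python; shared_bin_densities and the {mode: values} input are hypothetical, and the actual layout of res_by_mode is not assumed here.

```python
def shared_bin_densities(null_by_mode, n_bins=60):
    """Histogram each mode's null values on common edges; return (edges, rows).

    rows[mode] holds densities, so each row integrates to 1 over the bins.
    """
    lo = min(min(v) for v in null_by_mode.values())
    hi = max(max(v) for v in null_by_mode.values())
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values coincide
    rows = {}
    for mode, values in null_by_mode.items():
        counts = [0] * n_bins
        for v in values:
            # The maximum value falls on the last edge; keep it in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
        rows[mode] = [c / (len(values) * width) for c in counts]
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, rows


edges, rows = shared_bin_densities(
    {"K2M1": [0.0, 0.5, 1.0, 1.0], "K3M1": [1.5, 2.0]},
    n_bins=4,
)
```

Each entry of rows would become one heatmap row, with the observed gene-set value overlaid at its position along the same edges.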
plot_sepLFC_enrichment_grid(res_by_mode, ax_by_mode, results, cb_cmap, kind='null_sep')
Fill a mode-grid figure with sepLFC null-distribution plots.
Iterates over modes and calls either plot_sepLFC_null_sep
or plot_sepLFC_null_fixed into the corresponding axes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| res_by_mode | dict | Output of | required |
| ax_by_mode | dict | Mapping | required |
| results | ClumpplingResults | Used to look up K and generate cluster labels. | required |
| cb_cmap | list | Per-cluster color list (passed to | required |
| kind | ('null_sep', 'null_fixed') | Which null comparison to visualize. | "null_sep" |
plot_sepLFC_null_fixed(sep_res, cluster_labels, title='', figsize=(5, 4), dpi=150, ax=None)
Histogram of null LFC at the gene-set's fixed cluster groups, with gene-set value marked.
plot_sepLFC_null_sep(sep_res, title='', figsize=(5, 4), dpi=150, ax=None)
Histogram of null best-sepLFC per random set, with gene-set value marked.
plot_seplfc_bipartite(gene_set, sepH, sepL, cluster_labels, df_mode, top_n_per_pair=5, gs_title='', lw_scale=10.0, min_lw=0.5, seg_gap=0.008, label_mode='auto', label_fontsize=7.0, arrow_fan=0.06, cmap='Spectral', vmin=None, vmax=None, colors=None, figsize=(8, 4), dpi=150, ax=None)
Bipartite diagram where each gene's single segment sits on the one edge that corresponds to its own best cluster separation (sepLFC in df_mode).
Unlike the older bipartite approach that assigns every gene to every H-L edge based on P_gs, this function:
- For each gene, finds the boundary pair (A, B) where A is the lowest-P cluster in the gene's upper group and B is the highest-P cluster in the gene's lower group (adjacent clusters across the max gap).
- Keeps only genes whose boundary pair has A ∈ sepH and B ∈ sepL.
- On each edge (A, B), selects the top top_n_per_pair genes by their df_mode.sepLFC value.
- Draws each selected gene as a single segment on its assigned edge.
All arrows point from sepH toward sepL (LFC is always positive by construction).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gene_set | list of str | Gene names (must be a subset of | required |
| sepH | list of int | Cluster indices in the high group (top row nodes). | required |
| sepL | list of int | Cluster indices in the low group (bottom row nodes). | required |
| cluster_labels | list of str | Labels for all K clusters. | required |
| df_mode | DataFrame | Feature-metrics DataFrame (from | required |
| top_n_per_pair | int | Maximum number of genes to show per (sepH, sepL) edge. Default 5. | 5 |
| gs_title | str | Title prefix. | '' |
| lw_scale | float | Maximum line width (for the edge with the largest total sepLFC). | 10.0 |
| min_lw | float | Minimum line width. | 0.5 |
| seg_gap | float | Gap at each end of every gene segment (in t-space). Default 0.008. | 0.008 |
| label_mode | str | Controls gene-name labels on segments. One of: | 'auto' |
| label_fontsize | float | Font size for gene-name labels. Default 7.0. | 7.0 |
| arrow_fan | float | Half-width (in data units) of the fan applied at each node so arrowheads from different edges spread out. Default 0.06. | 0.06 |
| cmap | str | Matplotlib colormap name used to color segments by | 'Spectral' |
| vmin | float or None | Lower bound of the colormap range. | None |
| vmax | float or None | Upper bound of the colormap range. | None |
| colors | list or None | Per-cluster colors for node fill (indexed by cluster index). | None |
| figsize | tuple of (float, float) | Figure size in inches. Default | (8, 4) |
| dpi | int | Figure resolution. Default 150. | 150 |
| ax | Axes | Draw into an existing axes if provided. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| fig | Figure | |
| ax | Axes | |
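The boundary-pair step described for this function can be sketched for a single gene as follows. The sketch assumes strictly positive P values and measures the gap as a log2 ratio between adjacent clusters after sorting by P; both choices are assumptions, and boundary_pair and keep_gene are hypothetical helpers.

```python
import math


def boundary_pair(p_by_cluster):
    """Return (A, B, gap) for the largest log2 gap between adjacent sorted clusters.

    A is the lowest-P cluster of the upper group, B the highest-P cluster
    of the lower group. Requires strictly positive P values.
    """
    order = sorted(range(len(p_by_cluster)),
                   key=lambda k: p_by_cluster[k], reverse=True)
    best = None
    for hi, lo in zip(order, order[1:]):
        gap = math.log2(p_by_cluster[hi] / p_by_cluster[lo])
        if best is None or gap > best[2]:
            best = (hi, lo, gap)
    return best


def keep_gene(pair, sepH, sepL):
    """Keep a gene only if its boundary pair runs from sepH to sepL."""
    a, b, _ = pair
    return a in sepH and b in sepL


# One gene's P across four clusters: the largest gap lies between clusters 2 and 1.
pair = boundary_pair([0.08, 0.01, 0.04, 0.005])
kept = keep_gene(pair, sepH=[0, 2], sepL=[1, 3])
```

Genes that pass this filter would then be ranked per edge (A, B) by their sepLFC, keeping the top top_n_per_pair.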
plot_top_pairwise_df(df, value_col='z', sig_col='q', alpha=0.05, top_n=-1, labels=None, sort_by='q', figsize=(8, 6), dpi=150, ax=None)
Dot plot of top cluster pairs sorted by significance or effect size.