Clumppling

Wrappers for Clumppling's main and compModels functions.

wrappers.py

Python wrapper(s) around the clumppling CLI.

Usage

run_clumppling(
    input_dir="/path/to/input",
    output_dir="/path/to/output",
    fmt="generalQ",
    ...
)

prepare_compmodels(
    models=["model1", "model2"],
    model_dirs=["/path/to/model1_dir", "/path/to/model2_dir"],
    comp_dir="/path/to/comp_dir",
)

run_comp_models(
    models=["model1", "model2"],
    comp_dir="/path/to/comp_dir",
    output_dir="/path/to/comp_models_output",
    ...
)

Functions

prepare_compmodels(models, model_dirs, comp_dir, *, suffixes=None, modes_aligned_subdir='modes_aligned', mode_stats_relpath='modes/mode_stats.txt', exist_ok=True)

Prepare input files for clumppling.compModels from multiple aligned clumppling runs.

Parameters:

Name Type Description Default
models sequence of str

Model names, e.g. ["model1", "model2"]. These will be used to name the output qfilelist, qnamelist, and mode_stats.txt files.

required
model_dirs sequence of path-like

Directories where each model's clumppling output lives. Each directory should contain a subdirectory modes_aligned_subdir (default: "modes_aligned") with the aligned Q files, and a file at mode_stats_relpath (default: "modes/mode_stats.txt") with the mode statistics.

required
comp_dir path-like

Directory where the output qfilelist, qnamelist, and mode_stats.txt files will be written.

required
suffixes sequence of str or None

Suffixes of the Q files for each model, either "rep" or "avg". If None (default), all models are assumed to use the suffix "rep". The length of this sequence must match the length of models.

None
modes_aligned_subdir str

Subdirectory within each model_dir where the aligned Q files are stored.

"modes_aligned"
mode_stats_relpath str

Relative path within each model_dir where the mode_stats.txt file is located.

"modes/mode_stats.txt"
exist_ok bool

If True, do not raise an error if comp_dir already exists.

True

Returns:

Name Type Description
qfilelists list of Path

Paths to the generated qfilelist files, one per model.

qnamelists list of Path

Paths to the generated qnamelist files, one per model.

mode_stats_files list of Path

Paths to the copied mode_stats.txt files, one per model.

Raises:

Type Description
ValueError

If the lengths of models, model_dirs, and suffixes do not match.

FileNotFoundError

If expected Q files or mode_stats.txt files are not found.
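The argument checks and per-model input layout described for prepare_compmodels can be illustrated with a minimal sketch. The helpers check_model_inputs and expected_inputs below are hypothetical names introduced for illustration only; they are not part of the wrappers module and only mirror the documented defaults (suffix "rep", modes_aligned_subdir, mode_stats_relpath) and the documented ValueError on mismatched lengths.

```python
from pathlib import Path
from typing import List, Optional, Sequence, Tuple


def check_model_inputs(
    models: Sequence[str],
    model_dirs: Sequence[str],
    suffixes: Optional[Sequence[str]] = None,
) -> List[str]:
    """Hypothetical helper: return one suffix per model, or raise ValueError."""
    if suffixes is None:
        # Documented default: every model uses the "rep" suffix.
        suffixes = ["rep"] * len(models)
    if not (len(models) == len(model_dirs) == len(suffixes)):
        raise ValueError(
            f"Length mismatch: {len(models)} models, "
            f"{len(model_dirs)} model_dirs, {len(suffixes)} suffixes"
        )
    return list(suffixes)


def expected_inputs(
    model_dir: str,
    modes_aligned_subdir: str = "modes_aligned",
    mode_stats_relpath: str = "modes/mode_stats.txt",
) -> Tuple[Path, Path]:
    """Hypothetical helper: where the aligned Q files and mode stats are expected."""
    base = Path(model_dir)
    return base / modes_aligned_subdir, base / mode_stats_relpath


suffixes = check_model_inputs(["model1", "model2"], ["/runs/model1", "/runs/model2"])
aligned_dir, stats_file = expected_inputs("/runs/model1")
```

Checking these paths before calling prepare_compmodels may make a FileNotFoundError easier to trace back to a specific model directory.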

run_clumppling(input_dir, output_dir, fmt='generalQ', *, vis=True, custom_cmap='', plot_type='graph', include_cost=True, include_label=True, alt_color=True, ind_labels='', ordered_uniq_labels='', regroup_ind=True, reorder_within_group=True, reorder_by_max_k=True, order_cls_by_label=True, plot_unaligned=False, fig_format='tiff', extension='', skip_rows=0, remove_missing=True, cd_method='louvain', cd_res=1.0, test_comm=True, comm_min=1e-06, comm_max=0.01, merge=True, use_rep=True, use_best_pair=True, setup_logging=True, log_file=None)

Programmatic wrapper around clumppling.main(args).

Parameters:

Name Type Description Default
input_dir PathLike

Directory for clumppling input. Corresponds to -i/--input in the CLI.

required
output_dir PathLike

Directory for clumppling output. Corresponds to -o/--output in the CLI.

required
fmt str

Input format: one of {"generalQ", "admixture", "structure", "fastStructure"}. This corresponds to -f/--format.

'generalQ'
vis bool

All other options map directly to the CLI arguments with the same name. See clumppling's parse_args() help text for semantics.

True
custom_cmap str

All other options map directly to the CLI arguments with the same name. See clumppling's parse_args() help text for semantics.

''
plot_type str

All other options map directly to the CLI arguments with the same name. See clumppling's parse_args() help text for semantics.

'graph'
include_cost bool

All other options map directly to the CLI arguments with the same name. See clumppling's parse_args() help text for semantics.

True
include_label bool

All other options map directly to the CLI arguments with the same name. See clumppling's parse_args() help text for semantics.

True
alt_color bool

All other options map directly to the CLI arguments with the same name. See clumppling's parse_args() help text for semantics.

True
setup_logging bool

If True (default), configure the clumppling logger the same way the CLI does, writing to <output_dir>/clumppling.log unless log_file is provided.

True
log_file Optional[PathLike]

Optional explicit path for the log file.

None

Returns:

Name Type Description
args Namespace

The Namespace object passed into clumppling.main(args). This can be useful for debugging/logging.
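As a rough illustration of the wrapper pattern, the sketch below builds an argparse.Namespace from keyword arguments and resolves the default log file location. The helpers build_args and resolve_log_file are hypothetical, and the attribute names input, output, and format are an assumption based on the documented -i/--input, -o/--output, and -f/--format flags; the actual run_clumppling implementation may differ.

```python
from argparse import Namespace
from pathlib import Path
from typing import Optional


def build_args(input_dir, output_dir, fmt="generalQ", **options) -> Namespace:
    """Hypothetical helper: collect arguments the way a CLI parser would."""
    # Attribute names are assumed from the long option names (--input, --output, --format).
    return Namespace(
        input=str(input_dir),
        output=str(output_dir),
        format=fmt,
        **options,
    )


def resolve_log_file(output_dir, log_file: Optional[str] = None) -> Path:
    """Hypothetical helper: documented default is <output_dir>/clumppling.log."""
    if log_file is not None:
        return Path(log_file)
    return Path(output_dir) / "clumppling.log"


args = build_args("/path/to/input", "/path/to/output", vis=False, cd_method="louvain")
log_path = resolve_log_file("/path/to/output")
```

Because run_clumppling returns the Namespace it passed to clumppling.main(args), inspecting it (for example with vars(args)) is one way to confirm which options were actually used.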

run_comp_models(models, comp_dir, output_dir, *, fig_format='png', vis=True, custom_cmap='', bg_colors=None, include_sim_in_label=True, ind_labels='', qfilelists=None, qnamelists=None, mode_stats_files=None, setup_logging=True, log_file=None)

Programmatic wrapper around clumppling.compModels.main(args).
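When qfilelists, qnamelists, and mode_stats_files are not passed, the documentation for comp_dir states that the per-model paths are inferred from a fixed naming convention. The sketch below reproduces that convention with pathlib; infer_comp_inputs is a hypothetical helper written for illustration, not a function exported by the wrappers module.

```python
from pathlib import Path
from typing import List, Sequence, Tuple


def infer_comp_inputs(
    models: Sequence[str], comp_dir: str
) -> Tuple[List[Path], List[Path], List[Path]]:
    """Hypothetical helper: build the documented default paths for each model."""
    base = Path(comp_dir)
    qfilelists = [base / f"{m}.qfilelist" for m in models]
    qnamelists = [base / f"{m}.qnamelist" for m in models]
    mode_stats_files = [base / f"{m}_mode_stats.txt" for m in models]
    return qfilelists, qnamelists, mode_stats_files


qfilelists, qnamelists, mode_stats_files = infer_comp_inputs(
    ["model1", "model2"], "/path/to/comp_dir"
)
```

These are the same three kinds of files that prepare_compmodels writes into comp_dir, so the two functions are meant to be used back to back with the same models and comp_dir values.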

Parameters:

Name Type Description Default
models sequence of str

Model names passed to --models.

required
comp_dir path-like

Directory where the .qfilelist, .qnamelist, and *_mode_stats.txt files live. If qfilelists, qnamelists, or mode_stats_files are not provided, they are inferred as comp_dir/{model}.qfilelist, comp_dir/{model}.qnamelist, and comp_dir/{model}_mode_stats.txt, respectively.

required
output_dir path-like

Output directory passed to --output.

required
fig_format str

Figure format for output files.

"png"
vis bool

Passed to --vis.

True
custom_cmap str

Passed to --custom_cmap (path to color file or "").

''
bg_colors sequence of str or None

Passed to --bg_colors (list of colors) if given.

None
include_sim_in_label bool

Passed to --include_sim_in_label.

True
ind_labels str

Passed to --ind_labels (path to labels file or "").

''