isograph.workflow.config

Typed workflow configuration.

class isograph.workflow.config.LatentModelConfig(name='latent_network', n_components=10, n_components_grid=<factory>, n_components_cv_folds=5, alpha=0.1, min_module_size=2, max_iter=1000, tol=0.0001, residualize_covariates=<factory>, trait_columns=<factory>, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)
Parameters:
  • name (str)

  • n_components (int)

  • n_components_grid (list[int] | None)

  • n_components_cv_folds (int)

  • alpha (float)

  • min_module_size (int)

  • max_iter (int)

  • tol (float)

  • residualize_covariates (list[str])

  • trait_columns (list[str])

  • allow_abundance_abundance (bool)

  • alpha_switch (float | None)

  • alpha_abundance (float | None)

  • alpha_abundance_grid (list[float] | None)

name: str = 'latent_network'
n_components: int = 10
n_components_grid: list[int] | None
n_components_cv_folds: int = 5
alpha: float = 0.1
min_module_size: int = 2
max_iter: int = 1000
tol: float = 0.0001
residualize_covariates: list[str]
trait_columns: list[str]
allow_abundance_abundance: bool = False
alpha_switch: float | None = None
alpha_abundance: float | None = None
alpha_abundance_grid: list[float] | None = None
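
The `<factory>` entries in the signatures on this page are how Python dataclasses display a field declared with `default_factory`. This is typically used so that each instance gets its own list or nested config rather than sharing one mutable default. The sketch below uses a hypothetical stand-in class, `LatentConfigSketch`, which is not part of isograph. It assumes the factory produces an empty list, since the real factory values are not shown in this reference.

```python
from dataclasses import dataclass, field
import inspect


@dataclass
class LatentConfigSketch:
    """Hypothetical stand-in mirroring a few LatentModelConfig fields."""

    name: str = "latent_network"
    n_components: int = 10
    alpha: float = 0.1
    # Mutable defaults are declared with default_factory; the generated
    # __init__ signature then shows them as <factory>.
    # Assumption: the factory is `list` (an empty list per instance).
    residualize_covariates: list[str] = field(default_factory=list)
    trait_columns: list[str] = field(default_factory=list)


a = LatentConfigSketch()
b = LatentConfigSketch(alpha=0.05, trait_columns=["diagnosis"])

# Each instance owns its list, so mutating one does not leak into another.
a.residualize_covariates.append("age")

signature_text = str(inspect.signature(LatentConfigSketch))
print(signature_text)
```

Overriding a field by keyword, as with `b` above, leaves every other field at its default.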
class isograph.workflow.config.BaselineModelConfig(name='baseline_network', alpha=0.12, min_module_size=3, max_modules=64, residualize_covariates=<factory>, trait_columns=<factory>, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)
Parameters:
  • name (str)

  • alpha (float)

  • min_module_size (int)

  • max_modules (int)

  • residualize_covariates (list[str])

  • trait_columns (list[str])

  • allow_abundance_abundance (bool)

  • alpha_switch (float | None)

  • alpha_abundance (float | None)

  • alpha_abundance_grid (list[float] | None)

name: str = 'baseline_network'
alpha: float = 0.12
min_module_size: int = 3
max_modules: int = 64
residualize_covariates: list[str]
trait_columns: list[str]
allow_abundance_abundance: bool = False
alpha_switch: float | None = None
alpha_abundance: float | None = None
alpha_abundance_grid: list[float] | None = None
class isograph.workflow.config.GraphModelConfig(name='graph_network', n_components=10, n_components_grid=<factory>, n_components_cv_folds=5, alpha=0.1, min_module_size=2, max_iter=1000, tol=0.0001, residualize_covariates=<factory>, trait_columns=<factory>, gamma=0.5, edge_types=<factory>, corr_threshold=0.3, normalized_laplacian=True, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)
Parameters:
  • name (str)

  • n_components (int)

  • n_components_grid (list[int] | None)

  • n_components_cv_folds (int)

  • alpha (float)

  • min_module_size (int)

  • max_iter (int)

  • tol (float)

  • residualize_covariates (list[str])

  • trait_columns (list[str])

  • gamma (float)

  • edge_types (list[str])

  • corr_threshold (float)

  • normalized_laplacian (bool)

  • allow_abundance_abundance (bool)

  • alpha_switch (float | None)

  • alpha_abundance (float | None)

  • alpha_abundance_grid (list[float] | None)

name: str = 'graph_network'
n_components: int = 10
n_components_grid: list[int] | None
n_components_cv_folds: int = 5
alpha: float = 0.1
min_module_size: int = 2
max_iter: int = 1000
tol: float = 0.0001
residualize_covariates: list[str]
trait_columns: list[str]
gamma: float = 0.5
edge_types: list[str]
corr_threshold: float = 0.3
normalized_laplacian: bool = True
allow_abundance_abundance: bool = False
alpha_switch: float | None = None
alpha_abundance: float | None = None
alpha_abundance_grid: list[float] | None = None
class isograph.workflow.config.VaeModelConfig(name='vae_network', latent_dim=8, hidden_dim=128, n_hidden_layers=2, latent_dim_grid=None, beta=1.0, n_epochs=500, lr=0.001, weight_decay=1e-05, batch_size=None, warmup_epochs=None, val_fraction=0.2, patience=50, early_stop_tol=0.0001, collapse_threshold=0.01, random_state=0, device=None, alpha=0.7, min_module_size=2, residualize_covariates=<factory>, trait_columns=<factory>, checkpoint_dir=None, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)

Configuration for the VAE network backend.

Choosing hidden_dim (function of BOTH n_genes AND n_samples):

  • n_genes <= 1000 and n_samples >= 150: 128 (default) works well.

  • n_genes <= 1000 and 75 <= n_samples < 150: raise to 192 for high-dispersion data.

  • n_genes <= 1000 and n_samples < 75: 256 reduces underfitting, but expect lower recovery; the model is data-limited.

  • 1000 < n_genes <= 5000: hidden_dim=256–512; consider n_hidden_layers=3.

  • n_genes > 5000 (25:1–50:1 genes-to-samples): hidden_dim=512–1024; n_hidden_layers=3 recommended; batch_size will be auto-set to min(64, n_samples // 4) when left as None.

Do not derive hidden_dim from a fixed ratio such as latent_dim * 8: stress tests show that recovery is non-monotonic in hidden_dim, especially for high-dispersion or small-n data.
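
The hidden_dim guidance above can be written as a small lookup. This is only a sketch of the documented rules: the helper names `suggest_hidden_dim` and `auto_batch_size` are hypothetical and not part of isograph, and the `high_dispersion` flag is an assumed way of expressing the "raise to 192 for high-dispersion data" rule. Where the guidance gives a range, the helper returns the range rather than picking a value.

```python
def suggest_hidden_dim(
    n_genes: int, n_samples: int, high_dispersion: bool = False
) -> tuple[int, int]:
    """Return an inclusive (low, high) hidden_dim range from the guidance."""
    if n_genes <= 1000:
        if n_samples >= 150:
            return (128, 128)  # the default works well
        if n_samples >= 75:
            # Assumption: keep the default unless the data are high-dispersion.
            return (192, 192) if high_dispersion else (128, 128)
        return (256, 256)  # data-limited regime; expect lower recovery
    if n_genes <= 5000:
        return (256, 512)  # consider n_hidden_layers=3
    return (512, 1024)  # n_hidden_layers=3 recommended


def auto_batch_size(n_samples: int) -> int:
    """Mirror the documented formula used when batch_size is left as None."""
    return min(64, n_samples // 4)


print(suggest_hidden_dim(800, 200))
print(suggest_hidden_dim(3000, 100))
print(auto_batch_size(100))
```

The guidance mentions the batch_size formula only in the n_genes > 5000 case, so whether the same formula applies to smaller panels is not established here.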

Choosing latent_dim / latent_dim_grid:

Set latent_dim_grid to a list (e.g. [2, 4, 6, 8, 12]) to let the model sweep and auto-select the smallest k whose reconstruction RMSE improvement falls below 0.01. This is recommended when you do not know the number of true modules. Leave latent_dim_grid=None to use a fixed latent_dim.
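
One way to read the selection rule above is sketched below. It assumes the sweep yields one reconstruction RMSE per grid value, and it picks the smallest k for which moving to the next larger grid value improves RMSE by less than the threshold. The function name `select_latent_dim`, the input dictionary, and this exact reading of "improvement" are assumptions; the library's internal rule may differ.

```python
def select_latent_dim(
    rmse_by_k: dict[int, float], min_improvement: float = 0.01
) -> int:
    """Pick the smallest k after which RMSE stops improving meaningfully."""
    if not rmse_by_k:
        raise ValueError("rmse_by_k must contain at least one grid value")
    ks = sorted(rmse_by_k)
    for k, next_k in zip(ks, ks[1:]):
        gain = rmse_by_k[k] - rmse_by_k[next_k]  # positive when RMSE drops
        if gain < min_improvement:
            return k
    # Every step still improved by at least the threshold: keep the largest k.
    return ks[-1]


# Illustrative RMSE values only, not measured results.
example_rmse = {2: 0.90, 4: 0.70, 6: 0.62, 8: 0.615, 12: 0.61}
chosen_k = select_latent_dim(example_rmse)
print(chosen_k)
```

With these illustrative values the gains are 0.20, 0.08, and 0.005, so the sweep stops at k = 6.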

Parameters:
  • name (str)

  • latent_dim (int)

  • hidden_dim (int)

  • n_hidden_layers (int)

  • latent_dim_grid (list[int] | None)

  • beta (float)

  • n_epochs (int)

  • lr (float)

  • weight_decay (float)

  • batch_size (int | None)

  • warmup_epochs (int | None)

  • val_fraction (float)

  • patience (int)

  • early_stop_tol (float)

  • collapse_threshold (float)

  • random_state (int)

  • device (str | None)

  • alpha (float)

  • min_module_size (int)

  • residualize_covariates (list[str])

  • trait_columns (list[str])

  • checkpoint_dir (Path | None)

  • allow_abundance_abundance (bool)

  • alpha_switch (float | None)

  • alpha_abundance (float | None)

  • alpha_abundance_grid (list[float] | None)

name: str = 'vae_network'
latent_dim: int = 8
hidden_dim: int = 128
n_hidden_layers: int = 2
latent_dim_grid: list[int] | None = None
beta: float = 1.0
n_epochs: int = 500
lr: float = 0.001
weight_decay: float = 1e-05
batch_size: int | None = None
warmup_epochs: int | None = None
val_fraction: float = 0.2
patience: int = 50
early_stop_tol: float = 0.0001
collapse_threshold: float = 0.01
random_state: int = 0
device: str | None = None
alpha: float = 0.7
min_module_size: int = 2
residualize_covariates: list[str]
trait_columns: list[str]
checkpoint_dir: Path | None = None
allow_abundance_abundance: bool = False
alpha_switch: float | None = None
alpha_abundance: float | None = None
alpha_abundance_grid: list[float] | None = None
class isograph.workflow.config.WgcnaModelConfig(name='wgcna_network', power=None, power_range=<factory>, sft_r2_threshold=0.85, min_module_size=2, merge_cut_height=0.25, deep_split=2, network_type='signed', random_state=0, timeout_seconds=600, trait_columns=<factory>, residualize_covariates=<factory>)
Parameters:
  • name (str)

  • power (int | None)

  • power_range (list[int])

  • sft_r2_threshold (float)

  • min_module_size (int)

  • merge_cut_height (float)

  • deep_split (int)

  • network_type (str)

  • random_state (int)

  • timeout_seconds (int)

  • trait_columns (list[str])

  • residualize_covariates (list[str])

name: str = 'wgcna_network'
power: int | None = None
power_range: list[int]
sft_r2_threshold: float = 0.85
min_module_size: int = 2
merge_cut_height: float = 0.25
deep_split: int = 2
network_type: str = 'signed'
random_state: int = 0
timeout_seconds: int = 600
trait_columns: list[str]
residualize_covariates: list[str]
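
The power, power_range, and sft_r2_threshold fields suggest the conventional WGCNA soft-threshold choice: when power is None, scan the candidate powers and take the smallest one whose scale-free topology fit R² reaches the threshold. The sketch below illustrates that convention only. The function `pick_power`, its input dictionary, and the fallback to the best-fitting power are assumptions, not documented isograph behaviour.

```python
def pick_power(r2_by_power: dict[int, float], sft_r2_threshold: float = 0.85) -> int:
    """Return the smallest power whose scale-free fit R^2 meets the threshold."""
    if not r2_by_power:
        raise ValueError("r2_by_power must contain at least one candidate power")
    for power in sorted(r2_by_power):
        if r2_by_power[power] >= sft_r2_threshold:
            return power
    # Assumed fallback: no power meets the threshold, so take the best fit.
    return max(r2_by_power, key=lambda p: r2_by_power[p])


# Illustrative fit values only, not measured results.
example_fit = {4: 0.52, 6: 0.78, 8: 0.87, 10: 0.91}
chosen_power = pick_power(example_fit)
print(chosen_power)
```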
class isograph.workflow.config.RealDataFilterTerm(kind, column, df=None, standardize=True)
Parameters:
  • kind (str)

  • column (str)

  • df (int | None)

  • standardize (bool)

kind: str
column: str
df: int | None = None
standardize: bool = True
class isograph.workflow.config.RealDataFreezeConfig(counts_root=PosixPath('data/counts'), annotations_root=PosixPath('data/annotations'), phenotype_tsv=PosixPath('data/phenotypes.tsv'), ancestry_tsv=PosixPath('data/ancestry.txt'), output_name='real_caudate_aa_v1', gene_panel_size=256, allowed_diagnoses=<factory>, cache_root=PosixPath('benchmarks/cache/real_data'), filter_min_count=10.0, filter_min_total_count=15.0, filter_large_n=10.0, filter_min_prop=0.7, filter_design_terms=<factory>)
Parameters:
  • counts_root (Path)

  • annotations_root (Path)

  • phenotype_tsv (Path)

  • ancestry_tsv (Path)

  • output_name (str)

  • gene_panel_size (int | None)

  • allowed_diagnoses (list[str])

  • cache_root (Path)

  • filter_min_count (float)

  • filter_min_total_count (float)

  • filter_large_n (float)

  • filter_min_prop (float)

  • filter_design_terms (list[RealDataFilterTerm])

counts_root: Path = PosixPath('data/counts')
annotations_root: Path = PosixPath('data/annotations')
phenotype_tsv: Path = PosixPath('data/phenotypes.tsv')
ancestry_tsv: Path = PosixPath('data/ancestry.txt')
output_name: str = 'real_caudate_aa_v1'
gene_panel_size: int | None = 256
allowed_diagnoses: list[str]
cache_root: Path = PosixPath('benchmarks/cache/real_data')
filter_min_count: float = 10.0
filter_min_total_count: float = 15.0
filter_large_n: float = 10.0
filter_min_prop: float = 0.7
filter_design_terms: list[RealDataFilterTerm]
class isograph.workflow.config.StabilitySelectionConfig(alpha_grid=<factory>, n_iterations=50, subsample_fraction=0.8, stability_threshold=0.6, seed=0)

Config for stability-selection alpha tuning (for real data without ground truth).

Parameters:
  • alpha_grid (list[float])

  • n_iterations (int)

  • subsample_fraction (float)

  • stability_threshold (float)

  • seed (int)

alpha_grid: list[float]
n_iterations: int = 50
subsample_fraction: float = 0.8
stability_threshold: float = 0.6
seed: int = 0
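
The fields above match the usual shape of stability selection: refit on repeated random subsamples and keep what is selected often enough. The sketch below shows that general idea for a single alpha, using only the standard library. The functions `selection_frequencies` and `stable_features` and the toy selector are hypothetical, and how isograph turns these frequencies into a chosen alpha from alpha_grid is not described in this reference.

```python
import random
from typing import Callable, Iterable, Sequence


def selection_frequencies(
    samples: Sequence[int],
    select_features: Callable[[Sequence[int], float], Iterable[str]],
    alpha: float,
    n_iterations: int = 50,
    subsample_fraction: float = 0.8,
    seed: int = 0,
) -> dict[str, float]:
    """Fraction of subsampled fits in which each feature was selected."""
    rng = random.Random(seed)
    n_sub = max(1, int(len(samples) * subsample_fraction))
    counts: dict[str, int] = {}
    for _ in range(n_iterations):
        subset = rng.sample(list(samples), n_sub)  # draw without replacement
        for feature in set(select_features(subset, alpha)):
            counts[feature] = counts.get(feature, 0) + 1
    return {feature: c / n_iterations for feature, c in counts.items()}


def stable_features(
    frequencies: dict[str, float], stability_threshold: float = 0.6
) -> set[str]:
    """Keep features selected in at least the threshold fraction of fits."""
    return {f for f, freq in frequencies.items() if freq >= stability_threshold}


def toy_selector(subset: Sequence[int], alpha: float) -> set[str]:
    """Toy stand-in for a real fit; weights are made up for illustration."""
    weights = {"core": 0.9, "weak": 0.05}
    chosen = {name for name, w in weights.items() if w >= alpha}
    if 0 in subset:  # this feature depends on one sample being present
        chosen.add("fragile")
    return chosen


freqs = selection_frequencies(range(10), toy_selector, alpha=0.1)
stable = stable_features(freqs)
print(sorted(stable))
```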
class isograph.workflow.config.BenchmarkCommandConfig(command='benchmark', dataset_suite='core_v1', stage_name='stage1_baseline', backend='vae', fixture_filter=None, benchmark_root=PosixPath('benchmarks'), artifacts_root=PosixPath('artifacts'), dataset_root=PosixPath('benchmarks/datasets'), report_root=PosixPath('artifacts/reports'), tracking_uri=None, seed=7, real_data=<factory>, model=<factory>, latent=<factory>, graph=<factory>, recovery_thresholds=<factory>, fixture_model_overrides=<factory>, fixture_latent_overrides=<factory>, fixture_graph_overrides=<factory>, vae=<factory>, fixture_vae_overrides=<factory>, wgcna=<factory>, fixture_wgcna_overrides=<factory>, run_stability_selection=False, stability=<factory>)
Parameters:
  • command (str)

  • dataset_suite (str)

  • stage_name (str)

  • backend (str)

  • fixture_filter (str | None)

  • benchmark_root (Path)

  • artifacts_root (Path)

  • dataset_root (Path)

  • report_root (Path)

  • tracking_uri (str | None)

  • seed (int)

  • real_data (RealDataFreezeConfig)

  • model (BaselineModelConfig)

  • latent (LatentModelConfig)

  • graph (GraphModelConfig)

  • recovery_thresholds (dict[str, float])

  • fixture_model_overrides (dict[str, dict])

  • fixture_latent_overrides (dict[str, dict])

  • fixture_graph_overrides (dict[str, dict])

  • vae (VaeModelConfig)

  • fixture_vae_overrides (dict[str, dict])

  • wgcna (WgcnaModelConfig)

  • fixture_wgcna_overrides (dict[str, dict])

  • run_stability_selection (bool)

  • stability (StabilitySelectionConfig)

command: str = 'benchmark'
dataset_suite: str = 'core_v1'
stage_name: str = 'stage1_baseline'
backend: str = 'vae'
fixture_filter: str | None = None
benchmark_root: Path = PosixPath('benchmarks')
artifacts_root: Path = PosixPath('artifacts')
dataset_root: Path = PosixPath('benchmarks/datasets')
report_root: Path = PosixPath('artifacts/reports')
tracking_uri: str | None = None
seed: int = 7
real_data: RealDataFreezeConfig
model: BaselineModelConfig
latent: LatentModelConfig
graph: GraphModelConfig
recovery_thresholds: dict[str, float]
fixture_model_overrides: dict[str, dict]
fixture_latent_overrides: dict[str, dict]
fixture_graph_overrides: dict[str, dict]
vae: VaeModelConfig
fixture_vae_overrides: dict[str, dict]
wgcna: WgcnaModelConfig
fixture_wgcna_overrides: dict[str, dict]
run_stability_selection: bool = False
stability: StabilitySelectionConfig
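
The fixture_*_overrides fields are typed as dict[str, dict], which suggests a mapping from fixture name to field overrides for the matching backend config. The sketch below shows one plausible way such overrides could be layered onto a base config with `dataclasses.replace`. The stand-in class `VaeSketch`, the helper `resolve_for_fixture`, and the fixture names are hypothetical, and the actual merge logic in isograph is not documented here.

```python
from dataclasses import dataclass, replace


@dataclass
class VaeSketch:
    """Hypothetical stand-in with a few VaeModelConfig fields."""

    latent_dim: int = 8
    hidden_dim: int = 128
    n_hidden_layers: int = 2


def resolve_for_fixture(
    base: VaeSketch, overrides: dict[str, dict], fixture_name: str
) -> VaeSketch:
    """Return a copy of base with any overrides for this fixture applied."""
    return replace(base, **overrides.get(fixture_name, {}))


base = VaeSketch()
# Assumed shape: fixture name -> {field name: value}.
fixture_vae_overrides = {"small_n_fixture": {"hidden_dim": 256}}

small = resolve_for_fixture(base, fixture_vae_overrides, "small_n_fixture")
other = resolve_for_fixture(base, fixture_vae_overrides, "some_other_fixture")
print(small)
print(other)
```

Because `replace` returns a new object, the base config stays unchanged and can be reused across fixtures.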
class isograph.workflow.config.FitCommandConfig(command='fit', benchmark_root=PosixPath('benchmarks'), artifacts_root=PosixPath('artifacts'), dataset_path=None, output_dir=PosixPath('artifacts/fits/manual'), backend='vae', tracking_uri=None, seed=7, model=<factory>, latent=<factory>, graph=<factory>, vae=<factory>, wgcna=<factory>)
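Parameters:
  • command (str)

  • benchmark_root (Path)

  • artifacts_root (Path)

  • dataset_path (Path | None)

  • output_dir (Path)

  • backend (str)

  • tracking_uri (str | None)

  • seed (int)

  • model (BaselineModelConfig)

  • latent (LatentModelConfig)

  • graph (GraphModelConfig)

  • vae (VaeModelConfig)

  • wgcna (WgcnaModelConfig)
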
command: str = 'fit'
benchmark_root: Path = PosixPath('benchmarks')
artifacts_root: Path = PosixPath('artifacts')
dataset_path: Path | None = None
output_dir: Path = PosixPath('artifacts/fits/manual')
backend: str = 'vae'
tracking_uri: str | None = None
seed: int = 7
model: BaselineModelConfig
latent: LatentModelConfig
graph: GraphModelConfig
vae: VaeModelConfig
wgcna: WgcnaModelConfig
class isograph.workflow.config.CompareCommandConfig(command='compare', reference=None, candidate=None, output_path=PosixPath('artifacts/reports/comparison.json'))
Parameters:
  • command (str)

  • reference (Path | None)

  • candidate (Path | None)

  • output_path (Path)

command: str = 'compare'
reference: Path | None = None
candidate: Path | None = None
output_path: Path = PosixPath('artifacts/reports/comparison.json')
class isograph.workflow.config.ExplainCommandConfig(command='explain-module', artifact_dir=PosixPath('artifacts/fits/manual'), feature_table_path=None, feature_meta_path=None, module_ids=None, output_dir=PosixPath('artifacts/explain'), module_score_table_path=None, split_percentile=50.0, min_complete_pairs=3, fdr_method='bh', transcript_usage_feature_type='transcript_usage')
Parameters:
  • command (str)

  • artifact_dir (Path)

  • feature_table_path (Path | None)

  • feature_meta_path (Path | None)

  • module_ids (list[str] | None)

  • output_dir (Path)

  • module_score_table_path (Path | None)

  • split_percentile (float)

  • min_complete_pairs (int)

  • fdr_method (str)

  • transcript_usage_feature_type (str)

command: str = 'explain-module'
artifact_dir: Path = PosixPath('artifacts/fits/manual')
feature_table_path: Path | None = None
feature_meta_path: Path | None = None
module_ids: list[str] | None = None
output_dir: Path = PosixPath('artifacts/explain')
module_score_table_path: Path | None = None
split_percentile: float = 50.0
min_complete_pairs: int = 3
fdr_method: str = 'bh'
transcript_usage_feature_type: str = 'transcript_usage'