isograph.workflow.config
Typed workflow configuration.
- class isograph.workflow.config.LatentModelConfig(name='latent_network', n_components=10, n_components_grid=<factory>, n_components_cv_folds=5, alpha=0.1, min_module_size=2, max_iter=1000, tol=0.0001, residualize_covariates=<factory>, trait_columns=<factory>, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)
- Parameters:
name (str)
n_components (int)
n_components_grid (list[int] | None)
n_components_cv_folds (int)
alpha (float)
min_module_size (int)
max_iter (int)
tol (float)
residualize_covariates (list[str])
trait_columns (list[str])
allow_abundance_abundance (bool)
alpha_switch (float | None)
alpha_abundance (float | None)
alpha_abundance_grid (list[float] | None)
-
name:
str= 'latent_network'
-
n_components:
int= 10
-
n_components_grid:
list[int] |None
-
n_components_cv_folds:
int= 5
-
alpha:
float= 0.1
-
min_module_size:
int= 2
-
max_iter:
int= 1000
-
tol:
float= 0.0001
-
residualize_covariates:
list[str]
-
trait_columns:
list[str]
-
allow_abundance_abundance:
bool= False
-
alpha_switch:
float|None= None
-
alpha_abundance:
float|None= None
-
alpha_abundance_grid:
list[float] |None= None
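The `<factory>` placeholders in the signatures on this page are how Python renders a dataclass field declared with `default_factory`. That suggests these config classes are dataclasses, though this page does not show their source, so treat it as an assumption. The sketch below uses a hypothetical stand-in, `LatentConfigSketch`, rather than the real `LatentModelConfig`. It mirrors only a few fields, and the empty-list defaults are placeholders because the actual factory values are not documented here.

```python
from dataclasses import dataclass, field, replace


@dataclass
class LatentConfigSketch:
    """Hypothetical stand-in for LatentModelConfig; only a few fields are mirrored."""

    name: str = "latent_network"
    n_components: int = 10
    alpha: float = 0.1
    # Mutable defaults need default_factory; this is what renders as <factory>.
    # The real factory values are not documented; empty lists are placeholders.
    n_components_grid: list[int] = field(default_factory=list)
    trait_columns: list[str] = field(default_factory=list)


base = LatentConfigSketch()
# replace() returns a copy with the given fields overridden.
tuned = replace(base, alpha=0.05, trait_columns=["age"])

# Each instance gets its own list, so mutating one does not leak into another.
base.trait_columns.append("sex")
other = LatentConfigSketch()
```

If the real classes are dataclasses, the same `replace` pattern would apply to them unchanged.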
- class isograph.workflow.config.BaselineModelConfig(name='baseline_network', alpha=0.12, min_module_size=3, max_modules=64, residualize_covariates=<factory>, trait_columns=<factory>, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)
- Parameters:
name (str)
alpha (float)
min_module_size (int)
max_modules (int)
residualize_covariates (list[str])
trait_columns (list[str])
allow_abundance_abundance (bool)
alpha_switch (float | None)
alpha_abundance (float | None)
alpha_abundance_grid (list[float] | None)
-
name:
str= 'baseline_network'
-
alpha:
float= 0.12
-
min_module_size:
int= 3
-
max_modules:
int= 64
-
residualize_covariates:
list[str]
-
trait_columns:
list[str]
-
allow_abundance_abundance:
bool= False
-
alpha_switch:
float|None= None
-
alpha_abundance:
float|None= None
-
alpha_abundance_grid:
list[float] |None= None
- class isograph.workflow.config.GraphModelConfig(name='graph_network', n_components=10, n_components_grid=<factory>, n_components_cv_folds=5, alpha=0.1, min_module_size=2, max_iter=1000, tol=0.0001, residualize_covariates=<factory>, trait_columns=<factory>, gamma=0.5, edge_types=<factory>, corr_threshold=0.3, normalized_laplacian=True, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)
- Parameters:
name (str)
n_components (int)
n_components_grid (list[int] | None)
n_components_cv_folds (int)
alpha (float)
min_module_size (int)
max_iter (int)
tol (float)
residualize_covariates (list[str])
trait_columns (list[str])
gamma (float)
edge_types (list[str])
corr_threshold (float)
normalized_laplacian (bool)
allow_abundance_abundance (bool)
alpha_switch (float | None)
alpha_abundance (float | None)
alpha_abundance_grid (list[float] | None)
-
name:
str= 'graph_network'
-
n_components:
int= 10
-
n_components_grid:
list[int] |None
-
n_components_cv_folds:
int= 5
-
alpha:
float= 0.1
-
min_module_size:
int= 2
-
max_iter:
int= 1000
-
tol:
float= 0.0001
-
residualize_covariates:
list[str]
-
trait_columns:
list[str]
-
gamma:
float= 0.5
-
edge_types:
list[str]
-
corr_threshold:
float= 0.3
-
normalized_laplacian:
bool= True
-
allow_abundance_abundance:
bool= False
-
alpha_switch:
float|None= None
-
alpha_abundance:
float|None= None
-
alpha_abundance_grid:
list[float] |None= None
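This page does not describe how `corr_threshold` and `normalized_laplacian` are used. The sketch below assumes a common reading: feature pairs with absolute correlation at or above `corr_threshold` become graph edges, and `normalized_laplacian=True` selects the symmetric normalized Laplacian I - D^(-1/2) A D^(-1/2) instead of D - A. The function names are hypothetical, and `gamma` and `edge_types` are not modeled.

```python
import math


def threshold_adjacency(corr, corr_threshold=0.3):
    """Keep |r| >= corr_threshold as unweighted edges; the diagonal stays zero."""
    n = len(corr)
    return [
        [1.0 if i != j and abs(corr[i][j]) >= corr_threshold else 0.0 for j in range(n)]
        for i in range(n)
    ]


def laplacian(adj, normalized=True):
    """Return D - A, or I - D^(-1/2) A D^(-1/2) when normalized is True."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if not normalized:
                lap[i][j] = (deg[i] if i == j else 0.0) - adj[i][j]
            elif i == j:
                # Isolated nodes get a zero diagonal by convention.
                lap[i][j] = 1.0 if deg[i] > 0 else 0.0
            elif adj[i][j] and deg[i] > 0 and deg[j] > 0:
                lap[i][j] = -adj[i][j] / math.sqrt(deg[i] * deg[j])
    return lap


# Illustrative 3x3 correlation matrix (made-up values).
corr = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.4], [0.1, 0.4, 1.0]]
adj = threshold_adjacency(corr, corr_threshold=0.3)
lap = laplacian(adj, normalized=True)
```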
- class isograph.workflow.config.VaeModelConfig(name='vae_network', latent_dim=8, hidden_dim=128, n_hidden_layers=2, latent_dim_grid=None, beta=1.0, n_epochs=500, lr=0.001, weight_decay=1e-05, batch_size=None, warmup_epochs=None, val_fraction=0.2, patience=50, early_stop_tol=0.0001, collapse_threshold=0.01, random_state=0, device=None, alpha=0.7, min_module_size=2, residualize_covariates=<factory>, trait_columns=<factory>, checkpoint_dir=None, allow_abundance_abundance=False, alpha_switch=None, alpha_abundance=None, alpha_abundance_grid=None)
Configuration for the VAE network backend.
Choosing hidden_dim (function of BOTH n_genes AND n_samples):
n_genes <= 1000 and n_samples >= 150: 128 (default) works well.
n_genes <= 1000 and 75 <= n_samples < 150: raise to 192 for high-dispersion data.
n_genes <= 1000 and n_samples < 75: 256 reduces underfitting, but expect lower recovery; the model is data-limited.
1000 < n_genes <= 5000: hidden_dim=256–512; consider n_hidden_layers=3.
n_genes > 5000 (25:1–50:1 genes-to-samples): hidden_dim=512–1024; n_hidden_layers=3 recommended; batch_size will be auto-set to min(64, n_samples // 4) when left as None.
Do not use a ratio like latent_dim * 8: stress tests show non-monotonic recovery across hidden_dim values, especially for high-dispersion or small-n data.
Choosing latent_dim / latent_dim_grid:
Set latent_dim_grid to a list (e.g. [2, 4, 6, 8, 12]) to let the model sweep and auto-select the smallest k whose reconstruction RMSE improvement falls below 0.01. This is recommended when you do not know the number of true modules. Leave latent_dim_grid=None to use a fixed latent_dim.
- Parameters:
name (str)
latent_dim (int)
hidden_dim (int)
n_hidden_layers (int)
latent_dim_grid (list[int] | None)
beta (float)
n_epochs (int)
lr (float)
weight_decay (float)
batch_size (int | None)
warmup_epochs (int | None)
val_fraction (float)
patience (int)
early_stop_tol (float)
collapse_threshold (float)
random_state (int)
device (str | None)
alpha (float)
min_module_size (int)
residualize_covariates (list[str])
trait_columns (list[str])
checkpoint_dir (Path | None)
allow_abundance_abundance (bool)
alpha_switch (float | None)
alpha_abundance (float | None)
alpha_abundance_grid (list[float] | None)
-
name:
str= 'vae_network'
-
latent_dim:
int= 8
-
latent_dim_grid:
list[int] |None= None
-
beta:
float= 1.0
-
n_epochs:
int= 500
-
lr:
float= 0.001
-
weight_decay:
float= 1e-05
-
batch_size:
int|None= None
-
warmup_epochs:
int|None= None
-
val_fraction:
float= 0.2
-
patience:
int= 50
-
early_stop_tol:
float= 0.0001
-
collapse_threshold:
float= 0.01
-
random_state:
int= 0
-
device:
str|None= None
-
alpha:
float= 0.7
-
min_module_size:
int= 2
-
residualize_covariates:
list[str]
-
trait_columns:
list[str]
-
checkpoint_dir:
Path|None= None
-
allow_abundance_abundance:
bool= False
-
alpha_switch:
float|None= None
-
alpha_abundance:
float|None= None
-
alpha_abundance_grid:
list[float] |None= None
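The `hidden_dim` and `latent_dim_grid` guidance in the `VaeModelConfig` docstring can be written down as small helpers. This is a sketch, not part of the library: all three function names are hypothetical. Where the docstring gives a range (256–512, 512–1024), the helper returns the low end as a starting point, which is an assumption. `pick_latent_dim` encodes one reading of "smallest k whose reconstruction RMSE improvement falls below 0.01", namely the gain from moving to the next grid value. The library's actual rule may differ.

```python
def suggest_hidden_dim(n_genes: int, n_samples: int) -> int:
    """Map the docstring's hidden_dim guidance to a single starting value."""
    if n_genes <= 1000:
        if n_samples >= 150:
            return 128
        if n_samples >= 75:
            return 192  # the docstring suggests this for high-dispersion data
        return 256
    if n_genes <= 5000:
        return 256  # low end of the 256-512 range (assumption)
    return 512  # low end of the 512-1024 range (assumption)


def auto_batch_size(n_samples: int) -> int:
    """Batch size the docstring reports for n_genes > 5000 when batch_size is None."""
    return min(64, n_samples // 4)


def pick_latent_dim(rmse_by_k: dict[int, float], min_improvement: float = 0.01) -> int:
    """Smallest k whose RMSE gain from the next grid value is below min_improvement."""
    ks = sorted(rmse_by_k)
    for k, k_next in zip(ks, ks[1:]):
        if rmse_by_k[k] - rmse_by_k[k_next] < min_improvement:
            return k
    return ks[-1]  # every step still improved; fall back to the largest k


# Made-up validation RMSE values, for illustration only.
chosen = pick_latent_dim({2: 0.50, 4: 0.40, 6: 0.395, 8: 0.394})
```

Here the step from k=4 to k=6 improves RMSE by only 0.005, so the sketch stops at k=4.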
- class isograph.workflow.config.WgcnaModelConfig(name='wgcna_network', power=None, power_range=<factory>, sft_r2_threshold=0.85, min_module_size=2, merge_cut_height=0.25, deep_split=2, network_type='signed', random_state=0, timeout_seconds=600, trait_columns=<factory>, residualize_covariates=<factory>)
- Parameters:
name (str)
power (int | None)
power_range (list[int])
sft_r2_threshold (float)
min_module_size (int)
merge_cut_height (float)
deep_split (int)
network_type (str)
random_state (int)
timeout_seconds (int)
trait_columns (list[str])
residualize_covariates (list[str])
-
name:
str= 'wgcna_network'
-
power:
int|None= None
-
power_range:
list[int]
-
sft_r2_threshold:
float= 0.85
-
min_module_size:
int= 2
-
merge_cut_height:
float= 0.25
-
deep_split:
int= 2
-
network_type:
str= 'signed'
-
random_state:
int= 0
-
timeout_seconds:
int= 600
-
trait_columns:
list[str]
-
residualize_covariates:
list[str]
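In WGCNA, the soft-thresholding power is conventionally the lowest candidate whose scale-free topology fit R² reaches a cutoff. Assuming `power=None` triggers such a scan over `power_range` against `sft_r2_threshold` (this page does not confirm it), the selection step could look like the sketch below. `pick_soft_power` is a hypothetical name and the R² values are made up.

```python
from typing import Optional


def pick_soft_power(r2_by_power: dict[int, float],
                    sft_r2_threshold: float = 0.85) -> Optional[int]:
    """Lowest power whose scale-free fit R^2 meets the threshold, else None."""
    for power in sorted(r2_by_power):
        if r2_by_power[power] >= sft_r2_threshold:
            return power
    return None


# Illustrative R^2 values per candidate power (not real results).
fits = {4: 0.60, 6: 0.82, 8: 0.88, 10: 0.91}
power = pick_soft_power(fits)
```

What the library does when no candidate reaches the threshold is not documented here. The sketch returns `None` in that case.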
- class isograph.workflow.config.RealDataFilterTerm(kind, column, df=None, standardize=True)
- Parameters:
kind (str)
column (str)
df (int | None)
standardize (bool)
-
kind:
str
-
column:
str
-
df:
int|None= None
-
standardize:
bool= True
- class isograph.workflow.config.RealDataFreezeConfig(counts_root=PosixPath('data/counts'), annotations_root=PosixPath('data/annotations'), phenotype_tsv=PosixPath('data/phenotypes.tsv'), ancestry_tsv=PosixPath('data/ancestry.txt'), output_name='real_caudate_aa_v1', gene_panel_size=256, allowed_diagnoses=<factory>, cache_root=PosixPath('benchmarks/cache/real_data'), filter_min_count=10.0, filter_min_total_count=15.0, filter_large_n=10.0, filter_min_prop=0.7, filter_design_terms=<factory>)
- Parameters:
counts_root (Path)
annotations_root (Path)
phenotype_tsv (Path)
ancestry_tsv (Path)
output_name (str)
gene_panel_size (int | None)
allowed_diagnoses (list[str])
cache_root (Path)
filter_min_count (float)
filter_min_total_count (float)
filter_large_n (float)
filter_min_prop (float)
filter_design_terms (list[RealDataFilterTerm])
-
counts_root:
Path= PosixPath('data/counts')
-
annotations_root:
Path= PosixPath('data/annotations')
-
phenotype_tsv:
Path= PosixPath('data/phenotypes.tsv')
-
ancestry_tsv:
Path= PosixPath('data/ancestry.txt')
-
output_name:
str= 'real_caudate_aa_v1'
-
gene_panel_size:
int|None= 256
-
allowed_diagnoses:
list[str]
-
cache_root:
Path= PosixPath('benchmarks/cache/real_data')
-
filter_min_count:
float= 10.0
-
filter_min_total_count:
float= 15.0
-
filter_large_n:
float= 10.0
-
filter_min_prop:
float= 0.7
-
filter_design_terms:
list[RealDataFilterTerm]
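The four `filter_*` defaults (10, 15, 10, 0.7) coincide with the defaults of edgeR's `filterByExpr` (`min.count`, `min.total.count`, `large.n`, `min.prop`). Assuming the freeze step applies that style of low-expression filter, a group-based version of the rule can be sketched as follows. `keep_genes` is a hypothetical name and the counts are made up. The sketch does not model `filter_design_terms`, and it assumes every sample has a nonzero library size.

```python
from statistics import median


def keep_genes(counts, group_sizes, min_count=10.0, min_total_count=15.0,
               large_n=10.0, min_prop=0.7):
    """counts is genes x samples (list of rows). Returns one bool per gene."""
    lib_sizes = [sum(col) for col in zip(*counts)]
    # Smallest non-empty group sets how many samples must express the gene.
    min_n = float(min(size for size in group_sizes if size > 0))
    if min_n > large_n:
        min_n = large_n + (min_n - large_n) * min_prop
    # Convert the count cutoff to counts-per-million at the median library size.
    cpm_cutoff = min_count / median(lib_sizes) * 1e6
    keep = []
    for row in counts:
        cpm = [c / lib * 1e6 for c, lib in zip(row, lib_sizes)]
        enough_samples = sum(v >= cpm_cutoff for v in cpm) >= min_n - 1e-14
        enough_total = sum(row) >= min_total_count - 1e-14
        keep.append(enough_samples and enough_total)
    return keep


# Three genes, four samples, two groups of two (illustrative counts).
counts = [[100, 120, 90, 110], [0, 1, 0, 2], [50, 0, 60, 0]]
mask = keep_genes(counts, group_sizes=[2, 2])
```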
- class isograph.workflow.config.StabilitySelectionConfig(alpha_grid=<factory>, n_iterations=50, subsample_fraction=0.8, stability_threshold=0.6, seed=0)
Config for stability-selection alpha tuning (for real data without ground truth).
- Parameters:
alpha_grid (list[float])
n_iterations (int)
subsample_fraction (float)
stability_threshold (float)
seed (int)
-
alpha_grid:
list[float]
-
n_iterations:
int= 50
-
subsample_fraction:
float= 0.8
-
stability_threshold:
float= 0.6
-
seed:
int= 0
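Stability selection refits a model on repeated subsamples and keeps what is selected consistently. The sketch below shows that core loop for a single alpha, using the field names of `StabilitySelectionConfig` as arguments. `stable_items` and `toy_selector` are hypothetical. In practice the selector would fit a network on the subsample. How the library turns these frequencies into a chosen alpha from `alpha_grid` is not documented on this page.

```python
import random
from collections import Counter


def stable_items(n_samples, select_fn, n_iterations=50, subsample_fraction=0.8,
                 stability_threshold=0.6, seed=0):
    """Return {item: frequency} for items selected often enough across subsamples."""
    rng = random.Random(seed)
    size = max(1, int(n_samples * subsample_fraction))
    hits = Counter()
    for _ in range(n_iterations):
        subsample = rng.sample(range(n_samples), size)  # draw without replacement
        hits.update(set(select_fn(subsample)))
    return {item: n / n_iterations for item, n in hits.items()
            if n / n_iterations >= stability_threshold}


calls = {"n": 0}


def toy_selector(subsample):
    """Toy stand-in for a model fit: ignores the data and follows a fixed pattern."""
    calls["n"] += 1
    picked = {"edge_a"}
    picked.add("edge_rare" if calls["n"] % 5 == 0 else "edge_often")
    return picked


stable = stable_items(n_samples=40, select_fn=toy_selector)
```

With the toy selector, `edge_rare` appears in 10 of 50 iterations and is dropped, while `edge_a` and `edge_often` pass the 0.6 threshold.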
- class isograph.workflow.config.BenchmarkCommandConfig(command='benchmark', dataset_suite='core_v1', stage_name='stage1_baseline', backend='vae', fixture_filter=None, benchmark_root=PosixPath('benchmarks'), artifacts_root=PosixPath('artifacts'), dataset_root=PosixPath('benchmarks/datasets'), report_root=PosixPath('artifacts/reports'), tracking_uri=None, seed=7, real_data=<factory>, model=<factory>, latent=<factory>, graph=<factory>, recovery_thresholds=<factory>, fixture_model_overrides=<factory>, fixture_latent_overrides=<factory>, fixture_graph_overrides=<factory>, vae=<factory>, fixture_vae_overrides=<factory>, wgcna=<factory>, fixture_wgcna_overrides=<factory>, run_stability_selection=False, stability=<factory>)
- Parameters:
command (str)
dataset_suite (str)
stage_name (str)
backend (str)
fixture_filter (str | None)
benchmark_root (Path)
artifacts_root (Path)
dataset_root (Path)
report_root (Path)
tracking_uri (str | None)
seed (int)
real_data (RealDataFreezeConfig)
model (BaselineModelConfig)
latent (LatentModelConfig)
graph (GraphModelConfig)
recovery_thresholds (dict[str, float])
fixture_model_overrides (dict[str, dict])
fixture_latent_overrides (dict[str, dict])
fixture_graph_overrides (dict[str, dict])
vae (VaeModelConfig)
fixture_vae_overrides (dict[str, dict])
wgcna (WgcnaModelConfig)
fixture_wgcna_overrides (dict[str, dict])
run_stability_selection (bool)
stability (StabilitySelectionConfig)
-
command:
str= 'benchmark'
-
dataset_suite:
str= 'core_v1'
-
stage_name:
str= 'stage1_baseline'
-
backend:
str= 'vae'
-
fixture_filter:
str|None= None
-
benchmark_root:
Path= PosixPath('benchmarks')
-
artifacts_root:
Path= PosixPath('artifacts')
-
dataset_root:
Path= PosixPath('benchmarks/datasets')
-
report_root:
Path= PosixPath('artifacts/reports')
-
tracking_uri:
str|None= None
-
seed:
int= 7
-
real_data:
RealDataFreezeConfig
-
model:
BaselineModelConfig
-
latent:
LatentModelConfig
-
graph:
GraphModelConfig
-
recovery_thresholds:
dict[str,float]
-
fixture_model_overrides:
dict[str,dict]
-
fixture_latent_overrides:
dict[str,dict]
-
fixture_graph_overrides:
dict[str,dict]
-
vae:
VaeModelConfig
-
fixture_vae_overrides:
dict[str,dict]
-
wgcna:
WgcnaModelConfig
-
fixture_wgcna_overrides:
dict[str,dict]
-
run_stability_selection:
bool= False
-
stability:
StabilitySelectionConfig
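The `fixture_*_overrides` fields are typed `dict[str, dict]`. One plausible reading, not confirmed by this page, is that each key is a fixture name and each value maps field names to replacement values for the matching model config. The sketch below shows that merge with a hypothetical stand-in class `VaeSketch` and a hypothetical fixture name `large_panel`.

```python
from dataclasses import dataclass, replace


@dataclass
class VaeSketch:
    """Hypothetical stand-in for VaeModelConfig with three of its fields."""

    latent_dim: int = 8
    hidden_dim: int = 128
    n_hidden_layers: int = 2


def config_for_fixture(base, overrides, fixture_name):
    """Return a copy of base with the fixture's overrides applied, if any."""
    return replace(base, **overrides.get(fixture_name, {}))


fixture_vae_overrides = {"large_panel": {"hidden_dim": 512, "n_hidden_layers": 3}}
cfg = config_for_fixture(VaeSketch(), fixture_vae_overrides, "large_panel")
```

Fixtures without an entry fall back to the base config, and an unknown field name raises `TypeError` from `replace`, which catches typos early.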
- class isograph.workflow.config.FitCommandConfig(command='fit', benchmark_root=PosixPath('benchmarks'), artifacts_root=PosixPath('artifacts'), dataset_path=None, output_dir=PosixPath('artifacts/fits/manual'), backend='vae', tracking_uri=None, seed=7, model=<factory>, latent=<factory>, graph=<factory>, vae=<factory>, wgcna=<factory>)
- Parameters:
command (str)
benchmark_root (Path)
artifacts_root (Path)
dataset_path (Path | None)
output_dir (Path)
backend (str)
tracking_uri (str | None)
seed (int)
model (BaselineModelConfig)
latent (LatentModelConfig)
graph (GraphModelConfig)
vae (VaeModelConfig)
wgcna (WgcnaModelConfig)
-
command:
str= 'fit'
-
benchmark_root:
Path= PosixPath('benchmarks')
-
artifacts_root:
Path= PosixPath('artifacts')
-
dataset_path:
Path|None= None
-
output_dir:
Path= PosixPath('artifacts/fits/manual')
-
backend:
str= 'vae'
-
tracking_uri:
str|None= None
-
seed:
int= 7
-
model:
BaselineModelConfig
-
latent:
LatentModelConfig
-
graph:
GraphModelConfig
-
vae:
VaeModelConfig
-
wgcna:
WgcnaModelConfig
- class isograph.workflow.config.CompareCommandConfig(command='compare', reference=None, candidate=None, output_path=PosixPath('artifacts/reports/comparison.json'))
- Parameters:
command (str)
reference (Path | None)
candidate (Path | None)
output_path (Path)
-
command:
str= 'compare'
-
reference:
Path|None= None
-
candidate:
Path|None= None
-
output_path:
Path= PosixPath('artifacts/reports/comparison.json')
- class isograph.workflow.config.ExplainCommandConfig(command='explain-module', artifact_dir=PosixPath('artifacts/fits/manual'), feature_table_path=None, feature_meta_path=None, module_ids=None, output_dir=PosixPath('artifacts/explain'), module_score_table_path=None, split_percentile=50.0, min_complete_pairs=3, fdr_method='bh', transcript_usage_feature_type='transcript_usage')
- Parameters:
command (str)
artifact_dir (Path)
feature_table_path (Path | None)
feature_meta_path (Path | None)
module_ids (list[str] | None)
output_dir (Path)
module_score_table_path (Path | None)
split_percentile (float)
min_complete_pairs (int)
fdr_method (str)
transcript_usage_feature_type (str)
-
command:
str= 'explain-module'
-
artifact_dir:
Path= PosixPath('artifacts/fits/manual')
-
feature_table_path:
Path|None= None
-
feature_meta_path:
Path|None= None
-
module_ids:
list[str] |None= None
-
output_dir:
Path= PosixPath('artifacts/explain')
-
module_score_table_path:
Path|None= None
-
split_percentile:
float= 50.0
-
min_complete_pairs:
int= 3
-
fdr_method:
str= 'bh'
-
transcript_usage_feature_type:
str= 'transcript_usage'
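The default `fdr_method='bh'` presumably refers to the Benjamini-Hochberg procedure; this page does not say so, so treat it as an assumption. For reference, a minimal standard-library sketch of BH-adjusted p-values follows. `benjamini_hochberg` is a hypothetical helper, not the library's function.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative p-values (made up).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```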