CLI Reference
IsoGraph exposes a single CLI entry point:
isograph --help
The current subcommands are benchmark, freeze-real, fit, compare, export,
explain-module, and annotate-structure.
Overrides
benchmark, freeze-real, and fit accept Hydra-style overrides after --:
isograph benchmark -- backend=latent fixture_filter=medium_v1 stage_name=stage2_docs
isograph fit --dataset-path my_cohort --backend vae -- vae.alpha=0.6 vae.hidden_dim=256
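To illustrate the convention, the sketch below shows one way an argument list could be split at the bare -- and the trailing key=value items expanded into a nested config. This is an illustration only: split_overrides and parse_overrides are hypothetical helpers, not IsoGraph or Hydra functions, and real Hydra parsing also handles typed values, lists, and other syntax that this sketch ignores.

```python
def split_overrides(argv):
    """Split argv at the first bare '--' into (cli_args, overrides)."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return list(argv), []


def parse_overrides(overrides):
    """Expand ['vae.alpha=0.6', ...] into a nested dict of string values."""
    config = {}
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        node = config
        *parents, leaf = key.split(".")
        for part in parents:
            # Create intermediate sections such as 'vae' on demand.
            node = node.setdefault(part, {})
        node[leaf] = value  # values stay strings in this sketch
    return config


argv = ["fit", "--dataset-path", "my_cohort", "--backend", "vae",
        "--", "vae.alpha=0.6", "vae.hidden_dim=256"]
cli_args, overrides = split_overrides(argv)
config = parse_overrides(overrides)
print(cli_args)
print(config)
```

Everything before -- is treated as an ordinary flag, and everything after it is treated as a config override, which is why the two kinds of argument cannot be interleaved.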
benchmark
Run the bundled fixture suite, or a filtered subset, through a selected backend.
The default backend is vae. Available backends: baseline, latent, graph, vae,
wgcna.
Examples:
# VAE on a single fixture (default backend)
isograph benchmark -- fixture_filter=toy_v1 stage_name=vae_toy
# WGCNA on the scale suite
isograph benchmark --config-name stage6_scale_comparison_wgcna
Behavior:
- For dataset_suite: core_v1, generates the synthetic core_v1 fixtures as needed and freezes the real fixture unless fixture_filter targets only synthetic datasets.
- For dataset_suite: multiplex_v1, generates abundance-aware toy, medium, noisy, and large fixtures with explicit truth_switch, truth_abundance, and truth_channel_role tables.
- For dataset_suite: scale_v1, generates xlarge_v1 (6 000 genes), xxlarge_v1 (12 000 genes), and xxlarge_stress_v1 (12 000 genes, stressed parameters).
- Writes per-fixture artifacts under artifacts/benchmarks/<stage_name>/.
- Writes benchmark and runtime summaries under artifacts/reports/.
- Writes calibration reports when the selected backend emits calibration metadata.
freeze-real
Freeze the bundled real-data fixture from local source tables.
Example:
isograph freeze-real --suite-name core_v1
The command reads BenchmarkCommandConfig.real_data through the benchmark config and
caches intermediate selections under benchmarks/cache/real_data/.
fit
Fit any backend on a prepared dataset bundle. VAE is the default.
# VAE (default)
isograph fit \
--dataset-path benchmarks/datasets/custom/my_cohort_v1 \
--output-dir artifacts/fits/vae_default
# Baseline
isograph fit \
--dataset-path benchmarks/datasets/core_v1/toy_v1 \
--backend baseline \
--output-dir artifacts/fits/toy_v1
# VAE with Hydra overrides
isograph fit \
--dataset-path benchmarks/datasets/custom/my_cohort_v1 \
--backend vae \
--output-dir artifacts/fits/vae_tuned \
-- vae.alpha=0.6 vae.hidden_dim=256 vae.n_epochs=400
Available backends: baseline, latent, graph, vae, wgcna.
Outputs:
- modules.parquet
- edges.parquet
- traits.parquet
- feature_scores.parquet
- calibration.json (when the backend emits calibration metadata: VAE, latent)
- fit_config.json
Default config values for all backends live in configs/fit.yaml and can be
overridden with Hydra syntax after --.
explain-module
Explain one or more fitted modules at transcript-feature resolution.
isograph explain-module \
--artifact-dir artifacts/fits/my_dataset \
--feature-table features.parquet \
--feature-meta feature_metadata.parquet \
--module-ids M000 M001 \
--output-dir artifacts/explain/run1
With plots and VAE decoder attribution:
isograph explain-module \
--artifact-dir artifacts/fits/my_dataset \
--feature-table features.parquet \
--feature-meta feature_metadata.parquet \
--plot --output-format png pdf \
--vae-attribution \
--output-dir artifacts/explain/run1
With Captum Integrated Gradients (requires pip install isograph[torch-explain]):
isograph explain-module \
--artifact-dir artifacts/fits/my_dataset \
--feature-table features.parquet \
--feature-meta feature_metadata.parquet \
--integrated-gradients --ig-n-steps 100 \
--output-dir artifacts/explain/run1
With a structural annotation table (from annotate-structure):
isograph explain-module \
--artifact-dir artifacts/fits/my_dataset \
--feature-table features.parquet \
--feature-meta feature_metadata.parquet \
--annotation-table transcript_structure_annotations.tsv \
--output-dir artifacts/explain/run1
Inputs:
- --artifact-dir must contain modules.parquet and feature_scores.parquet.
- --feature-table: Parquet with sample IDs as index (or a sample_id column), feature IDs as columns.
- --feature-meta: Parquet or TSV with columns feature_id, gene_id, feature_type (required); gene_name, transcript_id, exon_id, event_id (optional).
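Before running the command, it can help to confirm that a TSV feature-metadata file carries the required columns. The sketch below does this with the standard library only. check_feature_meta is a hypothetical helper, not part of IsoGraph, and the sample row (including the feature_type value) is invented for illustration.

```python
import csv
import io

REQUIRED = ("feature_id", "gene_id", "feature_type")
OPTIONAL = ("gene_name", "transcript_id", "exon_id", "event_id")


def check_feature_meta(handle):
    """Return (missing_required, present_optional) for a TSV file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    columns = reader.fieldnames or []
    missing = [c for c in REQUIRED if c not in columns]
    present = [c for c in OPTIONAL if c in columns]
    return missing, present


# Toy metadata, invented for illustration only.
example = io.StringIO(
    "feature_id\tgene_id\tfeature_type\ttranscript_id\n"
    "F0001\tG0001\ttranscript\tT0001\n"
)
missing, present = check_feature_meta(example)
print("missing required columns:", missing)
print("optional columns present:", present)
```

For a file on disk, the same function can be called with open(path, newline="") in place of the in-memory buffer.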
Outputs per module in {output_dir}/{module_id}/:
- gene_driver_table.parquet: gene-level drivers sorted by |r|
- transcript_polarity_table.parquet: transcript-level correlations with switch_strength
- high_vs_low_table.parquet: mean usage contrast between high- and low-module samples
- vae_drivers.parquet: high-confidence VAE decoder attribution (with --vae-attribution)
- ig_attributions.parquet: per-feature IG scores (with --integrated-gradients)
- Plot files (*.png, *.pdf) when --plot is given
Shared manifest: {output_dir}/module_explanation_manifest.json
Key flags:
| Flag | Default | Description |
|---|---|---|
| --module-ids | all | Module IDs to explain |
| --plot | off | Write publication-ready plot files |
| --output-format | png | |
| --annotation-table | none | Structural annotation TSV from annotate-structure |
| --vae-attribution | off | VAE decoder Jacobian attribution (needs checkpoint) |
| | 0.05 | FDR cutoff for high-confidence VAE drivers |
| | 90.0 | |
| --integrated-gradients | off | Captum IG encoder attribution (needs checkpoint + captum) |
| --ig-n-steps | 50 | IG interpolation steps |
| | zero | IG baseline: |
annotate-structure
Annotate transcript switch pairs with structural labels from a GTF file.
isograph annotate-structure \
--gtf gencode.v47.annotation.gtf.gz \
--switch-pairs switch_pairs.tsv \
--output transcript_structure_annotations.tsv
Cache the parsed GTF to avoid re-parsing on repeated runs (raw GENCODE v47 ≈ 20 min on NFS; cached ≈ seconds):
isograph annotate-structure \
--gtf gencode.v47.annotation.gtf.gz \
--switch-pairs switch_pairs.tsv \
--gtf-cache gencode_v47_cache.parquet \
--output transcript_structure_annotations.tsv
Inputs:
- --gtf: GTF or GTF.gz annotation file (GENCODE or Ensembl conventions supported).
- --switch-pairs: TSV with columns gene_id, transcript_id_1, transcript_id_2.
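A switch-pairs file with the three columns listed above can be produced with the standard csv module. The sketch below assumes the file starts with a header row naming those columns. write_switch_pairs is a hypothetical helper, and the gene and transcript IDs are placeholders.

```python
import csv
import io

COLUMNS = ["gene_id", "transcript_id_1", "transcript_id_2"]


def write_switch_pairs(pairs, handle):
    """Write (gene_id, transcript_id_1, transcript_id_2) tuples as TSV."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)  # header row (assumed, see the note above)
    for gene_id, tx1, tx2 in pairs:
        if tx1 == tx2:
            raise ValueError(f"{gene_id}: a switch pair needs two distinct transcripts")
        writer.writerow([gene_id, tx1, tx2])


# Placeholder identifiers, invented for illustration only.
pairs = [("GENE_A", "TX_A1", "TX_A2"), ("GENE_B", "TX_B1", "TX_B3")]
buffer = io.StringIO()
write_switch_pairs(pairs, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

To write to disk instead, pass open("switch_pairs.tsv", "w", newline="") as the handle.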
Output (--output): TSV with structural labels per switch pair:
| Label | Type | Description |
|---|---|---|
| | bool | First exon differs between transcripts |
| | bool | Last exon differs |
| | bool | Internal exon composition differs |
| | bool | CDS coordinates differ |
| | bool | UTR coordinates differ |
| | bool | Transcript biotype differs |
| | bool | One transcript is coding, the other is not |
| | float | Transcript length difference (bp) |
| | float | CDS length difference (bp) |
| | float | Fraction of exons shared between the two transcripts |
Pass the output to isograph explain-module --annotation-table to merge these labels
into the driver tables.
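To give a feel for what the exon-level labels mean, the sketch below derives a few of them from two exon coordinate lists. It is not the annotate-structure implementation. It assumes plus-strand transcripts with exons given as inclusive (start, end) tuples, defines the shared fraction as shared exons over the union of exons (a Jaccard index), and takes the length difference as first transcript minus second. The dictionary keys are hypothetical names, not the column names the command writes.

```python
def compare_exons(exons_1, exons_2):
    """Compare two non-empty lists of (start, end) exon tuples."""
    a, b = sorted(exons_1), sorted(exons_2)
    shared = set(a) & set(b)
    union = set(a) | set(b)

    def spliced_length(exons):
        # Inclusive coordinates, so each exon spans end - start + 1 bases.
        return sum(end - start + 1 for start, end in exons)

    return {
        "first_exon_differs": a[0] != b[0],
        "last_exon_differs": a[-1] != b[-1],
        "internal_exons_differ": a[1:-1] != b[1:-1],
        # Assumed definition: Jaccard index over exact exon coordinates.
        "shared_exon_fraction": len(shared) / len(union),
        # Assumed sign convention: transcript 1 minus transcript 2.
        "length_difference": float(spliced_length(a) - spliced_length(b)),
    }


# Toy coordinates, invented for illustration only.
tx1 = [(100, 200), (300, 400), (500, 600)]
tx2 = [(100, 200), (300, 400), (550, 600)]
labels = compare_exons(tx1, tx2)
print(labels)
```

In this toy pair only the last exon changes, so two of the four distinct exons are shared and the spliced lengths differ by 50 bp.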
compare
Compare either two snapshot directories or two benchmark JSON reports.
Examples:
isograph compare \
--reference snapshots/stage0_toy_v1_baseline_v1_seed0000 \
--candidate artifacts/benchmarks/quickstart_baseline/toy_v1
isograph compare \
--reference artifacts/reports/stage2_latent-benchmark.json \
--candidate artifacts/reports/stage4_vae-benchmark.json
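For a quick manual look at two JSON reports, the numeric fields they share can be diffed with a few lines of Python. The sketch below is not what isograph compare does internally, and the report layout and metric names (metrics, ari, runtime_s) are invented for illustration; the real report schema may differ.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into {'a.b.c': value}; other values are leaves."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    else:
        flat[prefix.rstrip(".")] = obj
    return flat


def is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def numeric_deltas(reference, candidate):
    """Return candidate - reference for every shared numeric field."""
    ref, cand = flatten(reference), flatten(candidate)
    return {
        key: cand[key] - ref[key]
        for key in sorted(ref.keys() & cand.keys())
        if is_number(ref[key]) and is_number(cand[key])
    }


# Hypothetical report contents, invented for illustration only.
reference = json.loads('{"backend": "latent", "metrics": {"ari": 0.5, "runtime_s": 12}}')
candidate = json.loads('{"backend": "vae", "metrics": {"ari": 0.75, "runtime_s": 10}}')
deltas = numeric_deltas(reference, candidate)
print(deltas)
```

With real files, json.load(open(path)) on the --reference and --candidate paths would replace the inline strings.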
export
Write a JSON summary of a prepared dataset bundle.
Example:
isograph export \
--dataset-path benchmarks/datasets/core_v1/toy_v1 \
--output-path artifacts/reports/toy_v1-summary.json