Outputs and Artifacts

IsoGraph writes most outputs into artifacts/, benchmarks/, or snapshots/.

Dataset Bundles

A prepared dataset bundle contains:

  • manifest.json

  • a sample metadata parquet file

  • one or more feature-table parquet files

  • one or more dense matrix .npz files

  • optional truth tables for synthetic fixtures
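A quick way to sanity-check a bundle is to tally its files against the list above. The sketch below is illustrative only: the helper name summarize_bundle and the example file names are hypothetical, and it assumes the bundle's files can be told apart by name and extension alone.

```python
from collections import Counter
from pathlib import PurePath


def summarize_bundle(filenames):
    """Count bundle files by kind, using only the file name and extension.

    Hypothetical helper: IsoGraph's real bundle layout may differ.
    """
    counts = Counter()
    for name in filenames:
        path = PurePath(name)
        if path.name == "manifest.json":
            counts["manifest"] += 1
        elif path.suffix == ".parquet":
            # sample metadata, feature tables, and optional truth tables
            counts["parquet"] += 1
        elif path.suffix == ".npz":
            # dense matrices
            counts["npz"] += 1
        else:
            counts["other"] += 1
    return dict(counts)
```

A bundle that matches the description should report exactly one manifest, at least two parquet files, and at least one .npz file.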

Benchmark Outputs

benchmark writes:

  • per-fixture artifact directories under artifacts/benchmarks/<stage_name>/<dataset>/

  • benchmark summary JSON under artifacts/reports/<stage_name>-benchmark.json

  • runtime and memory JSON under artifacts/reports/<stage_name>-runtime-memory.json

  • calibration JSON when supported by the selected backend

  • stability-selection JSON for real-data fixtures when enabled
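The path templates above can be expanded with a small helper, for example when collecting reports in a script. This is a sketch, not IsoGraph code: benchmark_paths is a hypothetical name, and it assumes the artifacts/ root is relative to the working directory.

```python
from pathlib import Path


def benchmark_paths(stage_name, dataset, root="artifacts"):
    """Expand the documented path templates for one stage and dataset.

    Hypothetical helper; only the path patterns come from the text above.
    """
    root = Path(root)
    reports = root / "reports"
    return {
        "fixture_dir": root / "benchmarks" / stage_name / dataset,
        "summary": reports / f"{stage_name}-benchmark.json",
        "runtime_memory": reports / f"{stage_name}-runtime-memory.json",
    }
```

For a stage named "stage1" and a dataset named "toy" (both made-up names), the summary path resolves to artifacts/reports/stage1-benchmark.json.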

Each benchmark artifact directory also receives a run.log and a snapshot-like set of tables written through save_snapshot.

For multiplex benchmarks, the benchmark JSON additionally includes role_recall with per-role recall values for switch_only, abundance_only, coupled, and discordant truth genes. WGCNA calibration reports include soft-threshold diagnostics when available.
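Per-role recall can be read as the fraction of truth genes in each role that the fit recovers. The sketch below uses that standard definition of recall; how IsoGraph decides that a gene counts as "recovered" is not stated here, so the recovered_genes input and the function name are assumptions.

```python
def role_recall(truth_roles, recovered_genes):
    """Recall per truth role: recovered truth genes / all truth genes in that role.

    truth_roles: dict mapping gene_id -> truth role (e.g. "switch_only").
    recovered_genes: set of gene_ids treated as recovered (assumed input).
    """
    totals = {}
    hits = {}
    for gene_id, role in truth_roles.items():
        totals[role] = totals.get(role, 0) + 1
        if gene_id in recovered_genes:
            hits[role] = hits.get(role, 0) + 1
    return {role: hits.get(role, 0) / n for role, n in totals.items()}
```

Roles with no truth genes do not appear in the result, which avoids a division by zero.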

Generated per-fixture artifact directories and generated dataset bundles are reproducible and ignored by git in this repository. Durable benchmark evidence should be kept as compact JSON reports under artifacts/reports/.

Fit Outputs

fit writes:

  • modules.parquet — module assignment per gene (gene_id, module_id)

  • edges.parquet — inferred gene-gene edges with weights

  • traits.parquet — module–trait associations (Pearson r or Welch t, p value, BH q value)

  • feature_scores.parquet — per-feature switch and abundance scores; the feature_type column is "switch" for isoform-switch features and "abundance" for abundance features. Single-channel (switch-only) fits set feature_type to "switch" for all rows.

  • module_gene_roles.parquet — gene channel role classification for each module gene:

    • module_id, gene_id

    • module_role: one of coupled, abundance_only, switch_only, discordant, inactive

    • switch_r — Pearson r between this gene’s switch feature and the module eigengene

    • abundance_r — Pearson r between this gene’s abundance feature and the module eigengene

    • switch_abundance_r — Pearson r between this gene’s switch and abundance features

    • switch_active, abundance_active — boolean flags, true when the corresponding |r| ≥ 0.2

  • fit_config.json — complete configuration used for this run, including IsoGraph version and random seed
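The activity flags in module_gene_roles.parquet follow a simple threshold rule, sketched below. It assumes switch_active is derived from switch_r and abundance_active from abundance_r, and that a missing or NaN correlation counts as inactive; neither detail is confirmed by the description above, and channel_flags is a hypothetical name.

```python
import math

ACTIVE_THRESHOLD = 0.2  # the |r| >= 0.2 cutoff from the column description


def channel_flags(switch_r, abundance_r, threshold=ACTIVE_THRESHOLD):
    """Return the two activity flags for one module gene.

    Assumption: None or NaN correlations are treated as inactive.
    """
    def is_active(r):
        return r is not None and not math.isnan(r) and abs(r) >= threshold

    return {
        "switch_active": is_active(switch_r),
        "abundance_active": is_active(abundance_r),
    }
```

The sign of r does not matter for the flags; a gene with switch_r of -0.35 is as active on the switch channel as one with +0.35.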

Explain Outputs

explain-module writes per module (by default under artifacts/explain/<study>/):

  • gene_driver_table.parquet — genes ranked by |r| with the module eigengene; columns include gene_id, r, pvalue, qvalue, n_complete, missingness, feature_id, feature_type (for multiplex: "switch" or "abundance")

  • transcript_polarity_table.parquet — transcript-level correlations with the module eigengene; includes switch_strength = max(r) − min(r) per gene

  • high_vs_low_table.parquet — Welch test contrasts between high- and low-eigengene samples per feature

  • vae_drivers.parquet (when --vae-attribution) — high-confidence VAE decoder attribution; columns include gene_id, decoded_response, feature_type

  • ig_attributions.parquet (when --integrated-gradients) — Captum Integrated Gradients encoder attribution; columns include gene_id, mean_ig, mean_abs_ig, latent_dim, latent_eigengene_r

  • module_explanation_manifest.json — index of all output files for this module
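Reading switch_strength as the spread between the largest and smallest transcript-level r for a gene, the calculation can be sketched as follows. The input shape (a dict of gene_id to a list of r values) and the function name are assumptions made for illustration; the real table is a parquet file.

```python
def switch_strength(transcript_r):
    """Per-gene spread of transcript-level correlations: max(r) - min(r).

    transcript_r: dict mapping gene_id -> list of transcript-level r values.
    Genes with no transcripts are skipped (assumed behavior).
    """
    return {
        gene_id: max(rs) - min(rs)
        for gene_id, rs in transcript_r.items()
        if rs
    }
```

A gene whose transcripts correlate with the eigengene in opposite directions gets a large value, while a gene with a single transcript gets 0.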

Compare Outputs

compare writes a JSON report describing either:

  • differences between two snapshot directories, or

  • delta metrics between two benchmark summary files
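As an illustration of the second mode, delta metrics between two summaries can be computed key by key. The schema of the benchmark summary files is not described here, so this sketch assumes flat, top-level numeric metrics; the function name and the metric names in the usage note are hypothetical.

```python
def delta_metrics(baseline, candidate):
    """Return candidate - baseline for numeric keys present in both summaries.

    Assumes flat dicts (e.g. loaded with json.load); nested values,
    strings, and booleans are ignored.
    """
    deltas = {}
    for key in baseline.keys() & candidate.keys():
        a, b = baseline[key], candidate[key]
        if isinstance(a, bool) or isinstance(b, bool):
            continue  # bool is a subclass of int; skip it explicitly
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            deltas[key] = b - a
    return deltas
```

For example, a baseline with a recall of 0.5 and a candidate with a recall of 0.75 yields a delta of 0.25 for that key.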

Export Outputs

export writes a JSON summary of a prepared dataset bundle, including sample count, gene count, and available assays.