# Outputs and Artifacts

IsoGraph writes most outputs into `artifacts/`, `benchmarks/`, or `snapshots/`.

## Dataset Bundles

A prepared dataset bundle contains:

- `manifest.json`
- a sample metadata parquet file
- one or more feature-table parquet files
- one or more dense matrix `.npz` files
- optional truth tables for synthetic fixtures
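
The bundle contents listed above can be checked with a short inventory script. This is a minimal sketch, not part of IsoGraph: the helper name `inventory_bundle` is hypothetical, it assumes the files sit directly in the bundle directory, and it parses `manifest.json` as plain JSON without assuming a schema.

```python
import json
from pathlib import Path


def inventory_bundle(bundle_dir):
    """Group the files of a bundle directory by kind (hypothetical helper)."""
    root = Path(bundle_dir)
    manifest_path = root / "manifest.json"
    if not manifest_path.is_file():
        raise FileNotFoundError(f"no manifest.json in {root}")
    # The manifest schema is not assumed; it is only parsed as JSON.
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return {
        "manifest": manifest,
        # Assumption: parquet and npz files live at the top level of the bundle.
        "parquet": sorted(p.name for p in root.glob("*.parquet")),
        "npz": sorted(p.name for p in root.glob("*.npz")),
    }
```

A bundle with no `.npz` entries or no parquet files in this inventory would not match the layout described above.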

## Benchmark Outputs

`benchmark` writes:

- per-fixture artifact directories under `artifacts/benchmarks/<stage_name>/<dataset>/`
- benchmark summary JSON under `artifacts/reports/<stage_name>-benchmark.json`
- runtime and memory JSON under `artifacts/reports/<stage_name>-runtime-memory.json`
- calibration JSON when supported by the selected backend
- stability-selection JSON for real-data fixtures when enabled

Each benchmark artifact directory also receives a `run.log` and a snapshot-like set of
tables written through `save_snapshot`.
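
The path templates above can be assembled programmatically when locating outputs in scripts. The sketch below only mirrors the layout documented in this section; the function name `benchmark_output_paths` and the `root` parameter are illustrative assumptions, not IsoGraph API.

```python
from pathlib import Path


def benchmark_output_paths(stage_name, dataset, root="artifacts"):
    """Build the documented benchmark output locations (illustrative helper)."""
    root = Path(root)
    fixture_dir = root / "benchmarks" / stage_name / dataset
    reports = root / "reports"
    return {
        "fixture_dir": fixture_dir,
        "run_log": fixture_dir / "run.log",
        "summary": reports / f"{stage_name}-benchmark.json",
        "runtime_memory": reports / f"{stage_name}-runtime-memory.json",
    }
```

The calibration and stability-selection JSON files are left out because their file names are not specified here.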

For multiplex benchmarks, the benchmark JSON additionally includes `role_recall` with
per-role recall values for `switch_only`, `abundance_only`, `coupled`, and `discordant`
truth genes. WGCNA calibration reports include soft-threshold diagnostics when available.
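
As an illustration of what per-role recall means, the sketch below computes, for each role, the fraction of truth genes with that role that appear in a recovered gene set. How IsoGraph decides that a gene counts as recovered is not stated in this document, so the inputs and the function itself are assumptions for illustration.

```python
from collections import defaultdict

ROLES = ("switch_only", "abundance_only", "coupled", "discordant")


def role_recall(truth_roles, recovered_genes):
    """Per-role recall.

    truth_roles: dict mapping gene_id -> truth role.
    recovered_genes: iterable of gene ids treated as recovered (assumption).
    Roles with no truth genes map to None.
    """
    recovered = set(recovered_genes)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for gene, role in truth_roles.items():
        totals[role] += 1
        if gene in recovered:
            hits[role] += 1
    return {
        role: (hits[role] / totals[role] if totals[role] else None)
        for role in ROLES
    }
```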

Per-fixture artifact directories and dataset bundles are generated outputs: they can be
regenerated and are ignored by git in this repository. Keep durable benchmark evidence as
compact JSON reports under `artifacts/reports/`.

## Fit Outputs

`fit` writes:

- `modules.parquet` — module assignment per gene (`gene_id`, `module_id`)
- `edges.parquet` — inferred gene-gene edges with weights
- `traits.parquet` — module–trait associations (Pearson r or Welch t, p value, BH q value)
- `feature_scores.parquet` — per-feature switch and abundance scores; the `feature_type`
  column is `"switch"` for isoform-switch features and `"abundance"` for abundance
  features. Single-channel (switch-only) fits populate `"switch"` for all rows.
- `module_gene_roles.parquet` — gene channel role classification for each module gene:
  - `module_id`, `gene_id`
  - `module_role`: one of `coupled`, `abundance_only`, `switch_only`, `discordant`, `inactive`
  - `switch_r` — Pearson r between this gene's switch feature and the module eigengene
  - `abundance_r` — Pearson r between this gene's abundance feature and the module eigengene
  - `switch_abundance_r` — Pearson r between this gene's switch and abundance features
  - `switch_active`, `abundance_active` — boolean flags (|r| ≥ 0.2)
- `fit_config.json` — complete configuration used for this run, including IsoGraph version
  and random seed
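
The activity flags follow directly from the |r| ≥ 0.2 rule above. The sketch below also shows one plausible way to derive a `module_role` from them. The mapping is an assumption for illustration only, in particular the use of the sign of `switch_abundance_r` to separate `coupled` from `discordant`; IsoGraph's actual rule is not documented here.

```python
ACTIVE_THRESHOLD = 0.2  # from the documented |r| >= 0.2 rule


def is_active(r, threshold=ACTIVE_THRESHOLD):
    """True when the correlation is present and |r| meets the threshold."""
    return r is not None and abs(r) >= threshold


def classify_role(switch_r, abundance_r, switch_abundance_r):
    """Hypothetical role mapping; not IsoGraph's confirmed logic."""
    switch_active = is_active(switch_r)
    abundance_active = is_active(abundance_r)
    if switch_active and abundance_active:
        # Assumption: a negative switch/abundance correlation means discordant.
        if switch_abundance_r is not None and switch_abundance_r < 0:
            return "discordant"
        return "coupled"
    if switch_active:
        return "switch_only"
    if abundance_active:
        return "abundance_only"
    return "inactive"
```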

## Explain Outputs

`explain-module` writes the following files for each module (by default under `artifacts/explain/<study>/`):

- `gene_driver_table.parquet` — genes ranked by |r| with the module eigengene; columns
  include `gene_id`, `r`, `pvalue`, `qvalue`, `n_complete`, `missingness`, `feature_id`,
  `feature_type` (for multiplex: `"switch"` or `"abundance"`)
- `transcript_polarity_table.parquet` — transcript-level correlations with module
  eigengene; includes `switch_strength = max(r) − min(r)` per gene
- `high_vs_low_table.parquet` — Welch test contrasts between high- and low-eigengene
  samples per feature
- `vae_drivers.parquet` (when `--vae-attribution`) — high-confidence VAE decoder
  attribution; columns include `gene_id`, `decoded_response`, `feature_type`
- `ig_attributions.parquet` (when `--integrated-gradients`) — Captum Integrated
  Gradients encoder attribution; columns include `gene_id`, `mean_ig`, `mean_abs_ig`,
  `latent_dim`, `latent_eigengene_r`
- `module_explanation_manifest.json` — index of all output files for this module
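
The `switch_strength` definition above, `max(r) − min(r)` per gene, can be sketched as follows. The input shape (tuples of gene, transcript, and correlation) is an assumption for illustration; the real table is a parquet file whose full column set is not listed here.

```python
from collections import defaultdict


def switch_strength(transcript_r):
    """Compute max(r) - min(r) per gene.

    transcript_r: iterable of (gene_id, transcript_id, r) tuples, where r is
    the transcript's correlation with the module eigengene (assumed layout).
    """
    by_gene = defaultdict(list)
    for gene_id, _transcript_id, r in transcript_r:
        by_gene[gene_id].append(r)
    return {gene: max(rs) - min(rs) for gene, rs in by_gene.items()}
```

Under this definition, a gene whose transcripts correlate in opposite directions with the eigengene gets a large value, while a gene with a single transcript gets 0.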

## Compare Outputs

`compare` writes a JSON report describing either:

- differences between two snapshot directories, or
- delta metrics between two benchmark summary files
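
To illustrate what delta metrics between two benchmark summaries can look like, the sketch below subtracts numeric values that share a key across two JSON files. It assumes flat, top-level numeric fields, which may not match the real summary schema, and the function names are hypothetical rather than the `compare` implementation.

```python
import json
from pathlib import Path


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def numeric_deltas(baseline, candidate):
    """Return candidate - baseline for numeric keys present in both dicts."""
    deltas = {}
    for key in sorted(baseline.keys() & candidate.keys()):
        a, b = baseline[key], candidate[key]
        if _is_number(a) and _is_number(b):
            deltas[key] = b - a
    return deltas


def compare_summary_files(baseline_path, candidate_path):
    """Load two summary JSON files and compute their numeric deltas."""
    baseline = json.loads(Path(baseline_path).read_text(encoding="utf-8"))
    candidate = json.loads(Path(candidate_path).read_text(encoding="utf-8"))
    return numeric_deltas(baseline, candidate)
```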

## Export Outputs

`export` writes a JSON summary of a prepared dataset bundle, including sample count, gene
count, and available assays.