Multiplex and Stress Benchmarks
Isoform switching and gene expression can change independently. A gene that doubles its expression without changing its dominant isoform is an abundance event; a gene that redistributes transcript usage without changing total expression is a switching event. Both types of co-regulation occur in real RNA-seq data, and treating them as a single signal conflates distinct biological mechanisms.
IsoGraph separates these signals by constructing two feature channels per gene: an
abundance channel (log-CPM–standardized gene counts, available for all genes) and a
switch channel (CLR-SVD transcript usage coordinate, available for multi-isoform
genes). A typed feature graph with per-channel edge thresholds and an auto-calibrated
abundance threshold connects genes through whichever channels are active. The multiplex_v1
benchmark suite provides planted ground-truth modules with known role assignments to
validate that IsoGraph recovers each type.
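As a rough illustration of the two channels described above, the following sketch builds a standardized log-CPM abundance feature and a CLR-SVD usage coordinate with NumPy. It is not IsoGraph's implementation: the function names, the pseudocounts, and the choice of the leading singular component as the switch coordinate are assumptions made for this example.

```python
import numpy as np

def abundance_channel(gene_counts, pseudocount=1.0):
    """Hypothetical helper: per-gene z-scored log-CPM.

    gene_counts has shape (genes, samples) and holds raw counts.
    """
    lib_size = gene_counts.sum(axis=0, keepdims=True)
    log_cpm = np.log2((gene_counts + pseudocount) / (lib_size + pseudocount) * 1e6)
    mean = log_cpm.mean(axis=1, keepdims=True)
    std = log_cpm.std(axis=1, keepdims=True)
    # Guard against constant genes so the division never hits zero.
    return (log_cpm - mean) / np.where(std > 0, std, 1.0)

def switch_channel(transcript_counts, pseudocount=0.5):
    """Hypothetical helper: one usage coordinate per sample for ONE gene.

    transcript_counts has shape (isoforms, samples).
    """
    usage = transcript_counts + pseudocount
    usage = usage / usage.sum(axis=0, keepdims=True)
    log_usage = np.log(usage)
    # Centered log-ratio: subtract the per-sample mean over isoforms.
    clr = log_usage - log_usage.mean(axis=0, keepdims=True)
    # Center each isoform across samples before the decomposition.
    centered = clr - clr.mean(axis=1, keepdims=True)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Leading component scaled by its singular value; its sign is arbitrary.
    return s[0] * vt[0]

# A gene whose dominant isoform flips between the first and last two samples.
coordinate = switch_channel(np.array([[90.0, 90.0, 10.0, 10.0],
                                      [10.0, 10.0, 90.0, 90.0]]))
```

Because the sign of an SVD component is arbitrary, any downstream correlation between the two channels depends on how the coordinate is oriented; this sketch does not fix an orientation.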
The benchmark runner reports overall planted-module recovery plus role-aware recall for:
- switch_only: genes whose module membership is driven by switching, not expression
- abundance_only: genes driven by expression level changes only
- coupled: both channels active and positively correlated (r ≥ 0.2)
- discordant: both channels active but anticorrelated (r < −0.2)
Standard Multiplex Suite
The multiplex_v1 suite contains four regular fixtures:
| Fixture | Genes | Samples | Planted modules |
|---|---|---|---|
| | 40 | 64 | 2 |
| | 320 | 180 | 6 |
| | 360 | 110 | 6 |
| | 900 | 140 | 10 |
Run the tuned Stage 9 backends:
isograph benchmark --config-name stage9_multiplex_vae
isograph benchmark --config-name stage9_multiplex_graph
isograph benchmark --config-name stage9_multiplex_latent
isograph benchmark --config-name stage9_multiplex_wgcna
Aggregate stress reports against WGCNA:
python scripts/stress_multiplex_summary.py
The summary is written to artifacts/reports/stress-multiplex-backend-summary.json.
Explain Compatibility
Explain modules work on multiplex artifacts. The attribution outputs preserve
feature_id and feature_type so a gene can be explained through its switch channel,
abundance channel, or both. The stress helper evaluates explain accuracy across the
standard multiplex stress artifacts:
python scripts/stress_multiplex_explain.py
The helper writes artifacts/explain/stress_multiplex/stress_multiplex_explain_metrics.json.
Extra-Large Multiplex Stress Fixture
The 12k-gene fixture is intentionally generated only when requested by name so routine test and benchmark runs do not pay its cost.
| Fixture | Genes | Samples | Planted modules |
|---|---|---|---|
| | 12,000 | 240 | 16 |
Run VAE and WGCNA:
isograph benchmark --config-name stress_multiplex_xxlarge_vae
isograph benchmark --config-name stress_multiplex_xxlarge_wgcna
The current stress reports show:
| Backend | Detected modules | Recovery | Runtime (s) | Notes |
|---|---|---|---|---|
| VAE | 16 | 0.9266666666666667 | 7554.699150358792 | Correct module count; no posterior collapse |
| WGCNA | 1898 | 0.9191666666666665 | 877.2644422741141 | Severe over-segmentation |
VAE role-aware recall on xxlarge_multiplex_v1 was 1.0 for switch_only, 1.0
for coupled, 1.0 for discordant, and 0.725 for abundance_only.
Understanding Role-Aware Recall
Role-aware recall measures the fraction of planted genes in each role that IsoGraph assigns to the correct module. The calibration gates for VAE on the standard multiplex suite are:
| Fixture | Backend | Overall recovery | | |
|---|---|---|---|---|
| | VAE | ≥ 0.40 | ≥ 0.90 | ≥ 0.90 |
| | VAE | ≥ 0.895 | ≥ 0.90 | ≥ 0.90 |
| | VAE | ≥ 0.883 | ≥ 0.90 | ≥ 0.90 |
| | VAE | ≥ 0.896 | ≥ 0.90 | ≥ 0.90 |
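Role-aware recall as defined in this section (the fraction of planted genes in each role that land in the correct module) can be sketched as follows. The function name and the dictionary inputs are hypothetical, and the sketch assumes detected modules have already been matched to planted module IDs; how the benchmark runner performs that matching is not described here.

```python
from collections import defaultdict

def role_aware_recall(planted_module, planted_role, assigned_module):
    """Hypothetical helper: per-role recall over planted genes.

    planted_module:  gene -> planted module ID
    planted_role:    gene -> role name (e.g. "switch_only")
    assigned_module: gene -> detected module ID, already mapped to planted IDs
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for gene, module in planted_module.items():
        role = planted_role[gene]
        totals[role] += 1
        # A gene missing from assigned_module counts as a miss.
        if assigned_module.get(gene) == module:
            hits[role] += 1
    return {role: hits[role] / totals[role] for role in totals}

planted_module = {"g1": 0, "g2": 0, "g3": 1, "g4": 1, "g5": 1, "g6": 1}
planted_role = {"g1": "switch_only", "g2": "switch_only",
                "g3": "abundance_only", "g4": "abundance_only",
                "g5": "abundance_only", "g6": "abundance_only"}
assigned_module = {"g1": 0, "g2": 0, "g3": 1, "g4": 1, "g5": 1, "g6": 0}
recall = role_aware_recall(planted_module, planted_role, assigned_module)
```

In this toy input, both switch_only genes are recovered and three of the four abundance_only genes are, so a per-role breakdown exposes a weakness that a single overall recovery number would average away.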
Why giant-component fraction matters. A well-calibrated run produces modules that are
clearly separated in the gene graph. When the abundance threshold is too permissive, many
genes connect into one giant component. The opposite failure is over-segmentation, which
WGCNA shows at 12k scale (1898 detected modules vs. 16 planted). Check
giant_component_fraction < 0.05 as a sanity metric; it is reported in the benchmark JSON.
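For intuition about what such a metric captures, here is a minimal sketch that assumes the fraction is the share of genes in the largest connected component of the gene graph. That definition, the function name, and the edge-list input are assumptions for illustration; the value IsoGraph reports may be computed differently.

```python
from collections import Counter

def giant_component_fraction(n_genes, edges):
    """Hypothetical helper: share of nodes in the largest connected component.

    edges is an iterable of (i, j) pairs with 0 <= i, j < n_genes.
    """
    if n_genes == 0:
        return 0.0
    parent = list(range(n_genes))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Count how many genes share each root, then take the largest group.
    sizes = Counter(find(i) for i in range(n_genes))
    return max(sizes.values()) / n_genes

fraction = giant_component_fraction(10, [(0, 1), (1, 2), (3, 4)])
```

Here the largest component holds 3 of 10 genes. A permissive threshold adds edges, which can only merge components and push this value toward 1.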
What low recall means. If abundance_only recall is low (< 0.70) on your real data,
the abundance threshold may be merging abundance modules with switch modules. Lower
alpha_abundance or reduce the alpha_abundance_grid range. If switch_only recall is
low, the switch threshold may be too strict; lower alpha_switch.
Artifact Policy
Generated dataset bundles and per-fixture benchmark directories are reproducible and
ignored by git. Commit compact evidence as JSON reports under artifacts/reports/.