Overview

IsoGraph is a research software package for gene-aware network analysis of transcript-level RNA-seq data. The project is organized around reproducible datasets, typed configurations, stable command-line entry points, and model backends that can be compared on the same fixture suite.

Documentation Map

Use README.md for project status, installation, and fast orientation.
Use these reference docs for exact command behavior, data requirements, configuration fields, outputs, and Python APIs.
Use the GitHub Wiki for tutorial-style walkthroughs, especially when preparing and analyzing your own data.

Current Scope

IsoGraph currently includes:

A deterministic baseline backend.
A latent probabilistic backend (sklearn Factor Analysis + LedoitWolf partial correlation) with cross-validated component selection and stability selection support.
A graph-aware backend extending the latent model with graph-Laplacian smoothing.
A VAE backend — the default production backend — with nonlinear latent representation, early stopping, posterior-collapse diagnostics, and optional checkpointing. Requires PyTorch.
A WGCNA backend wrapping R’s blockwiseModules for direct comparison with WGCNA, including blockwise mode for datasets above 5 000 genes.
Synthetic fixture suites: core_v1 (24–800 genes), scale_v1 (6 000–12 000 genes), and multiplex_v1 — a typed abundance + switch suite with planted ground-truth roles (switch_only, abundance_only, coupled, discordant) across four fixtures (40–900 genes) plus an optional xxlarge_multiplex_v1 at 12 000 genes.
A real-data fixture freeze workflow for BrainSeq-style bulk RNA-seq inputs.
Multiplex network inference (Stage 9A/9B): each gene contributes a log-CPM abundance channel, and multi-isoform genes also contribute a CLR-SVD switch channel. A typed feature graph with per-channel edge thresholds and an auto-calibrated abundance threshold (alpha_abundance_grid) prevents spurious merging of switch modules. Each gene's channel role (coupled, abundance_only, switch_only, or discordant) is reported in module_gene_roles.parquet for every fit.
An explanation module for network modules (isograph.explain), with the isograph explain-module and isograph annotate-structure CLI subcommands, covering transcript-feature-level driver tables, publication-ready plots, VAE decoder attribution (Stage 8D), Captum Integrated Gradients encoder attribution (Stage 8E), and GTF-based structural annotation of switch pairs. Attribution outputs carry feature_type metadata so that switch and abundance drivers can be distinguished in multiplex fits.
Multiplex stress reports for VAE, graph, latent, and WGCNA backends, including role-aware recall and giant-component diagnostics.
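
The two channel types named in the multiplex item above can be illustrated with a minimal sketch. It is not IsoGraph's implementation: the function names (log_cpm, clr, build_channels), the prior and pseudocount values, and the toy counts are all assumptions made for illustration. The sketch also stops at the centered log-ratio (CLR) transform and omits the SVD step that the CLR-SVD switch channel applies afterwards.

```python
import math


def log_cpm(count, library_size, prior=1.0):
    """log2 counts-per-million for one gene in one sample.

    The additive prior (assumed value) keeps zero counts finite.
    """
    cpm = count / library_size * 1e6
    return math.log2(cpm + prior)


def clr(isoform_counts, pseudocount=0.5):
    """Centered log-ratio of one gene's isoform counts in one sample.

    Each log count is centered on the mean log count, so the result
    sums to zero and reflects relative isoform usage, not total abundance.
    The pseudocount (assumed value) handles zero counts.
    """
    logs = [math.log(c + pseudocount) for c in isoform_counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def build_channels(gene_isoforms, library_size):
    """Return an abundance value per gene, plus CLR values for multi-isoform genes."""
    channels = {}
    for gene, counts in gene_isoforms.items():
        channels[gene] = {
            # Every gene gets an abundance channel from its summed isoform counts.
            "abundance": log_cpm(sum(counts), library_size),
            # Only genes with more than one isoform can carry a switch signal.
            "switch": clr(counts) if len(counts) > 1 else None,
        }
    return channels


# Hypothetical toy input: isoform counts per gene for a single sample.
gene_isoforms = {"GENE_A": [120, 30], "GENE_B": [50]}
channels = build_channels(gene_isoforms, library_size=1_000_000)
```

In this sketch the single-isoform GENE_B has no switch values, which mirrors the statement that only multi-isoform genes contribute a switch channel.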