isograph.evaluation.selection
Alpha selection via stability selection for real data without ground truth.
Stability selection estimates how reproducibly each gene-gene edge appears when a model is refit on repeated random subsamples of the data. Edges that appear consistently across subsamples are considered stable.
The implementation follows a simple loop:
iterate over a user-supplied alpha_grid
refit the model on repeated subsamples for each alpha
count how often each edge appears
report the number of stable edges per alpha
IsoGraph reports the coarsest alpha that still yields at least one stable edge
as recommended_alpha. Lower alpha values yield denser networks; higher
values yield sparser ones.
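The loop above can be sketched in plain Python. This is a minimal illustration, not IsoGraph's implementation: `call_edges` is a hypothetical stand-in for a model refit (it thresholds absolute Pearson correlation rather than partial correlation), `stability_sketch` and the toy genes are invented for the example, and "coarsest alpha" is assumed here to mean the largest alpha in the grid that still yields a stable edge.

```python
import itertools
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def call_edges(data, alpha):
    """Hypothetical stand-in for a model refit: keep pairs with |r| >= alpha."""
    edges = set()
    for g1, g2 in itertools.combinations(sorted(data), 2):
        if abs(pearson(data[g1], data[g2])) >= alpha:
            edges.add(frozenset((g1, g2)))
    return edges


def stability_sketch(data, alpha_grid, n_iterations=50,
                     subsample_fraction=0.8, stability_threshold=0.6, seed=0):
    rng = random.Random(seed)
    n_samples = len(next(iter(data.values())))
    k = max(2, int(round(subsample_fraction * n_samples)))
    edge_stability, stable_edge_counts = {}, []
    for alpha in alpha_grid:
        counts = {}
        for _ in range(n_iterations):
            # Draw sample indices without replacement, then "refit".
            cols = rng.sample(range(n_samples), k)
            sub = {g: [v[c] for c in cols] for g, v in data.items()}
            for edge in call_edges(sub, alpha):
                counts[edge] = counts.get(edge, 0) + 1
        # Fraction of rounds in which each edge appeared.
        freqs = {e: c / n_iterations for e, c in counts.items()}
        edge_stability[alpha] = freqs
        stable_edge_counts.append(
            sum(f >= stability_threshold for f in freqs.values()))
    # Assumption: "coarsest" = largest alpha with at least one stable edge.
    candidates = [a for a, n in zip(alpha_grid, stable_edge_counts) if n > 0]
    recommended = max(candidates) if candidates else None
    return stable_edge_counts, recommended, edge_stability


# Toy data: geneB is a linear function of geneA; geneC alternates in sign.
base = [float(i) for i in range(20)]
data = {
    "geneA": base,
    "geneB": [2.0 * v + 1.0 for v in base],
    "geneC": [(-1.0) ** i for i in range(20)],
}
counts, recommended, stability = stability_sketch(
    data, [0.3, 0.6, 0.9], n_iterations=20)
```

In this toy run the geneA/geneB pair survives every subsample at every threshold, so it is the edge that keeps the strictest alpha eligible under the assumption stated above.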
- class isograph.evaluation.selection.StabilityResult(alpha_grid, stable_edge_counts, recommended_alpha, edge_stability)
Results from a stability selection run.
- Parameters:
alpha_grid (list[float])
stable_edge_counts (list[int])
recommended_alpha (float)
edge_stability (dict[float, dict[frozenset, float]])
- alpha_grid: list[float]
- stable_edge_counts: list[int]
- recommended_alpha: float
- edge_stability: dict[float, dict[frozenset, float]]
- summary_table()
- Return type:
DataFrame
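As an illustration of the nested `edge_stability` layout, the sketch below filters one alpha's edges by a threshold. It assumes that each `frozenset` key holds the two gene identifiers of an edge and that each value is the fraction of rounds in which that edge appeared; the dictionary contents and the `stable_edges` helper are hypothetical, not part of IsoGraph.

```python
# Hypothetical values shaped like StabilityResult.edge_stability:
# alpha -> {frozenset of two gene IDs -> selection frequency}.
edge_stability = {
    0.1: {
        frozenset({"G1", "G2"}): 0.95,
        frozenset({"G1", "G3"}): 0.70,
        frozenset({"G2", "G3"}): 0.40,
    },
    0.3: {
        frozenset({"G1", "G2"}): 0.80,
        frozenset({"G1", "G3"}): 0.20,
    },
}


def stable_edges(edge_stability, alpha, stability_threshold=0.6):
    """Return (gene_a, gene_b, frequency) tuples at or above the threshold,
    sorted by descending frequency, then by gene names."""
    rows = []
    for pair, freq in edge_stability[alpha].items():
        if freq >= stability_threshold:
            gene_a, gene_b = sorted(pair)  # frozensets are unordered
            rows.append((gene_a, gene_b, freq))
    return sorted(rows, key=lambda r: (-r[2], r[0], r[1]))


top = stable_edges(edge_stability, 0.1)
```

Because the outer keys are floats, look them up with the exact values taken from `alpha_grid` rather than with recomputed numbers.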
- isograph.evaluation.selection.stability_selection(model, transcript_counts, transcript_table, sample_table, alpha_grid, gene_counts=None, gene_table=None, n_iterations=50, subsample_fraction=0.8, stability_threshold=0.6, seed=0)
Estimate edge stability across subsamples for each alpha in alpha_grid.
- Parameters:
model (NetworkModel) – A fitted (or unfitted) NetworkModel instance. The model’s config will be temporarily patched with each alpha from the grid.
transcript_counts (ndarray) – Shape (n_transcripts, n_samples). Raw count matrix.
transcript_table (DataFrame) – Rows describe each transcript. Must contain gene_id and transcript_id columns.
sample_table (DataFrame) – One row per sample. Must be aligned with columns of transcript_counts.
alpha_grid (list[float]) – Sorted (ascending) list of partial-correlation threshold values to test.
n_iterations (int) – Number of subsampling rounds per alpha. Higher values give more stable estimates; 50 is sufficient for most datasets.
subsample_fraction (float) – Fraction of samples to draw per round (without replacement).
stability_threshold (float) – Minimum fraction of rounds in which a gene pair must appear as an edge to be counted as a stable edge.
seed (int) – Random seed for reproducibility.
gene_counts (ndarray | None)
gene_table (DataFrame | None)
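The alignment requirement between `sample_table` rows and `transcript_counts` columns matters most when subsampling, since both must be indexed with the same positions. The sketch below shows one way a single round could be drawn; `draw_subsample` is a hypothetical helper and the rounding rule for the subsample size is an assumption, not IsoGraph's documented behaviour.

```python
import numpy as np
import pandas as pd


def draw_subsample(transcript_counts, sample_table,
                   subsample_fraction=0.8, rng=None):
    """Draw sample columns without replacement and keep the table aligned."""
    rng = np.random.default_rng() if rng is None else rng
    n_samples = transcript_counts.shape[1]
    if len(sample_table) != n_samples:
        raise ValueError(
            "sample_table rows must match transcript_counts columns")
    k = max(1, int(round(subsample_fraction * n_samples)))
    # Sorted positions keep the original sample order within the subsample.
    cols = np.sort(rng.choice(n_samples, size=k, replace=False))
    sub_counts = transcript_counts[:, cols]
    sub_table = sample_table.iloc[cols].reset_index(drop=True)
    return sub_counts, sub_table


rng = np.random.default_rng(0)            # seeded for reproducibility
counts = np.arange(30).reshape(3, 10)     # 3 transcripts x 10 samples
samples = pd.DataFrame({"sample_id": [f"S{i}" for i in range(10)]})
sub_counts, sub_samples = draw_subsample(counts, samples, 0.8, rng)
```

Row 0 of the toy matrix stores each column's original index, which makes it easy to check that the selected columns and the selected table rows still refer to the same samples.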
- Returns:
Contains stable-edge counts per alpha and a recommended alpha.
- Return type: