Draft - DEGORD - RR by MaartenMarsman · Pull Request #109 · Bayesian-Graphical-Modelling-Lab/bgms

MaartenMarsman · 2026-05-19T11:32:47Z

...

Stage 3 Phase 1a of dev/plans/backlog/hierarchical-ggm-degord-rr.md. Ports the closed-form log Z(G) approximations from the z-project (Route 3a + DEGORD pilot) into bgms as numerics primitives, ready for the hierarchical-spec MH wiring in Phases 3-4. - src/models/ggm/log_z_nlo.{h,cpp}: - log_Z_NLO_gamma(G, alpha, beta, sigma, include_F, delta): general-alpha Laplace + NLO closed form for the spikeslab-+tilt GGM normaliser. - log_Z_NLO_gamma_degord(G, i, j, ...): toggle-endpoint reordering wrapper (permutes (i, j) to positions (0, 1)). - log_Z_NLO_gamma_delta_incr_alpha1(G_before, i, j, ...): O(deg^2 + deg*q) incremental log-Z difference under a single-edge toggle at alpha = 1. - src/log_z_test_interface.cpp: Rcpp exports for unit testing. - tests/testthat/test-log-z-nlo.R: 437 assertions covering parity against the z reference (bit-exact, max abs diff = 0 across 108 grid cells), alpha = 1 incremental vs full-recompute (machine-precision), DEGORD permutation consistency, and analytic empty-graph constant. - tests/testthat/fixtures/log_z_nlo_reference.rds: z-reference fixture (regeneratable via dev/numerical_analyses/generate_log_z_fixture.R).

codecov · 2026-05-19T11:46:15Z

Codecov Report

❌ Patch coverage is 74.86157% with 227 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.30%. Comparing base (d1fe752) to head (321c1a8).

Files with missing lines	Patch %	Lines
src/savage_dickey_test_interface.cpp	34.39%	103 Missing ⚠️
src/models/ggm/ggm_model.cpp	85.54%	48 Missing ⚠️
src/math/savage_dickey/cubic_mode.cpp	81.88%	25 Missing ⚠️
R/bgm_spec.R	60.46%	17 Missing ⚠️
R/bgm.R	12.50%	14 Missing ⚠️
R/sample_ggm_prior.R	37.50%	5 Missing ⚠️
src/math/cholesky_helpers.cpp	89.18%	4 Missing ⚠️
src/models/ggm/ggm_model.h	76.47%	4 Missing ⚠️
R/build_output.R	0.00%	2 Missing ⚠️
src/sample_ggm.cpp	77.77%	2 Missing ⚠️
... and 3 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #109      +/-   ##
==========================================
- Coverage   86.62%   85.30%   -1.33%     
==========================================
  Files          77       82       +5     
  Lines       13076    13919     +843     
==========================================
+ Hits        11327    11873     +546     
- Misses       1749     2046     +297

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Stage 3 Phase 1b of dev/plans/backlog/hierarchical-ggm-degord-rr.md. At alpha > 1, H_e gains a -4 beta (alpha - 1) / M_v[v] piece that depends on the larger endpoint, so a single-edge toggle (i, j) cascades H_e changes to every edge incident to i (forward AND backward). The alpha = 1 locality decomposition (vertex_i_contribs only) no longer suffices. New approach: partition log_Z_NLO_gamma into "instances touching V_a = {i, j} ∪ N_G_before(i)" vs "instances not touching V_a", with the latter invariant under the toggle and cancelling in the difference. - src/models/ggm/log_z_nlo.cpp: - partial_log_Z_NLO_gamma_on_V (private): full-recompute structure with an any-index-in-V_a filter on every term type (log_C0 per-edge and per-vertex, log_LO, B_e, C_{e,k}, D_v, E_{lm}+F, N_v, M_e). M_v and H_e are still computed on the full graph; the filter only decides inclusion in the partial sum. - log_Z_NLO_gamma_delta_incr_alphaN (public): partial(G_after, V_a) - partial(G_before, V_a). Reduces to log_Z_NLO_gamma_delta_incr_alpha1 at alpha = 1 to machine precision. - tests/testthat/test-log-z-nlo.R: 816 new assertions across q in {3, 5, 7}, alpha in {1.5, 2, 3}, delta in {0, 0.5, 1}, include_F in {FALSE, TRUE}, all toggles. Two new test_that blocks: - alphaN vs full-recompute difference at alpha > 1 (within 1e-10). - alphaN(alpha = 1) vs the existing alpha = 1 incremental (within 1e-10). All 1253 assertions in the file pass at machine precision. Cost is O(q^2) per toggle: the outer loops are still over all edges and non-edges (filter, not loop-bound, restricts inclusion). The plan's O(deg^2 + deg * q) optimisation is a future loop-bound restructuring that can be tested against this implementation.

Replaces the O(q^2) any-index-in-V_a filter with an O(deg^2 + deg * q) canonical-representative enumeration. Each qualifying instance (edge, triple, non-edge, vertex term) is now walked exactly once via its lowest-index V_a member, eliminating the O(q^2) and O(q^3) scans on the heavy term types. Strategy: - Precompute forward/backward adjacency lists in O(|E|), reusing them for nu, H_e, T, and S sums. - Edge-indexed terms (log_LO, B_e, M_e): for each p in V_a, walk fwd[p] and bwd[p] (with !in_V[l] gate for the larger-endpoint case). - Vertex-indexed terms (lgamma, D_v, N_v): direct loop over V_a_idx. - Triple-indexed C_{(a, b), k}: three cases for canon = p, k=p, a=p, b=p with gates k not in V_a (case 2) and a, k not in V_a (case 3). - Non-edge E_{lm}: for each p in V_a, walk (p, m) and (l, p) pairs with the !in_V[l] gate for the larger-endpoint case; common-pred sum walks bwd[l] and tests G(k, m). Measured speedup over the full-recompute difference baseline on a sparse 20%-density graph (100 random toggles, alpha=2, delta=0.5): - q= 20: incremental ~ full-diff (parity; constant-factor regime). - q= 50: 1.2x. - q=100: 2.9x. All 1253 assertions in tests/testthat/test-log-z-nlo.R continue to pass at machine precision (Phase 1a fixture, Phase 1b parity, alpha=1 reduction).

…e 2) Stage 3 Phase 2 of dev/plans/backlog/hierarchical-ggm-degord-rr.md. Ports the inner Zhat machinery from ~/SV/Z/R/src/degord_sampler.h (v4, 2026-05-18) into bgms, hooked to SafeRNG. New files --------- - src/models/ggm/degord_sampler.{h,cpp}: - ChainAux (per-chain (alpha, beta, sigma, delta) constants + per-nu transcendental tables). - PiAux (per-permutation: G_pi, nu_pi, E_count, log_C0). - permute_graph / degord_permutation (toggle endpoints to (q-2, q-1)). - phi_pi_sample_from_noise (inner Bartlett-Cholesky kernel). - log_Zhat_pi_from_pool (main entry; log-sum-exp over importance weights). - log_Zhat_pi_from_pool_cache + PoolCache (cached variant for delta-toggle reuse). - row_qm2_logw_from_S (single-row recompute under trailing toggle). - delta_log_Zhat_pi_toggle (efficient delta with cache reuse). - draw_bartlett_pool (SafeRNG-based standard-normal pool). Two bug fixes folded in vs the z-project v4 reference ----------------------------------------------------- Both fixes were derived during the bgms port audit and confirmed correct on the z side; they are bit-verified and land here with regression assertions: 1. Cache rw_head fix (Option 2). The v4 rw_head spanned only rows 0..q-3, so the star aggregation in delta_log_Zhat_pi_toggle omitted row q-1's diagonal log-weight even though it is invariant across (curr, star) for any toggle (nu_pi[q-1] = 0 always under the DEGORD permutation). The omission left a sample-shifted bias that grew with the tilt parameter delta. Fix: rw_head extended to include row_logw[q - 1]. 2. z_trail slot fix (slab_tilt_mode == 1). delta_log_Zhat_pi_toggle passed z_trail = 0.0 hardcoded to row_qm2_logw_from_S, dropping the saddle-shifted slab innovation noise[q + edge_offset(q-2, q-1)] = noise[q + (q-2)(q+1)/2] when slab_tilt_mode == 1. Fix: read the actual slab slot via noise_pool.colptr(slab_idx). Tests ----- - tests/testthat/test-degord-sampler.R: 1526 assertions covering: - ChainAux nu-tables vs the z reference (local-only, gated on z source). - log_Zhat_pi_from_pool bit-parity on shared noise pool via the fixture at tests/testthat/fixtures/degord_sampler_reference.rds (regeneratable via dev/numerical_analyses/generate_degord_fixture.R). - DEGORD permutation correctness (i, j) -> (q-2, q-1). - delta_log_Zhat_pi_toggle EQUALS the direct full-recompute difference log_Zhat(star) - log_Zhat(curr) at machine precision under BOTH slab_tilt_modes. The (q=10, delta=0, tilt=1, toggle=(3,9)) cell is an explicit regression row. - delta_log_Zhat_pi_toggle bit-parity vs the z reference (local-only). - var(log_Zhat) scales as 1/M_inner (Phase 2 acceptance). - draw_bartlett_pool shape, normality, seed determinism. Acceptance criteria from the Phase 2 plan section are met: - log_Zhat_pi_from_pool agrees with the z reference at bit-parity on shared input (max abs diff = 0 across the 108-cell fixture grid). - var(log Zhat) scales as 1/M_inner (M*var constant at ~0.35 across M in {30, 100, 300, 1000}; ratio var(30)/var(1000) = 36.3 vs theoretical 33.3). bgms-side Phase 1 work (log_Z_NLO_gamma + alpha=1 incremental + general-alpha incremental + V_a-pivot optimisation) is unaffected and its 1253 assertions still pass.

Stage 3 Phase 3 of dev/plans/backlog/hierarchical-ggm-degord-rr.md. The per-toggle unbiased 1/Z(Γ) estimator that composes the Phase 2 DEGORD Bartlett-Cholesky sampler into the Lyne-style Russian-Roulette geometric series for the hierarchical-spec MH ratio. New files --------- - src/models/ggm/z_ratio_estimator.{h,cpp}: - V_at_Gamma_pi_degord(K_depth, pools_t, pi_aux, chain_aux, c, rho): V = (1/c) · [1 + sum_{n=1..K} (-1)^n · prod_{i=1..n} (Zhat_i - c)/c / rho^n] with Zhat_i = exp(log_Zhat_pi_from_pool(pools_t[i-1], pi_aux, chain_aux)). Returns the signed V (caller tracks sign for ergodic averaging). - draw_U_degord_rr(rng, K_depth, pools_t, M_inner, q, rho): K_depth ~ Geom(1 - rho) via inverse-CDF on SafeRNG uniform; pools_t[n] is (dim x M_inner) iid N(0, 1) in the pre-transposed layout required by phi_pi_sample_from_noise. - ZRatioState (declaration only): the per-Γ state the Phase 4 chain will own (pools_t, K_depth, kappa, rho, log_Z_NLO_curr, sign_curr). - Rcpp test exports in src/log_z_test_interface.cpp: - degord_V_at_Gamma_pi_cpp (List interface for pools_t). - degord_draw_U_rr_cpp returns List(K_depth, pools_t). Tests ----- - tests/testthat/test-z-ratio-estimator.R: 10 assertions covering: - V is unbiased for 1/Z within MC noise at q=5: mean(V) across 1000 independent (K_depth, pools) draws sits within 4 SD of 1/Z computed via a high-precision (M=5000, n=20) log_Zhat truth proxy. - K_depth ~ Geom(1 - rho): mean, P(K=0), P(K>=5) all match within n=10000 MC tolerance. - V at K_depth = 0 equals 1/c exactly (machine precision). - draw_U_degord_rr is deterministic in the SafeRNG seed. - Signed V picks up the alternating series when c is far from 1/Z (small c forces negative samples at K_depth >= 1). Cited references ---------------- - Direct port of ~/SV/Z/R/src/branchB_chain_route3a_degord.cpp:192-228 for V_at_Gamma_pi_degord and :215-228 for the pool draw. - Lyne (2015) Russian-Roulette construction; route3a_V_helpers notes/exactness-proposition.md §2 for radius/moment conditions.

…er_pair (Phase 4a) Stage 3 Phase 4a of dev/plans/backlog/hierarchical-ggm-degord-rr.md. Wires the Phase 2 DEGORD sampler + Phase 3 V/RR estimator into bgms's GGM between-edge MH path. Converts the joint-spec MH ratio into the hierarchical-spec ratio by multiplying by V(Γ_curr)/V(Γ_star), an unbiased estimator of Z(Γ_star)/Z(Γ_curr). This is the C++ infrastructure half of Phase 4. The R API surface (bgm(..., graph_prior_spec = "hierarchical")) is Phase 4b, a separate plumbing job. Changes ------- - src/models/ggm/ggm_model.h - New enum class GraphPriorSpec { Joint, Hierarchical } (header-level, outside the class). - GGMModel gains: void set_graph_prior_spec(GraphPriorSpec); void set_z_ratio_tuning(int M_inner, double kappa, double rho); plus private members graph_prior_spec_, v_M_inner_, v_kappa_, v_rho_, log_Z_NLO_curr_, v_K_depth_, v_pools_t_, chain_aux_degord_, and cached prior-family parameters (prior_sigma_, prior_alpha_, prior_beta_). - src/models/ggm/ggm_model.cpp - ensure_hierarchical_state_(): lazy init. Validates the slab is NormalPrior and the diagonal is GammaScalePrior (throws otherwise). Builds chain_aux_degord_; full-recomputes log_Z_NLO_curr_ at the current Γ; draws the first U-pool. - refresh_z_ratio_pool_(): fresh (K_depth, pools_t) draw per iteration, mirroring the z chain's refresh_U_every = 1 convention. - prepare_iteration() now calls ensure + refresh when in Hierarchical mode. - update_edge_indicator_parameter_pair (both branches): after the existing joint MH ratio is assembled, if Hierarchical mode is active, compute log_Z_NLO_star (incremental at alpha=1, full recompute at alpha != 1), build PiAux at the DEGORD permutation sending (i, j) to (q-2, q-1), compute V_curr and V_star, and add log|V_curr|/|V_star| to ln_alpha. Auto-reject on non-finite or zero |V| (Lyne 2015 convention). On accept, log_Z_NLO_curr_ <- log_Z_NLO_star. - Defensive note retained that the determinant_tilt_ * log_det_ratio_edge term is identically 0 under Roverato slaving (verified during the Z-side audit and reconfirmed by direct calculation here). - src/priors/parameter_prior.h - Adds public scale()/shape()/rate() accessors on NormalPrior and GammaScalePrior so the hierarchical state can read the prior params without coupling to the construction site. - src/log_z_test_interface.cpp - New ggm_hierarchical_smoke_cpp entry: constructs a GGMModel directly, switches it to Hierarchical, runs n_sweeps Metropolis edge-update sweeps, returns final_edges + n_edges_path. - tests/testthat/test-ggm-hierarchical.R - 13 assertions across 3 test_that blocks: 1. Chain runs without crash and edge counts stay in [0, p(p-1)/2] across 200 sweeps at q=5, n=100 random data. 2. Output is deterministic in the seed (same seed -> identical output; different seed -> different trajectory). 3. Chain remains non-degenerate across delta in {0, 0.5, 1.0} (no all-zero or all-full lock-in). Existing Phase 1-3 testthat assertions (1253 + 1526 + 10) continue to pass. The default Joint mode (existing chain semantics) is unchanged. Acceptance criteria from the Phase 4 plan section met for the C++-only half: - bgm(... graph_prior_spec = "hierarchical") will finish on small synthetic data once Phase 4b wires the R-side argument. - Per-sweep wall: q=5 200-sweep run completes in well under a second on M5 Pro at M_inner=100. Full benchmarking deferred to Phase 5. Phase 5 (SBC validation vs AKM-J truth across the delta-sweep) and Phase 4b (R API surface) are the remaining pieces of Stage 3.

… 4b) Stage 3 Phase 4b: exposes the Phase 4a hierarchical-spec inference path through the bgm() R interface. R API ----- - bgm() gains two new arguments: graph_prior_spec = c("joint", "hierarchical") # default "joint" z_ratio_tuning = list(M_inner = 100L, kappa = 1.0, rho = 0.5) - Both arguments are plumbed through bgm_spec() -> build_spec_ggm() -> run_sampler_ggm() -> sample_ggm() -> GGMModel::set_graph_prior_spec / set_z_ratio_tuning. - Validation in bgm_spec(): - graph_prior_spec must be "joint" or "hierarchical". - Hierarchical requires model_type in {"ggm", "mixed_mrf"} (continuous precision block needed). Errors helpfully if mismatch. - Hierarchical requires interaction_prior_type == "normal" and scale_prior_type == "gamma" (the prior families for which the closed-form Laplace-NLO normaliser approximation is implemented). Validated up front so the user gets a clean error before MCMC starts. - z_ratio_tuning components validated: M_inner positive integer, kappa > 0, rho in (0, 1). C++ glue -------- - sample_ggm.cpp: adds graph_prior_spec / z_ratio_M_inner / z_ratio_kappa / z_ratio_rho function args, calls set_z_ratio_tuning + set_graph_prior_spec when "hierarchical". - GGMModel copy constructor (header) now also copies graph_prior_spec_ + v_M_inner_ + v_kappa_ + v_rho_ so multi-chain runs propagate the setting. Lazy state (pools, log_Z_NLO_curr, chain_aux_degord) is intentionally reset on clone - pools are RNG-derived and would otherwise share random draws across chains. Roxygen ------- - bgm.R gets full @param entries for graph_prior_spec and z_ratio_tuning. man/bgm.Rd regenerated via devtools::document(). Tests ----- tests/testthat/test-ggm-hierarchical.R adds 10 new R-API assertions: - bgm() with graph_prior_spec = "hierarchical" returns a posterior inclusion matrix with valid shape and values in [0, 1]. - Cauchy slab + hierarchical errors with a helpful message naming the required interaction_prior_type. - Pure ordinal data + hierarchical errors naming "continuous". - z_ratio_tuning rejects M_inner = 0, kappa < 0, rho in {0, 1}. All 23 assertions (13 C++ smoke + 10 R-API) pass. Phase 1-3 tests (1253 + 1526 + 10) continue to pass; default Joint path is unchanged. What is left in Stage 3 ----------------------- - Phase 5: SBC validation against AKM-J truth across the delta-sweep at q = 10 (the manuscript Table 5 cells, post the Z-side disable_log_r fix). - Phase 6: vignette section + cross-references in the spikeslab companion.

Two CI failures on PR #109: 1. tests/testthat/fixtures/{log_z_nlo_reference,degord_sampler_reference}.rds were absent from R CMD check because .Rbuildignore had the blanket pattern `^tests/testthat/fixtures$`, which the original author had intended to exclude only the legacy-fixture subdirectory but which in fact swallows every fixture below it. Tighten the pattern to `^tests/testthat/fixtures/legacy$` so the live fixtures ship; verified via R CMD build that the resulting tarball contains `tests/testthat/fixtures/degord_sampler_reference.rds` and `tests/testthat/fixtures/log_z_nlo_reference.rds` while continuing to exclude the `legacy/` subdirectory. 2. lintr semicolon_linter warnings on `; var <- val` compound statements in three test files. Rewrite each compound-on-one-line as newline-separated statements: - tests/testthat/test-degord-sampler.R (4 sites) - tests/testthat/test-ggm-hierarchical.R (3 sites) - tests/testthat/test-z-ratio-estimator.R (3 sites) All 4 test files re-run clean locally: 1253 + 1526 + 10 + 23 = 2812 assertions pass. `lintr::lint_package()` reports 0 lints.

…(Phase 5) Stage 3 Phase 5: simulation-based calibration of the bgms hierarchical- spec chain at p = 5 across delta in {0, 0.5, 1, 2}, R = 300 replications. Scope decision (q = 5 vs q = 10) -------------------------------- The methodology paper validates at q = 10 against AKM-J perfect-sampler truth, but AKM-J hits its max_proposals cap on roughly 23% of delta = 2 draws (selection bias on hard high-tilt cases that the unbiased Lyne RR chain is designed to handle correctly). Restricting bgms's SBC harness to q = 5 sidesteps this: at q = 5, sample_ggm_prior()'s constrained NUTS at fixed Gamma_true converges cleanly across the full delta sweep, so the truth source is trustworthy. Harness ------- Per replicate at each delta: 1. Gamma_true ~ Bern(0.5) on the upper triangle. 2. K_true | Gamma_true via sample_ggm_prior(n_warmup = 500, n_samples = 1). Rebuild K_true using offdiag_names (row-major) rather than upper.tri() (column-major) so the off-diagonal entries land at the right (i, j); without this the matrix is non-PD. 3. Y | K_true ~ N(0, K_true^{-1}), n_obs = 100. 4. bgm(Y, ..., graph_prior_spec = "hierarchical"), iter = 400, warmup = 100. 5. Rank K_ii_true within posterior raw_samples$main[[1]][, i]. Both sides are on the same K_yy partial-association scale; verified empirically (mean K_diag_true matches mean main posterior at near- empty graphs, low delta). Tests ----- - KS uniformity test per K_ii per delta (5 x 4 = 20 assertions), pass at alpha = 0.01. - Gamma marginal calibration: posterior mean inclusion across the 300 R reps tracks empirical Gamma_true frequency within 4 SE (binomial gap at R * p(p-1)/2 = 3000 draws). One assertion per delta (4 total). Total: 24 assertions, env-gated behind BGMS_RUN_SLOW_TESTS = true. Runtime ------- Smoke run at R = 30 (delta = 0.5) takes ~12 s; full R = 300 across delta in {0, 0.5, 1, 2} extrapolates to ~8 min on M5 Pro. Verified ranks uniform at R = 30 with KS p in [0.11, 0.84] across all 5 K_ii. Why edge-indicator ranks aren't tested directly ----------------------------------------------- Spike-and-slab makes per-(i, j) gamma_ij a discrete {0, 1} draw, which breaks standard SBC's continuous-rank protocol (tied ranks degrade the KS test's calibration). The existing test-sbc-ggm.R sidesteps this by running WITHOUT edge selection. Here we run WITH edge selection (it's the load-bearing case for hierarchical-spec) but only rank-test K_ii. The gamma marginal calibration check is a sanity assertion, not strict SBC.

Two compounding fixes to the hierarchical-spec V(Γ, U) estimator. F5 - log-space V. The linear V_at_Gamma_pi_degord materialises c = κ·exp(log_Z_NLO) which underflows to 0 at large p (log_Z_NLO ≈ -3500 nats at p=100, δ=1), turning every MH proposal into a silent auto-reject. V_log_at_Gamma_pi_degord factors V = (1/c) · S and computes log|S| via signed log-sum-exp over the K_depth + 1 truncated-series terms. log_κ cancels in the MH ratio; the ADD/DELETE branches of update_edge_indicator_parameter_pair now pass log_c per Γ and auto-reject on non-finite log|V| or sign flip across Γ_curr/Γ_star (Phase-1 stand-in until a Lyne sign accumulator lands). F6 - within-toggle cache reuse. log_Zhat_star_from_cache mirrors delta_log_Zhat_pi_toggle's per-sample loop but exposes log_Ẑ_star rather than the delta. V_log_pair_at_Gamma_curr_star_degord runs one cached Phi build under a_curr per pool and re-evaluates only row q-2 under a_star, halving the inner Phi rebuild cost. Both ADD/DELETE branches now invoke the paired call. Tests: - test-z-ratio-estimator.R: log-vs-linear bit-equality at q ∈ {5, 10, 20} (max log_abs gap < 1e-9); underflow regression at log_c = -3500; paired-vs-independent bit-equality across q ∈ {5, 10, 20}, K_depth ∈ {0, 2, 5}; cache adapter matches a fresh log_Zhat_pi_from_pool. - test-ggm-hierarchical.R: p=20 cell runs a full bgm() at δ=1, hierarchical-spec; asserts finite indicators in [0,1] and a moving chain. 3018 assertions across degord-sampler / log-z-nlo / z-ratio / ggm-hierarchical green; full suite (9736 PASS) also green.

Phase 4a smoke validated AMH + hierarchical only. Mirror cell at update_method = "nuts", same p / iter / warmup / prior, confirms the 2x2 cross-product (NUTS/AMH × joint/hierarchical) is genuinely a clean cross-product: the V(Γ, U) hook lives entirely in the between-edge MH update path, so NUTS only governs the within-model continuous block. No R-layer guard was needed. Asserts finite indicators in [0, 1], correct shape, and a moving edge-count trajectory (chain not locked at empty or full graph).

cholesky_update_after_{edge,diag} replaced the per-accept O(p^3) arma::solve(trimatu(L), I) + inv_L * inv_L.t() covariance refresh with one Sherman-Morrison-Woodbury rank-2 update (4 x O(p^2)) for the edge path and one rank-1 update for the diagonal path. The Cholesky factor L is still maintained via the existing cholupdate / choldowndate pair (needed for get_log_det). inv_cholesky_of_precision_ is no longer maintained per accept - only refresh_cholesky() touches it - so it's effectively scratch state owned by the refresh path. Capacitance singularity (|det C| < 1e-14 in the edge path, |1 + alpha * cov(i,i)| < 1e-14 in the diagonal path) falls back to refresh_cholesky(), preserving the prior safety net. End-of-sweep drift check check_and_refresh_if_drift_() guards SMW-accumulated FP error by computing max_i |sum_k cov(i,k) * K(k,i) - 1| in O(p^2) and refreshing when it exceeds kCovDriftTol_ = 1e-8. Wired into both do_one_metropolis_step (within-model sweep) and update_edge_indicators (between-model sweep). Net effect on AMH GGM edge-update per accept: 2 x O(p^3) (solve + outer product) -> 4 x O(p^2) + O(p^2) check. Full test suite (9736 PASS) green; existing GGM cells (85 PASS) and the p=20 hierarchical regression cell exercise the new path without spurious refreshes.

Per-iteration sign(V_curr) and log|V_curr| are now maintained inside GGMModel and surfaced as a top-level v_ratio_diagnostics field on the bgm() return for hierarchical-spec fits. Joint-spec fits leave the field NULL. GGMModel carries current_sign_V_ / current_log_abs_V_ / v_diag_initialized_. Both branches of update_edge_indicator_parameter_pair seed the diagnostic from V_pair.curr on the first finite proposal and advance to V_pair.star on accept. The state is stable on reject - matches the spec's "sign stays at the previous step's value on auto- reject" rule. BaseModel gains three virtuals (has_v_ratio_diagnostics, current_sign_V, current_log_abs_V) with safe defaults; GGMModel overrides them under graph_prior_spec_ == Hierarchical. ChainResult gains v_sign_samples / v_log_abs_samples and the matching reserve / store helpers, mirroring the existing AM/NUTS diagnostic pattern. R-side: convert_results_to_list packs per-chain v_sign / v_log_abs. build_raw_samples_list (GGM branch) renames to the trailing-__ convention. build_output_bgm assembles a top-level v_ratio_diagnostics = list(sign = list-of-chains, log_abs = ...). bgms_class declares the S7 property and s3_list_to_bgms passes it through (both required - missing either silently NULL-coerces the field via the S7 accessor). Test in test-ggm-hierarchical.R confirms presence, shape, finite log_abs values, and that >= 95% of recorded signs are +1 in the operational cell (q=5, δ=0.5, κ=1, ρ=0.5); joint-spec fit must leave the field NULL. Full suite: 9746 PASS, 0 FAIL.

New exported bgms_posterior_mean(fit) computes the Lyne (2015) sign-corrected ergodic estimator E[f(K) | Y] ≈ Σ sign_iter · f(K_iter) / Σ sign_iter for the pairwise, indicator, and residual-variance fields. Under the operational regime (low δ, reasonable κ) sign === +1 and the correction collapses to the plain colMeans-based posterior mean; sign correction matters at high δ or aggressive κ where the Russian-Roulette estimator of 1/Z(Γ) can flip sign. Joint-spec fits and any fit without v_ratio_diagnostics fall through to the existing posterior_mean_* fields unchanged. Raises a helpful error when the sign vector sums to zero (degenerate regime; remediation is to raise kappa). Three test_that cells in tests/testthat/test-ggm-hierarchical.R exercise the helper: - Passthrough: on a sign === +1 hierarchical fit, the helper returns bit-equal results vs the plain posterior_mean_* fields; on a joint-spec fit, returns the plain fields unchanged. - Sign-correction math: mutates the sign vector on a real hierarchical fit (via a plain-list view to skirt S7 immutability) to a 2/3 +1, 1/3 -1 pattern and asserts the helper's output equals the hand-computed sign-weighted mean while differing from the plain mean by more than numerical noise. NULL fields preserve their slot via the `out["main"] = list(NULL)` idiom so the returned list always has the same names regardless of model type. Full suite: 9756 PASS, 0 FAIL.

The hierarchical-spec correction in update_edge_indicator_parameter_pair had the sign of the V(Γ_star)/V(Γ_curr) factor inverted in both ADD and DELETE branches. The math: T_hier(K, Γ) = T_joint(K, Γ) / Z(Γ) T_hier_ratio(c→s) = T_joint_ratio × Z(Γ_curr)/Z(Γ_star) = T_joint_ratio × V(Γ_star)/V(Γ_curr) [V ≈ 1/Z] ln_alpha_hier += log|V_star| - log|V_curr| bgms had `ln_alpha += V_pair.curr.first - V_pair.star.first` (the reverse). The accompanying comment also stated the wrong direction. Z's reference chain (branchB_chain_route3a_degord.cpp:412) has the correct sign: log_r = std::log(std::abs(V_star)) - std::log(std::abs(V_curr)); Why the q=5 SBC didn't catch it. The gamma-marginal calibration test tolerance scales as 4·sqrt(p_inc·(1-p_inc) / (R · n_edges)). At q=5, n_edges=10, R=300 the tolerance is ~0.037; at q=10, n_edges=45, R=500 it tightens to ~0.013. The bias scales with q (more edges accumulate the misdirected V-weighting); at q=5 the bias was within the looser tolerance, at q=10 it shows clearly. Validation: q=10 SBC sweep at the Z-matched chain config (n_warmup = n_iter = 2000, M_inner = 50, kappa = 2.0, rho = 0.5, R = 500) across delta ∈ {0, 1, 2}. Pre-fix Post-fix delta = 0 8/11 FAIL 10/11 PASS (K_ii[5] ks_p=0.008, borderline) delta = 1 1/11 FAIL (gamma) 11/11 PASS delta = 2 4/11 FAIL 11/11 PASS Total 13/33 FAIL 32/33 PASS The single borderline failure at delta=0/K_ii[5] (ks_p=0.008 vs the 0.01 threshold) is well within expected multiple-testing noise (~0.3 false positives expected across 30 K_ii cells at alpha=0.01). New gated slow test tests/testthat/test-sbc-ggm-hierarchical-q10.R locks the fix in - 22.6 min wall on 12 cores at this config.

Two cross-validated changes to the hierarchical-spec V/RR machinery from the 2026-05-21 companion-AI exchange. mh_U: at sweep boundary, refresh the auxiliary U-pool via a V-ratio MH step instead of a fresh draw from the prior. Fresh-from-prior broke the augmented PMMH target (p(U|G,K) is V*mu, not mu alone) and caused a small but systematic Γ-marginal bias (~-0.001 nats at p=20, p_inc=0.05; 5/5 seeds same sign). MH-on-U with proposal symmetry on (μ, P(N)) reduces the MH ratio to log|V_new|-log|V_old|; V_old is read from the running V cache to avoid a redundant evaluation. Verified on bgms-side: bias collapses to MC noise (2 neg / 2 pos across 4 seeds at p=20). manuscript_NLO: port of the App C eq:NLO-decomp closed form (R/src/ manuscript_NLO.h, 2026-05-21). At α=1, δ=0 bit-identical to bgms's log_Z_NLO_gamma; diverges in the B_e term at δ>0. With mh_U on, this becomes a mixing-efficiency lever rather than a correctness fix, but at p=20, p_inc=0.05 the centring choice still affects the residual bias (5.8× reduction in measured bias for mNLO vs bgms NLO without mh_U). End-of-chain diagnostics_summary surfaces per-direction MH auto-reject classifications and mh_U accept/attempt counters. Both flags default false; SBC-clean baselines untouched.

# Conflicts: # R/RcppExports.R # src/RcppExports.cpp

Checkpoint commit before pausing the V/RR work for the GG-prior project. Bundles four logically separable changes: 1. Sign-flip auto-reject removed. Chain now targets the Lyne (2015) |V|-augmented density π·|V|·μ·P(N) and tracks sign(V) per iteration; ergodic averages use the existing bgms_posterior_mean() helper for the sign-corrected mean. Empirical mean(sign(V)) = 1.0 in operational cells, so the correction collapses to the plain mean — but the prior auto-reject was a directionally biased approximation we are no longer taking. 2. Live diagnostic. chain_runner writes per-iteration (K_depth, sign, wall, mh_U accept-window) lines to BGMS_LIVE_DIAG=<path> when set; surfaces the warmup→sampling cost cliff and the K-dwell pattern that drives slow PMMH-on-RR mixing. 3. WarmupSchedule-gated U refresh. mh_on_U_step moved out of prepare_iteration into the new BaseModel::refresh_auxiliary_u() hook called only when schedule.u_refresh_enabled(iter) is true (i.e., stage 3c + sampling). Stops the U/K_depth state from drifting via PMMH dynamics during stages 1-3b when no between-Γ moves consume it. 4. Local-K kernel. mh_U_local_K_ flag activates a local random-walk MH on the RR truncation depth (K_new ∈ {K_old−1, K_old+1} with reflection at 0). Symmetric proposal, geometric-prior ratio ρ^(K_new−K_old) in the MH numerator, ±log 2 boundary corrections. Mixed 1-in-50 with the fresh-from-prior global step to keep the long- jump escape route alive. Attacks the K-dwell trap that the global kernel cannot escape efficiently at moderate p_inc. 5. Plug-in mNLO mode (plug_in_nlo_ flag). Bypasses the entire V/RR/U machinery in the between-Γ MH; uses log_Z_NLO(Γ_curr) − log_Z_NLO(Γ_star) as the deterministic Z-ratio approximation. Trades small bounded bias for predictable cost; per-toggle wall is flat in p instead of scaling with the random K_depth. At p=20 across p_inc ∈ {0.05, 0.5, 0.95} the plug-in chain runs 30-70× faster than the exact RR+mh_U+local-K chain with similar Γ-marginal bias. Empirical findings (held for the SD/companion follow-up): - Local-K works at p=20 sparse (mean K 3.03 → 2.15) but breaks at p=20 dense (mean K → 11) and saturated (mean K → 29). The augmented marginal genuinely lives at high K under heavy-tailed |V| from imperfect centring; the kernel is correct but operationally costly. - Γ-marginal correctness gate at q=10/20 with the local-K kernel: per-edge pip vector matches Bern(p_inc) within autocorr-adjusted MCSE; sign(V) ≈ 1 throughout. Bias is a *cost* problem (slow Γ mixing at high K), not a kernel problem. - Plug-in mNLO is the practical out: q=20 joint takes seconds, exact- RR takes 15-30 minutes for the same chain length, plug-in takes the same seconds as joint with bias bounded by the mNLO centring error. No tests added; this is a checkpoint, not a PR. Defaults preserve existing chain behaviour (mh_U=FALSE, plug_in_nlo=FALSE, etc.).

…caffolding Adds an SD-based MH ADD/DELETE chain for edge indicators in L-coordinates (K = L Lᵀ), behind the use_sd_between_step + use_sd_lspace flags. The L-space parameterisation removes the PD-cone constraint from the proposal (any l_ji ∈ ℝ yields PD K), and at α=1 the conditional on l_ji is closed-form Gaussian, so the per-edge between-step is exact Gibbs — no Newton, no NLO, no truncation residual. Validated prior-only at q=50, σ ∈ {0.25, 1.0}, p_inc ∈ {0.05, 0.30, 0.50}, α=1, δ=0: all six cells unbiased (|z| ≤ 1), PD-revert canary fires only at genuine numerical near-PD events and the chain reverts cleanly. Components: - src/models/ggm/sd_density_at_zero.{h,cpp}: 1D Laplace+NLO primitive for the SD log-density at K_ij = 0 (ported bit-for-bit from Z's reference; 641 tests pass against the standalone primitive). - src/math/cholesky_helpers.{h,cpp}: perm_to_trailing_2x2 — Bunch-style symmetric adjacent swaps + Givens rotations to extract (l_ii, l_ji, l_jj) from the DEGORD-permuted Cholesky factor. Bit-for-bit against chol(P K Pᵀ) across p ∈ {3..20} (7685 tests pass). Kept as a library helper; the SD hot path uses the simpler direct-Σ read after per-accept arma::chol keeps L exactly in sync with K. - update_edge_indicator_parameter_pair_sd_lspace: per-edge SD between-step reading Σ_BB directly from covariance_matrix_, computing (l_ii, l_ji) via the 2×2 inverse, applying the closed-form Gaussian conditional, MH ratio in L-coords (no Jacobian — cancels between numerator and denominator), and committing via per-accept arma::chol of K (O(p³)) with a PD-revert canary on failure. - n_pd_reverts_ counter + Rcpp warning at end of chain when non-zero. - Test interfaces ggm_sd_smoke_cpp (SD chain prior-only/PIP) and ggm_plug_in_smoke_cpp (DEGORD plug-in mNLO comparator). Two architectural variants attempted and reverted (see git diff history in branch): SMW-only accept (skip per-accept chol; maintain Σ via SMW) and drift-bounded L (rank-2 chol-up + SMW + periodic refresh). Both fail because the algebraically-updated K and the rank-r-updated L (or Σ) drift apart, and the next periodic chol(K) finds K numerically non-PD. Keeping K = L Lᵀ exactly without per-accept full refactorisation appears to require recomputing K from L every accept (O(p³/2)) — same order as arma::chol, so no real speedup. Wall time at q=50, 3 cells parallel: sparse (p=0.05) ~43s, dense (p=0.50) ~410s; matches the per-accept O(p³) cost. Plug-in mNLO under prior-only is biased (z ranging -405 to -2721 across the six cells; PIPs at 22-50% of target depending on σ). Suggests the plug-in approximation contributes a non-zero between-Γ term under the prior — useful diagnostic, separate follow-up.

…H proposal correction Extends the L-space Savage-Dickey between-step beyond the closed-form α = 1 case. At α > 1 the conditional on l_ji gains a non-Gaussian (α − 1) log(s_jj + l_ji²) term — log-concave over ℝ in operational regimes — and Gibbs sampling is no longer available in closed form. Two new primitives: - sd_density_l_space.{h,cpp}: Newton + Laplace + Tierney-Kadane NLO over ℝ. Fast (one Newton iteration in steady state) and bit-for-bit against the α=1 closed form. But fragile in near-bimodal cells (small s_jj × large α − 1): Newton stalls at saddles where f''(x_mode) ≈ 0, status returns non-zero, and a chain that simply reverts those toggles biases its γ-marginal (we observed z = ±60 in worst cells when using Laplace alone). - sd_density_l_space_quad.{h,cpp}: 64-node Gauss-Hermite quadrature for log_Z. Nodes/weights computed once via Golub-Welsch (eigen-decomposition of the tridiagonal Jacobi matrix) and cached. Numerically reliable across all chain configurations — machine-precision agreement with adaptive quadrature for our smooth integrand, including the cells where Laplace diverged. ~64 log/pow evaluations per call. Chain integration: - α = 1 path unchanged (closed-form Gaussian, validated previously). - α > 1: quadrature primitive for log_BF at m_ij (both post and prior), Laplace primitive as proposal mode/curvature (with Gaussian fallback if Newton fails), explicit MH correction term log[π_target / q_proposal] evaluated at the sample point (ADD) or current state (DEL). The correction closes the proposal-mismatch component of bias that quadrature alone would leave. Validation (q=50 prior-only, σ=1.0, α=2.0, post-fix): | p_inc | PIP | z | reverts | wall | (before, Laplace-only) |-------|--------|-------|---------|-------|---------------------- | 0.05 | 0.0501 | +0.84 | 13 | 93s | PIP=0.0775 z=+60 reverts=51k | 0.30 | 0.3000 | -0.25 | 21 | 296s | PIP=0.3088 z= +9 reverts=27k | 0.50 | 0.5000 | +0.11 | 37 | 449s | PIP=0.4632 z=-35 reverts=29k Skip-revert events down ~1000×; PIPs within ±1σ of the Bernoulli(p_inc) stationary target. Wall time ~2-2.5× the α=1 path due to quadrature overhead (~200 extra log/pow per proposal × ~13M proposals/cell ≈ 50s) — pure tax for non-Gaussian conditional. Speed optimization (hybrid Laplace/quadrature, GH-32, log_Z caching) is the next step on top of this correctness baseline. q=10 full sweep over α ∈ {1.5, 2, 3} × σ ∈ {0.25, 1.0} × p_inc ∈ {0.05, 0.30, 0.50} also validates: max |z| = 2.34, all reverts ≤ 3.

The Gauss-Hermite quadrature for the conditional log-Z at α>1 was using 64 nodes; reduces to 32 with no observable accuracy change (GH is exact for polynomials of degree 2N−1 and the integrand is smooth, so both N values give >10-digit precision for our chain). q=50 prior-only σ=1.0 α=2.0 wall: | p_inc | GH-64 | GH-32 | Δ |-------|-------|-------|------ | 0.05 | 93s | 71s | -23% | 0.30 | 296s | 274s | -7% | 0.50 | 449s | 427s | -5% PIPs and revert counts are bit-identical to GH-64 at all 3 cells. The shrinking relative speedup at higher p_inc reflects the per-accept O(p³) arma::chol increasingly dominating as accept count grows; GH overhead is fixed-ish per proposal and amortises over chol work. A hybrid Laplace+GH path (use Laplace+NLO when status=0, GH otherwise) was tried and reverted: Laplace+NLO at status=0 still has ~1e-2 residual error in many cells, which compounds in the chain into multi-σ bias (z=+38 in worst case at q=10). Pure GH-32 is the right correctness/speed point.

…y handling Five fixes spanning the SD between-step's prior convention and a long-standing silent-drop in bgm(). SBC at q=10 (Γ, K_diag, K_off|γ=1) passes uniformity after these; 40-cell prior recovery unaffected. R/bgm.R: when both `inclusion_probability` (deprecated) and a non-default `edge_prior` are supplied, previously the deprecated value was silently dropped after a warning. Now: honor the deprecated arg when only it is given; require values to match when both supplied; error on conflict. src/models/ggm/ggm_model.cpp (SD between-step): - Slab convention: bgms evaluates the slab at -K_ij/2 (K_yy scale), giving K_ij sd = 2·prior_sigma_. The SD primitive documents slab on K_ij directly, so pass σ_K = 2·prior_sigma_ to align. Affects both K-space (update_edge_indicator_parameter_pair_sd) and L-space variants. - Diag prior factor: K_jj/2 ~ Gamma(α, β) means K_jj ~ Gamma(α, β/2). The L-space SD tau formulas had `2β`; the correct contribution to tau is `β`. Fixed at α=1 and α>1 paths. - Explicit PD canary in update_diagonal_parameter prior_only mode; previously the diag update relied on K_ii > 0 as an implicit PD check, which is too weak and let non-PD K accumulate at shallow det-tilt. src/log_z_test_interface.cpp: ggm_sd_smoke_cpp gains - `sample_thin` to emit thinned K samples for SBC rank analysis, - `edge_selection` flag for fixed-graph diagnostics, - K_offdiag samples are stored as K_ij * γ_ij so inactive edges are exact 0 (eliminates ~1e-18 floating-point noise from Roverato DEL cancellation that otherwise confounds spike-slab rank statistics), - exposes `final_K` for SBC data-simulation harnesses.

- tests/testthat/test-sd-density-at-zero.R: replace compound semicolons with newlines (semicolon_linter warnings, fatal under LINTR_ERROR_ON_LINT) - _pkgdown.yml: add bgms_posterior_mean to Posterior methods section - man/bgm.Rd: regenerate from roxygen (example assignment style sync)

ggm_sd_smoke_cpp gains a within_k_mode flag ("am" | "nuts"). The "nuts" path runs an unconstrained NUTS step on the active theta each sweep (dual-averaging step-size adaptation in warmup), composed with the SD between-step that toggles edges. Lets SBC isolate the within-K choice from the SD between-step's contribution to chain dynamics. Under prior_only=TRUE the smoke now constructs the model with n=0 and S=0 (mirrors sample_ggm_prior). The AM within-K checks prior_only_ at the per-step MH ratio, but the NUTS gradient engine bakes n_ and suf_stat_ into the log-posterior with no such flag — so without this the NUTS path would silently include the data under prior_only.

The L-space SD primitive's kernel f(l_ji) = -A l_ji² + B l_ji + (α-1) log(s_jj + l_ji²) drops the (n/2 + δ) log|K(l_ji)| term from the data likelihood determinant and prior tilt. K_ij and K_jj both depend on l_ji along the Roverato slaving trajectory, so log|K| is NOT l_ji-invariant — the kernel as written targets π_FALSE = π_TRUE / |K|^{n/2+δ}, not the true joint posterior. Empirically (SBC at q=10, n=200, p_inc=0.3, α=1, δ=2, 200 reps): K_diag pooled KS-p: 0.029 → 0.181, mean PIT: 0.512 → 0.509. Γ and K_off|γ=1 stay calibrated. The bias was data-dependent (calibrated under prior_only; reappears under posterior) because the missing term is proportional to (n/2 + δ) and matters when the SD step toggles γ often. Hidden in v9/v10 at p_inc=0.5 by the α=1 + p_inc=0.5 + prior_only degenerate accept-everything regime and at high n by data-driven PIPs near {0, 1} (chain rarely toggles γ). Fix: after sampling l_ji_new and computing the proposed K_ij_new, K_jj_new, add (n/2 + δ) · log|K_after|/|K_before| to log_alpha. By standard MH theory this shifts the chain's effective target from π_FALSE to π_TRUE — proposal can stay Gaussian (α=1) or Laplace+GH (α>1); both already in place. Implementation note: inline the rank-2 matrix-determinant lemma's 2×2 form rather than calling log_det_ratio_edge — the sign of cc11·cc22 − cc12² (under the lemma's sign-flip convention here, negative = PD) detects PD violations without an arma::chol, keeping the per-attempt overhead negligible. Wall-clock cost in 200-rep SBC mclapply: +1.7%.

arma::chol returns false only for strict non-PD K, but the SD between-step sometimes accepts barely-PD K — one pivot ~ 1e-10 — that passes the chol canary. The next within-K NUTS step then evaluates K^{-1} entries on the order of 1/min_pivot, gradient overflows, theta becomes inf, K becomes garbage. We saw K_jj samples ~ 1e13 at extreme priors (σ=1.0, p_inc=0.5, low α, δ=0). Add a min(diag(U)) < 1e-6 check alongside the chol canary. The threshold corresponds to a minimum K eigenvalue ~1e-12, well above double-precision noise but loose enough not to perturb well-conditioned chains. Catches the SD-step contribution to runaway K_jj; the chain still has heavy-tailed exploration at extreme priors from within-K NUTS itself, which n_pd_reverts now flags more reliably as a warning bell.

Pure-R prototype of the (gamma, K_ij, omega) auxiliary-variable chain at fixed rest-conditional state. Three configurations (weak / borderline / strong S_ij in the alpha=1 closed-form regime), 5 seeds * 200k iterations. Findings: * The analytic marginal posterior P(gamma=1) is P(gamma=1) = pi_inc * BF_inc_marg / (pi_inc * BF_inc_marg + (1 - pi_inc)) where BF_inc_marg = E_omega[BF_inc(omega)] = E_omega[1 / BF_excl(omega)]. This is NOT 1 / E_omega[BF_excl(omega)] (Experiment 2 was computing the wrong quantity). The corrected analytic is now what the marginal Gibbs chain matches to MC error. * The marginal Gibbs reference chain (iid samples from the analytic marginal) matches the analytic to z < 1 across all configurations. * The auxiliary (gamma, K, omega) chain mixes essentially iid (ESS ~ T) but shows a reproducible positive bias in P(gamma=1) of 0.5%-1.5%. Bias is largest at the borderline regime (P(gamma=1) ~ 0.4-0.5) where gamma flips most often -- suggests a subtle issue at the MoMS mutually-singular boundary. Each conditional update has been re-derived and looks correct; the K-space transform mu_K = l_ii * (mu_l - m_ij), sd_K = l_ii / sqrt(tau_l) is verified. Cause not yet identified. Implication: the auxiliary scheme appears "essentially correct" but is not bit-exact to the analytic. Worth investigating further before committing to the C++ implementation. Likely candidates to probe next: - run with K removed from state (collapsed (gamma, omega) with non-conjugate omega update by rejection sampling), see if bias persists - compare against a 2D quadrature for the joint marginal P(gamma=1), independent of the chain mechanism - look at the alpha > 1 case (uses sinh primitive, not the closed form) to see if the bias is alpha-dependent

Three probes investigating the 0.5-1.5% chain bias from Experiment 3. Probe (a) -- cross-route verification of BF_inc_marg: omega-route (1D integral over omega using bgms's SD formula) gives 0.20, 0.77, 9.40 for weak / borderline / strong. K-route (1D integral over K with pure Cauchy slab) gives 0.12, 0.45, 5.44 -- a near-constant factor of ~0.58 different. 2D route over (K, omega) agrees with K-route exactly (sanity check). These are not the same Bayes factor: bgms's SD includes the diag prior contribution in BOTH numerator and denominator of the L-space ratio at m_ij. The diag prior cancels at fixed K_ij value but NOT under the scale-mixture integral over omega. The K-route corresponds to the natural "spike vs. pure Cauchy on K_ij" BF; the omega-route corresponds to a different BF involving the conditional-Roverato-prior structure. Probe (b) -- collapsed (gamma, omega) chain via slice sampling: Matches the omega-route analytic essentially exactly (z = 0.06, 1.02, -2.02). So the slice chain is faithful to whatever the omega-route is. The Experiment 3 aux chain matched the same omega-route to ~1% residual; that small residual is a real MoMS-boundary artifact in the (gamma, K, omega) update, but the BIG bias from Experiment 3's first reporting was that the analytic was computed wrong (omega-route is what the chain targets; K-route is the "natural" Cauchy BF). Probe (c) -- alpha > 1 via sinh primitive: Same chain at alpha = 3 matches omega-route well (z = -0.94, -3.24, +0.66). Borderline is slightly off in opposite direction from alpha=1 -- suggests the residual is a small MoMS artifact, not a method bug. Conclusion: this is a modelling question that has to be answered before any C++ change. Either (Omega) keep bgms's existing SD-formula interpretation -- aux scheme drops in trivially with the sigma^2 -> sigma^2 omega_ij substitution; or (K) rework log_bf_excl so the "prior" side is the pure slab -- gives the natural Cauchy-on-K_ij BF, but also changes the existing Normal-slab BFs by the same ~1.7x factor. Code: experiments/cauchy-slab/04-bias-diagnostic.R

…tion Experiment 5 -- toy spike-and-Cauchy-slab on a single Gaussian observation y ~ N(K, 1), with no L-space / Roverato / SD primitive machinery. Closed- form BF_inc(omega), analytic marginal verified to MC. The aux chain (gamma, K, omega) with conjugate omega | K ~ IG(1, 1/2 + K^2/(2sigma^2)) matches the analytic to z < 1 across weak / borderline / strong evidence. => The (gamma, K, omega) Gibbs-scan structure is not the source of bias. Experiment 6 -- beta sweep of the Experiment 3 GGM-toy. At beta = 0.5 (default) the aux chain shows the original 1% bias: z = +6.15, +8.41, +20.10 across weak / borderline / strong. At beta = 0, bias collapses to z = +1.48, -0.82, -0.39 -- pure MC noise. Mechanism. Under bgms's L-space framework with rest_L = (l_ii, c_3, ...), the effective prior on K_ij | gamma=1, omega includes a diag-prior contribution (the beta/(2 l_ii^2) term in A_prior). That introduces an omega-dependent normalisation Z(omega) into the conditional p(K | gamma=1, omega, rest_L). My pure-slab conjugate update for omega silently drops Z(omega). At beta = 0 the term is absent (pure slab) and conjugacy is exact -- which is why the toy works and beta=0 in Experiment 6 works. Fix for the production C++. Replace pure-slab conjugate omega | K with slice sampling on the full L-space-consistent omega conditional: log p(omega | K, gamma=1, rest_L) prop -K^2 / (2 sigma^2 omega) - log Z(omega) + log p_IG(omega) log Z(omega) = 0.5 log(pi / A(omega)) + B^2 / (4 A(omega)) A(omega) = 1/(2 sigma^2 omega) + beta / (2 l_ii^2) B = -beta * m_ij / l_ii ~15 lines of C++ for a log-space slice sampler (stepping-out + shrinkage). Per-step cost: a handful of arithmetic ops per evaluation, ~5 evaluations per slice step. Verified bias-free by Experiment 5 (which uses slice on omega and matches analytic to MC noise). Cauchy implementation is now unblocked.

…rmals Adds Cauchy(0, sigma) slab support to the hierarchical-spec SD path. The Cauchy is represented as K_ij | gamma=1, omega ~ N(0, sigma^2 omega) with omega_ij ~ InvGamma(1/2, 1/2) auxiliary mixing weights, one per edge. The existing SD primitives (cubic_mode, sinh_midpoint, gauss_hermite, laplace) are unchanged -- the Cauchy lifting is just sigma^2 -> sigma^2 * omega_(i,j) at the call site, plus a per-edge omega update after the (gamma, K_ij) MH step. C++ changes - GGMModel gains a p x p omega_ matrix (upper-triangular storage; lower triangle and diagonal unused), initialised to 1.0 (= Normal slab pointwise) and copied on chain clone. - GGMModel gains a slab_is_cauchy_ flag, set by ensure_prior_params_extracted_() via dynamic_cast on the configured interaction_prior_. CauchyPrior gains a scale() accessor. - update_edge_indicator_parameter_pair_sd_lspace uses sigma^2 * omega_(i,j) when Cauchy; the Normal path stays bit-identical. - New slice_sample_omega_ij_() draws from the L-space-consistent full conditional p(omega | K_ij, gamma=1, rest_L). The conjugate InvGamma(1, 1/2 + K^2/(2 sigma^2)) update is wrong here (Experiments 3/5/6 in experiments/cauchy-slab/) because the diag-prior contribution to the L-space "prior" introduces an omega-dependent Z(omega) factor. Slice sampling on u = log(omega) with stepping-out + shrinkage handles it. At gamma=0 the slab is irrelevant; omega draws from its IG(1/2, 1/2) prior. Called from update_edge_indicators after each SD step. - rng_utils.h gains rgamma(shape, rate) for the IG-from-prior sample. R changes - bgm_spec.R relaxes the hierarchical-spec slab restriction from Normal-only to {Normal, Cauchy}. Choosing cauchy_prior + Gamma diag now defaults to prior_factorization = "hierarchical" instead of "joint". Existing cauchy_prior + (no factorization given) users on bgmCompare or omrf are unaffected (those paths still require joint). Verification - sd-baseline-snapshot: 42/42 pass (Normal slab bit-identical, both alpha = 1 and alpha = 3). - Smoke test: bgm() with cauchy_prior(scale = 0.5) on a q=4, n=80, one-true-edge problem detects the edge (PIP = 1.00) and null edges cleanly (PIPs 0.14 - 0.29), posterior mean for the true edge ~ 0.23. This is MVP. Not yet covered: per-iteration omega snapshots in the output, SBC for the Cauchy + hierarchical path, alpha > 1 with sinh primitive (works in the code but unexercised in tests).

…ickey/ The slice-sampling logic + the Cauchy-via-IG(1/2, 1/2) log-conditional are pure scalar math taking (K, sigma^2, A_diag_K, B_K, omega_curr). The only model-specific piece is computing A_diag_K and B_K from the Cholesky/Roverato state. Mixed MRF and other Cholesky-parameterised models will derive analogous expressions and call the same primitive. Refactor New src/math/savage_dickey/cauchy_omega.{h,cpp}: * slice_sample_cauchy_omega_active(K, sigma2, A_diag_K, B_K, omega_curr, rng) * sample_cauchy_omega_prior(rng) Naming + Doxygen header match the existing primitives (cubic_mode, sinh_midpoint, laplace, gauss_hermite). Math derivation in the file header with the manuscript-pointer convention; the implementation cpp stays terse. GGMModel::slice_sample_omega_ij_ becomes a thin wrapper: extracts l_ii, m_ij from the maintained covariance/precision, computes the GGM-specific A_diag_K = beta/(2 l_ii^2) and B_K = -beta m_ij / l_ii, and forwards. Drops ~80 lines of inline math/algorithm code from ggm_model.cpp. Tests New tests/testthat/test-sd-cauchy-omega.R (6 passes): 1. sample_cauchy_omega_prior matches IG(1/2, 1/2) quantiles + tail. 2. slice sampler at A_diag_K = B_K = 0 matches the conjugate IG(1, 1/2 + K^2/(2 sigma^2)) reference. 3. slice sampler matches an importance-sampling reference with nonzero A_diag_K and B_K. Tests run a proper chain (iterative slice steps) rather than one-step transitions from a fixed omega, so the chain quantiles reflect the actual stationary distribution. tests/testthat/test-ggm-hierarchical.R: * Replaced the stale "errors helpfully for Cauchy slab" test (which asserted Cauchy + hierarchical should fail) with a success path test. * Added a one-true-edge-recovery test (q=4, n=120, K_12 = -0.5): true edge PIP > 0.95, all five null edges PIP < 0.4. Test interface in src/savage_dickey_test_interface.cpp gains sd_slice_sample_cauchy_omega_active_cpp + sd_sample_cauchy_omega_prior_cpp. Verification sd-baseline-snapshot 42/42 (Normal-slab path bit-identical). sd-cauchy-omega 6/6. ggm-hierarchical 17/17. sd-cubic + sd-sinh suites unchanged.

test-ggm-hierarchical.R gains: - Cauchy + hierarchical + Gamma(shape=3) test exercising the sinh primitive (alpha > 1 dispatch is otherwise unverified in the Cauchy-slab path). test-sbc-ggm-hierarchical-cauchy.R (new, gated by BGMS_RUN_SLOW_TESTS): - Companion to test-sbc-ggm-hierarchical.R, swapping the slab to cauchy_prior. Truth via sample_ggm_prior() with the Cauchy slab at fixed Gamma (NUTS at fixed graph; independent of the SD chain). Inference via bgm() with cauchy_prior + hierarchical. Tests rank uniformity of K_ii per delta in {0, 0.5, 1.0} via KS at alpha = 0.01, plus gamma marginal calibration. R = 200 reps (smaller than the Normal-slab SBC's R = 300 to keep total runtime under an hour). - Smoke-verified end-to-end on one rep (sample_ggm_prior with cauchy_prior produces a valid truth; bgm() with cauchy_prior + hierarchical produces a non-degenerate fit). Deferred to a follow-up commit - Per-iteration omega snapshots in fit output (touches ChainResult, chain_runner, build_output, and the S7 fit class — substantial enough to scope separately).

The 6 R scripts under experiments/cauchy-slab/ were exploratory design work; the corresponding findings are already captured in commit messages (notably 101d99c, which spells out the diag-prior / omega conjugacy bias and the fix) and in src/math/savage_dickey/cauchy_omega.h's file header. Keeping a tracked top-level experiments/ directory diverges from the existing convention -- exploratory scripts live in dev/ which is gitignored per .gitignore:18 (cf. dev/quadrature/, dev/numerical_analyses/, dev/sbc/). Files copied to dev/cauchy-slab/ locally before removal; the gitignored copy remains available on this machine. Reverts the .Rbuildignore ^experiments$ entry.

The hierarchical-spec SD path now supports the Cauchy slab via the scale-mixture-of-normals representation (added in 93b4b14), but several docstrings still said "Normal slab only": R/bgm.R - @param interaction_prior: drop the "Normal slab is the encompassing prior" claim; describe both Normal-direct and Cauchy-via-scale-mixture. - @param prior_factorization: drop "Only available when the interaction prior is normal_prior(...)"; replace with "Normal or Cauchy slab, Gamma diagonal". - @param prior_factorization auto-resolve clause: include Cauchy in the set that lands on hierarchical, and update the "joint otherwise" example to beta-prime (which actually doesn't support hierarchical). - Updated quadrature name: alpha > 1 path is sinh-substitution, not Gauss-Hermite (sinh-midpoint replaced GH in 77437d8). src/models/ggm/ggm_model.h - GraphPriorSpec enum header: hierarchical now requires Normal *or* Cauchy slab + Gamma diagonal; points at math/savage_dickey/cauchy_omega.h for the slice-sampled per-edge omega update under Cauchy. - set_graph_prior_spec docstring: same. - graph_prior_spec_ member comment: "Normal or Cauchy slab". - Quadrature name fix here too. Regenerated man/bgm.Rd via roxygen2. Build verified.

Both primitives were superseded by the sinh-substitution + midpoint-rule quadrature (sinh_midpoint) and the cubic critical-point solver (cubic_mode) when the alpha > 1 path was rewritten in 77437d8. Removed src/math/savage_dickey/laplace.{h,cpp} -- no callers anywhere. src/math/savage_dickey/gauss_hermite.{h,cpp} -- test-only reference; the actual ground truth in test-sd-sinh.R is grid_logZ (R-side Gauss-Legendre with N=2e6 nodes), so the GH comparison was a redundant second opinion. Updated src/models/ggm/ggm_model.cpp: drop the unused includes. src/savage_dickey_test_interface.cpp: drop sd_log_density_at_l_ji_cpp + sd_log_density_at_l_ji_gh_cpp test wrappers. tests/testthat/test-sd-sinh.R: drop the GH comparison block; the test that previously asserted "sinh-32 error <= 10 * GH-32 error" becomes "sinh-32 worst-case error < 1e-3" against the existing grid_logZ reference. Same six configurations. R/RcppExports.{R,cpp} regenerated via roxygen. Verification sd-baseline-snapshot 42/42 (Normal-slab bit-identical). sd-cauchy-omega 6/6. sd-cubic (all pass). sd-sinh (all pass; fuzz N=128 worst error 1.09e-10).

Audit caught three inline comment references in update_edge_indicator_parameter_pair_sd_lspace pointing at code that was removed in the previous commit (623cfda): Line 659: "(math/savage_dickey/laplace.h) documents slab as ..." -- file removed; the surviving primitives in math/savage_dickey/ use the same slab convention. Lines 741-742: "...~1e-1 for the legacy non-adaptive GH-32..." -- the GH primitive is gone; the comparison no longer needs to be made. Line 770: "Replaces the previous Newton iteration (density_at_l_ji_one)..." -- density_at_l_ji_one lived in laplace.h and is removed; the cubic primitive stands on its own merits. Comments rewritten to describe the surviving primitives without referring to deleted code. No behaviour change. All sd-* tests pass.

Documented pre-commit workflow: source("inst/styler/bgms_style.R") styler::style_pkg(style = bgms_style) lintr::lint_package() The styler converted <- to = across the package (the bgms style forces EQ_ASSIGN over LEFT_ASSIGN except <<- and call-argument <-) and normalised closing-paren placement in a handful of older tests with stylistic drift. Files touched R/bgm.R tests/testthat/test-{chol-perm-to-trailing,ggm-hierarchical,ggm-nuts, mixed-gradient-pfaffian,mixed-nuts,sample-ggm-prior, sbc-ggm-hierarchical{,-cauchy,-q10},sbc-ggm,sd-baseline-snapshot, sd-cauchy-omega,sd-sinh}.R The bulk of the change is the <- -> = conversion in the four test files written this session (test-ggm-hierarchical.R additions, test-sd-cauchy-omega.R, test-sbc-ggm-hierarchical-cauchy.R) plus older tests with mixed-style drift. lintr reports zero issues.

R/bgm.R:364 had `fit <- bgm(...)` inside an @examples block since d5b3a2d (2026-05-24). The styler / pre-commit pass doesn't reach inside roxygen `#'` comments, so this survived the bulk = / <- conversion in the previous commit (3757ec7). Fixed by hand; man/bgm.Rd regenerated via roxygen2. grep -rn "^#' .*<- " R/ -> zero hits.

R/bgms-methods.R defined warning_once + a .warning_state env to issue a once-per-session warning, but nothing calls it anywhere in R/ or tests/. Likely an orphan from when a now-removed code path used to emit the warning. Confirmed via the R-side orphan audit (corrected script that excludes only definition lines and S3 methods). Package still loads cleanly with the helper gone.

This reverts commit 80bb195.

…er loops The forward Givens row-rotation and the matching reverse-Givens loops on W_work / W_bar all touched conceptual R via R(r1, k) — a stride-q access in column-major storage. Switching the working R to its transpose turns each row-rotation on conceptual R into a column-rotation on storage, which is contiguous and amenable to SIMD. givens_qr now consumes A_q directly (no .t() at the callsite) and writes R in (m_q × q) layout; R(j, j) still gives the diagonal, R_diag and rotation storage are unchanged. Backward-pass W_work / W_bar mirror the layout and swap (r1, cj) → (cj, r1) at every access; the Q matrix and its rotations are unaffected (Q column-rotations were already contiguous). Bit-identical to the prior implementation across p ∈ {5, 10, 20, 50} × density ∈ {0.3, 0.6, 1.0} × seed ∈ {1, 2, 3} (logp and gradient both match exactly). Per-call median speedup is small (≈1.01–1.04× at p ≥ 10), since L1d already covers the strided rows at these p; the structural win is the removal of the Aq_buf.t() materialisation and clearer storage semantics that any future Householder-QR replacement can build on.

sample_ggm_prior(spec = "joint") routes through sample_ggm() with n = 0 and S = 0 to draw from the joint prior π(K, Γ). The AM chain's per-attempt PD canary (arma::chol of precision_proposal_ inside update_edge_parameter and the diagonal / between-step counterparts) is gated on prior_only_, which the joint-spec entry never set. Without it the chain accepted moves that took K out of the PD cone (no likelihood anchor), drift in Σ compounded across the within-step sweep, and the next refresh_cholesky threw on a now-non-PD K. Now: both GGMModel constructors set prior_only_ = true after initialize_precision_from_mle() when n_ == 0. Chains with data are unaffected — the PD canary stays inactive at n_ > 0. Z reproducer at p = 100, delta = 0.5 * log(100), normal slab scale 0.5, edge_inclusion_prob 0.2 now runs to completion instead of crashing during warmup.

The L-space Savage-Dickey between-step accept path previously did chol + triangular solve + DSYRK matmul per accept to refresh covariance_matrix_ = inv(L) inv(L)^T. The DSYRK is ~54 μs of the 175 μs per-accept refresh at p = 100 under libRblas. Σ_corner is now read on demand inside the sweep as Σ_kl = inv(L)[k,:] · inv(L)[l,:]^T via two arma::dot row inner products per attempt (~0.3 μs each at p = 100), and covariance_matrix_ is refreshed once at the end of update_edge_indicators() for the within-step that follows. inv(L) is refreshed once at the start of the sweep since the within-step's SMW updates leave it stale wrt L. Measured 7–11 % wall-time reduction on bgm(prior_factorization = "hierarchical", update_method = "adaptive-metropolis") at p = 100, larger at smaller n where between-step accepts are more frequent (more refreshes saved per iter). Validated against the existing test-sbc-ggm-hierarchical{,-cauchy}.R harnesses at p_inc = 0.5 (no regression vs the refresh-per-accept path) and a posterior-mean agreement check at p = 100, n = 40: same-seed OFF-vs-ON differences on K_ii and PIP are well inside the cross-seed Monte Carlo noise floor.

cholesky_update_after_edge previously materialised four p×p arma temporaries on each accept (one per (a1 a1ᵀ), (a2 a2ᵀ), (a1 a2ᵀ), (a2 a1ᵀ)). Refactor to two rank-1 outer products with bundled scalar weights: ΔΣ = -(a1 b1ᵀ + a2 b2ᵀ) b1 = inv_c00 · a1 + inv_c01 · a2 b2 = inv_c01 · a1 + inv_c11 · a2 Cuts ~3 of 5 p×p allocations per accept. Wall-time on M5 Pro, p=100 iter=500 warmup=100: hier n=40 : 159.1s -> 71.4s (-55%) hier n=700 : 80.0s -> 35.0s (-56%) joint n=40 : -> 39.3s joint n=700: -> 23.0s A/B against parent commit on the same hardware confirms the win is the SMW refactor (baseline measured by stashing this change and re-running). All ggm tests pass (gradient 14/14, hierarchical 20/20, nuts gated).

cholesky_update_after_edge holds the L update + Σ rank-2 SMW for the symmetric change ΔK = vf1 vf2ᵀ + vf2 vf1ᵀ. Today both inputs are sparse 2-entry vectors (the edge case) but the kernel is generic in vf1, vf2. Lift the kernel into apply_rank2_chol_smw_update_(); cholesky_update_after_edge becomes a thin wrapper that scatters the per-edge deltas into vf1_/vf2_, calls the helper, and zeros them back out. Zero behaviour change: bitwise-identical sample matrices vs HEAD across joint and hierarchical modes at p ∈ {20, 50}, seed=11..12. Existing GGM tests pass (gradient 14/14, hierarchical 20/20). Prepares the row-block Gibbs within-step (next PR) to reuse the same kernel with full-vector vf1 = e_i, vf2 = u_i.

Adds GGMModel::update_row_block_gibbs(i) — a closed-form Gaussian–Gamma draw of row i of K given the rest of K and the graph, under the conjugate prior scope (Normal slab on K_yy off-diagonals × Gamma(1, beta0) on K_ii/2, no determinant tilt). Derivation (in bgms internal scale K_yy = -K/2): - Partition K = ((A, beta), (beta^T, kii)) at row i, with A = K_{-i,-i}, beta = K_{N_i, i}, kii = K_{i,i}. - Reparam xi = kii - beta^T C beta, C = (A^{-1})_{N_i, N_i} extracted in O(|N_i|^2) from cached Sigma via Schur. - xi | rest ~ Gamma(n/2 + 1, (beta0 + S_ii)/2) - beta | rest ~ N(-M^{-1} S_{N_i, i}, M^{-1}), M = (beta0 + S_ii) C + (1 / (4 sigma^2)) I. - kii = xi + beta^T C beta; PD by construction. New column written via the shared apply_rank2_chol_smw_update_ helper (from PR-1) with vf1 = e_i and vf2 carrying the column-i deltas. row_block_gibbs_eligible() gates the conjugate case; non-conjugate combinations return false (PR-4..PR-6 will extend coverage). Tested via a new ggm_test_row_block_gibbs Rcpp entry that calls the kernel in isolation (no AM dispatch yet — that lands in PR-3): 1. q = 0 marginal: K_ii ~ Gamma(1, beta0/2) at n=0, KS Bonferroni pass 2. Agreement vs AM (n=200, p=6, complete graph, 3500 post-burn draws): max|K_gibbs - K_am| = 0.012 mean rel diff = 1.7 % 3. Predicate rejects gamma_shape != 1. Existing GGM tests pass (gradient 14/14, hier 20/20); bitwise check on the AM path (4 cells) still identical — PR-2 adds new methods without touching the existing dispatch.

bgm() / sample_ggm() now accept within_step_kind in {"adaptive_metropolis", "row_block_gibbs"} (default keeps the AM path). The flag plumbs through validate_sampler -> run_sampler_ggm -> sample_ggm.cpp -> GGMModel::set_within_step_kind, and GGMModel::do_one_metropolis_step branches on it. Eligibility: set_within_step_kind warns and downgrades to AdaptiveMetropolis if row_block_gibbs is requested but the prior is not yet supported (need Normal slab + Gamma alpha=1 + delta=0). PR-4..PR-6 will extend coverage; opt-in scripts under unusual prior combinations keep running on the AM path until then. Bug fix while wiring: the copy constructor was missing within_step_kind_ (also missing for new state added by this PR until now), so per-chain clones silently reverted to AM. Added. Validation: 1. Default AM path bitwise-identical to PR-2 baseline (4 cells: joint+hier x p={20,50}). 2. AM vs row_block_gibbs posterior agreement via sample_ggm at p=6, n=200, 2500 post-burn draws: max|diff| = 0.012, mean rel diff = 1.8% (matches the PR-2 standalone test). 3. row_block_gibbs + Cauchy slab raises the expected warning and falls back to AM. 4. Existing GGM tests pass (gradient 14/14, hierarchical 20/20, nuts gated).

Under the determinant-tilt prior |K|^delta, the K_{i,i} log-det contribution becomes delta * (log|A| + log xi), so the conjugate factorization keeps the same shape: xi stays Gamma-conjugate with rate unchanged and shape shifted from n/2 + 1 to n/2 + delta + 1. beta is unaffected (the tilt does not depend on beta given A). Implementation: xi_shape = n/2 + determinant_tilt_ + 1 in update_row_block_gibbs. row_block_gibbs_eligible() drops the delta == 0 check. Validation: AM vs row_block_gibbs posterior means at delta in {0.5, 2} on p=6, n=200, complete graph agree within MCMC error: delta = 0.5: max|d| = 0.013, mean rel = 2.0% delta = 2.0: max|d| = 0.011, mean rel = 1.3% AM-default bitwise unchanged (4-cell harness); ggm tests pass (gradient 14/14, hierarchical 20/20).

The Gamma(alpha, beta_0) prior on K_{i,i}/2 introduces a kii^(alpha-1) factor that does not factorize between xi and beta (kii = xi + beta^T C beta). Plain conjugate Gibbs no longer applies. Instead, treat the conjugate (alpha = 1) joint draw as an independent-MH proposal and accept with probability alpha_MH = (kii_new / kii_old)^(alpha - 1) The beta and S_ii factors cancel since both target and proposal share the same beta prior and likelihood; only the diagonal correction remains. Guarded by std::abs(prior_alpha_ - 1) > 1e-12 so the alpha = 1 path still skips the runif draw and remains bit-identical to PR-2..PR-4. row_block_gibbs_eligible() drops the alpha = 1 requirement; the warn-and-downgrade message updates to cover only the remaining gap (Cauchy slab, lands in PR-6). Validation: AM vs row_block_gibbs posterior means at p=6, n=200, complete graph, Normal slab, delta=0: alpha = 0.5: max|d| = 0.013, mean rel = 1.9% alpha = 2.0: max|d| = 0.013, mean rel = 1.8% AM-default bitwise identical (4-cell harness); ggm tests pass.

The Cauchy slab on K_yy_{ij} = -K_{ij}/2 expands as K_yy_{ij} | omega_{ij} ~ N(0, sigma^2 * omega_{ij}), omega_{ij} ~ IG(1/2, 1/2). Conditional on omega, the joint draw of (beta, kii) becomes Gaussian- Gamma conjugate just like the Normal-slab case, with a per-element prior precision diag(1 / (4 sigma^2 omega_k)) on the beta block. The existing slice_sample_omega_ij_ kernel refreshes omega per edge in a single per-sweep slice pass. Implementation: - update_row_block_gibbs: M now uses M(k, k) += inv_4sig2 / omega_{i, N_i[k]} Under Normal slab omega is fixed at 1, so the formula reduces to the existing constant precision and PR-2..PR-5 stay bit-identical. - do_one_metropolis_step (RowBlockGibbs branch): after the row sweep, when slab_is_cauchy_, refresh inv_L from the SMW-maintained L (O(p^3)), then sweep slice_sample_omega_ij_(i, j) over all edge pairs. The refresh is needed because apply_rank2_chol_smw_update_ maintains L and Sigma but not inv_L. - row_block_gibbs_eligible(): accepts NormalPrior OR CauchyPrior. Validation: AM (Cauchy direct) vs row_block_gibbs (Cauchy via mixture) at p=6, n=200, complete graph, Gamma(1, 1), delta=0: max|d| = 0.009 mean rel = 1.3% Both target the same Cauchy marginal; the row-Gibbs path adds omega to the state vector but integrates it out at stationarity. AM-default bitwise unchanged (4-cell harness); ggm tests pass.

User-facing bgm() takes a single sampler knob: update_method in {"nuts", "adaptive-metropolis", "Gibbs"} instead of the awkward two-parameter combination (update_method = "adaptive-metropolis", within_step_kind = "row_block_gibbs") introduced for row-block Gibbs in PR-3. Internally the C++ entry sample_ggm() still takes the two-knob (sampler_type, within_step_kind) tuple — it sits at the R/Rcpp boundary and "Gibbs" rides on the AM chain runner with the row_block_gibbs within-step. run_sampler_ggm() handles the mapping. target_accept = 0.44 is now the default for both AM and Gibbs (the between-step adapter still consumes it; Gibbs's row sweep has no per-row acceptance to tune). Existing bitwise check unchanged (AM-default path); ggm tests pass.

One block per feature increment so a regression isolates to its commit: exact conjugate draw anchor (prior-only K_ii vs closed-form Gamma) plus Gibbs-vs-AM posterior-mean agreement on a fixed graph at Normal alpha=1 delta=0, delta=0.5, alpha=2, and the Cauchy slab. The agreement blocks are skip_on_cran() since they run real chains.

The row-block Gibbs row draw reads only Sigma, never chol(K): delta enters analytically as a Gamma shape shift, and the Schur extraction uses the covariance. So the two per-row Cholesky Givens passes in apply_rank2_chol_smw_update_ produced a factor nothing read until the between-step. Gate them behind a new update_L flag (true for the edge accept, false for the row sweep) and rebuild chol(K) once via refresh_cholesky() at sweep end, which also makes Sigma exact and subsumes the old conditional drift check. Single-threaded, behaviour-preserving (chol(K) is a deterministic function of K). Sigma is still SMW-maintained per row so each row's Schur extraction is exact. Verified L^T L = K and Sigma K = I to ~1e-16 every sweep under edge selection, and the Gibbs-vs-AM agreement tests still pass. Within-step throughput (fixed graph, single thread): p=100 d=0.1 196 -> 411 sweeps/s (2.1x) p=200 d=0.1 27 -> 67 sweeps/s (2.5x) dense graphs ~1.08x (chol(M)-bound, as expected).

The rank-2 vectors u1/u2 are nonzero only on the update support ({i,j} for an edge accept, {i} u N_i for a row-Gibbs row), so a1 = Sigma*u1 equals Sigma.cols(support)*u1(support). Pass the support to apply_rank2_chol_smw_update_ and gather just those columns: O(p*|support|) instead of the dense O(p^2) gemv. A guard (3*|support| < 2p) falls back to the dense gemv when the support is near-full, so dense graphs never pay the gather copy. The outer-product Sigma updates stay dense O(p^2) (a1, b1 are length-p); this only sparsifies the matvec and the capacitance dots. Bit-identical to the dense path (off-support u entries are exactly zero), so the per-sweep L^TL=K / Sigma K=I invariant is unchanged. Edge accept (support size 2) always takes the gather path. Within-step throughput on top of the deferred-L change (sparse, single thread): p=100 d=0.1 411 -> 491 sweeps/s (1.20x) p=200 d=0.1 67 -> 86 sweeps/s (1.29x) dense graphs unchanged (guard -> dense gemv).

Hoist the element-filled per-row buffers (N_i, beta_old, sigma_iNi, s_Ni_i, C, support) to reused members. clear() on the std::vector keeps capacity, and set_size on the arma buffers reuses the existing allocation when q fits, so a sweep no longer reallocates these each row -- notably dropping the per-row reserve(p-1) on the neighbour list. The RNG draw (arma_rnorm_vec) and the solve/chol/matmul results stay local: changing the rnorm path would shift the draw order, and arma move-allocates expression results on assignment so a member would not be reused anyway. Bit-identical (only buffer location changes); tests pass. The measured gain is small (~1-2%, largest at small p where the alloc/flop ratio is highest) since the row cost is dominated by chol(M) and the dense outer-product Sigma updates, not allocation.

…s -> gibbs d3ca987 added "Gibbs" to validate_sampler's match.arg choices (now 3) but left bgmCompare's default update_method = c("nuts","adaptive-metropolis") (2). match.arg only falls through to the first element when the passed vector is identical to choices, so bgmCompare's default errored — failing the comparison.Rmd vignette and the R-CMD-check / test-coverage / pkgdown CI jobs. Fixes: - Gate the gibbs kernel on is_continuous in validate_sampler (gibbs is the GGM row-block kernel only; there is no continuous bgmCompare, so this excludes it), and take the first element of a passed multi-value default explicitly instead of relying on match.arg's fragile default-equals-choices fall-through. An explicit update_method = "gibbs" on a non-continuous model is now correctly rejected. - Rename the user-facing value "Gibbs" -> "gibbs" for consistency with "nuts" and "adaptive-metropolis" (the within_step_kind = "row_block_gibbs" internal name is unchanged). Done before any release, so no deprecation needed. - Document the gibbs option in bgm()'s update_method (it was never listed); regenerate man/bgm.Rd.

MaartenMarsman added 14 commits May 19, 2026 13:54

MaartenMarsman changed the title ~~feat(ggm): log Z(G) closed-form primitives for hierarchical-spec MH~~ Draft - DEGORD - RR May 21, 2026

MaartenMarsman added 13 commits May 22, 2026 07:35

Merge remote-tracking branch 'origin/main' into feat/log-z-nlo-gamma

679a79f

# Conflicts: # R/RcppExports.R # src/RcppExports.cpp

style: bgm.R styler pass

d5b3a2d

test(bgm-delta): suppress progress bars + styler pass

26311a8

MaartenMarsman added 30 commits May 28, 2026 13:38

Revert "chore: remove dead warning_once helper"

12a9191

This reverts commit 80bb195.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Draft - DEGORD - RR#109

Draft - DEGORD - RR#109
MaartenMarsman wants to merge 77 commits into
mainfrom
feat/log-z-nlo-gamma

MaartenMarsman commented May 19, 2026 •

edited

Loading

Uh oh!

codecov Bot commented May 19, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

MaartenMarsman commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

MaartenMarsman commented May 19, 2026 •

edited

Loading

codecov Bot commented May 19, 2026 •

edited

Loading