Skip to content

Resolve RBAC deadlocks via sequential locking#1685

Open
ygndotgg wants to merge 2 commits into
parseablehq:mainfrom
ygndotgg:rbac_deadlock
Open

Resolve RBAC deadlocks via sequential locking#1685
ygndotgg wants to merge 2 commits into
parseablehq:mainfrom
ygndotgg:rbac_deadlock

Conversation

@ygndotgg

@ygndotgg ygndotgg commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Description

Resolves critical RwLock deadlocks in the in-memory RBAC maps (USERS, SESSIONS, ROLES, USER_GROUPS) that caused the server to freeze under concurrent user management and authorization requests.

Possible solutions and chosen one

A global lock-ordering hierarchy could have been implemented, but the chosen solution enforces a strict Core Invariant: a map lock may only read or mutate its own map. Cross-map logic is achieved by cloning lightweight data (e.g., the AuthSnapShot struct), dropping the current lock, and acquiring the next map sequentially. This minimizes lock hold times and completely eliminates circular lock dependencies.

Key changes made in the patch

  • Read Path: Replaced the nested-lock Sessions::check_auth with an AuthSnapShot pattern to drop the SESSIONS read lock before evaluating cross-map permissions.
  • Mutation Paths: Applied a "3-Step Drop" to add_roles, remove_roles, change_password_hash, put_user, and delete_user to ensure USERS is fully unlocked before touching SESSIONS or ROLES.
  • Aggregation Path: Refactored aggregate_group_permissions and roles_to_permission to explicitly drop USER_GROUPS and ROLES read locks between loop iterations.

This PR has:

  • been tested to ensure log ingestion and log query works.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added documentation for new or modified features or behaviors.

Summary by CodeRabbit

  • Refactor
    • Refactored authentication and authorization logic within the session management system to improve overall structure and efficiency.

ygndotgg added 2 commits June 17, 2026 13:41
Eliminated nested `RwLock` acquisitions by introducing `AuthSnapShot`
for read paths and sequential lock-dropping for user mutations.
Server remains deadlock-free under `quest-parallel` concurrent loads.
@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Walkthrough

Refactors RBAC authorization across src/rbac/map.rs and src/rbac/mod.rs to use a pre-cloned AuthSnapShot struct for lock-free permission checks. Removes Sessions::check_auth, adds Sessions::auth_snapshot and check_auth_snapshot, updates user mutation methods to conditionally invalidate sessions, and revises the full authorize flow to use the new snapshot approach.

Changes

RBAC Snapshot Authorization Refactor

Layer / File(s) Summary
AuthSnapShot contract and check_auth_snapshot in map.rs
src/rbac/map.rs
Adds AuthSnapShot struct, check_auth_snapshot free function, and Sessions::auth_snapshot helper. Removes Sessions::check_auth. Refactors aggregate_group_permissions to use HashSet-based intermediates for user groups and group roles before expanding via RoleBuilder.
Conditional session invalidation in user mutation methods
src/rbac/mod.rs
put_user clones ids before insertion and removes stale sessions; delete_user scopes remove_user; change_password_hash, add_roles, and remove_roles now only invalidate or refresh session permissions when the underlying user data was actually mutated.
authorize and new_session updated to use snapshots
src/rbac/mod.rs
get_permissions derives group role names from read_user_groups and fetches privileges from the tenant role map; new_session precomputes permissions before track_new; authorize fast-path replaced with auth_snapshot + check_auth_snapshot; BasicAuth path tracks session, re-fetches snapshot, and validates via check_auth_snapshot; roles_to_permission inlines the role map lookup.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant authorize
  participant Sessions
  participant check_auth_snapshot
  participant aggregate_group_permissions

  Client->>authorize: request + SessionKey
  authorize->>Sessions: auth_snapshot(key)
  alt Session exists
    Sessions-->>authorize: AuthSnapShot { username, tenant_id, permissions }
    authorize->>check_auth_snapshot: snapshot, required_action, resource, user
    check_auth_snapshot->>aggregate_group_permissions: username, tenant_id
    aggregate_group_permissions-->>check_auth_snapshot: group_perms
    check_auth_snapshot-->>authorize: Response::Authorized / UnAuthorized
  else BasicAuth
    authorize->>authorize: verify password hash
    authorize->>Sessions: track_new(key, perms)
    authorize->>Sessions: auth_snapshot(key)
    Sessions-->>authorize: AuthSnapShot
    authorize->>check_auth_snapshot: snapshot, required_action, resource, user
    check_auth_snapshot-->>authorize: Response
  end
  authorize-->>Client: Response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 Hop hop, no lock to hold!
A snapshot taken, permissions polled,
The sessions freed from heavy chains,
check_auth gone, new logic reigns.
With sets and clones the roles align —
Lock-free auth, by grand design! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title directly and accurately summarizes the main change: resolving RBAC deadlocks through sequential locking, which aligns with the core objective of the PR.
Description check ✅ Passed The description provides clear objectives, explains the chosen solution with rationale, and details key changes. While testing and documentation checkboxes remain unchecked, the core required content is substantially complete.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/rbac/mod.rs (1)

336-344: 💤 Low value

Narrow race window between session insertion and snapshot retrieval.

After track_new completes and the write lock is dropped (line 335), there's a brief window before the read lock is acquired (line 337) where another thread could theoretically remove the session. The .expect() on line 344 would panic in this case.

While this window is extremely narrow and would require a concurrent remove_user call for the same user being authorized, consider using .unwrap_or(Response::ReloadRequired) instead of .expect() for defensive resilience.

🛡️ Defensive alternative
-                return snapshot
-                    .map(|snap| {
-                        map::check_auth_snapshot(snap, action, context_stream, context_user)
-                    })
-                    .expect("entry for this key just added");
+                return snapshot
+                    .map(|snap| {
+                        map::check_auth_snapshot(snap, action, context_stream, context_user)
+                    })
+                    .unwrap_or(Response::ReloadRequired);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rbac/mod.rs` around lines 336 - 344, Replace the `.expect("entry for this
key just added")` call on the snapshot map chain with
`.unwrap_or(Response::ReloadRequired)` to defensively handle the rare race
condition where another thread might remove the session entry between the write
lock being dropped after track_new and the read lock being acquired in
sessions(). This prevents a panic if the session snapshot is no longer available
when retrieved.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/rbac/mod.rs`:
- Around line 336-344: Replace the `.expect("entry for this key just added")`
call on the snapshot map chain with `.unwrap_or(Response::ReloadRequired)` to
defensively handle the rare race condition where another thread might remove the
session entry between the write lock being dropped after track_new and the read
lock being acquired in sessions(). This prevents a panic if the session snapshot
is no longer available when retrieved.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c2e5ee68-0278-47f6-b02f-f9658734d92c

📥 Commits

Reviewing files that changed from the base of the PR and between 7d1d9a9 and caae924.

📒 Files selected for processing (2)
  • src/rbac/map.rs
  • src/rbac/mod.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant