From a850cc5938d71843064d9e717d572bfc527424cb Mon Sep 17 00:00:00 2001
From: Chen Linxuan
Date: Mon, 29 Jun 2026 22:36:21 +0800
Subject: [PATCH] feat(tags): resolve project root from a workspace marker file

Project identity is normally derived from the enclosing git repository
(git-common-dir, then remote origin URL, then the directory path), which
splits memory into one shard per physical git repo. That is wrong for
multi-repo workspaces managed by orchestrators such as Google `repo`
that lay out many git repositories under a single workspace root: each
sub-repository gets its own isolated memory store instead of sharing one
project-wide store.

The obvious fix is a manual override, but carrying that override in an
environment variable or a global config value is unsound here.
opencode-mem runs across multiple opencode processes that share a single
web server and a single storage database, and only some of those
processes carry a given env var -- the one launched from a direnv shell
does; the long-lived daemon launched by systemd, and sessions started
elsewhere, do not. The web server is a singleton whose owner is
non-deterministic: processes detect EADDRINUSE and take it over on a
health-check loop, so project identity flaps depending on which process
happens to own the port. A process-scoped value cannot back a shared
identity.

Make identity directory-driven instead. Drop an empty
`.opencode-mem-project` marker file at the workspace root; every session
started anywhere underneath then resolves onto that root. The marker is
looked up by walking up from the working directory that every code path
already passes in (the plugin's ctx.directory, the web API's
process.cwd()), so identity is bound to where the session runs, not to
which process it runs in -- stable across the whole process pool
regardless of env vars or web-server ownership.

The walk lives in getProjectRoot/getProjectIdentity, the lowest-level
entry points, so plugin load, auto-capture, user-profile learning,
compaction and the web API all pick it up without per-callsite changes.
The marker takes precedence over git detection; when it hits, the
underlying sub-repo's git remote is intentionally left unset, since it
would describe only one nested repository and be misleading for the
grouped workspace. Without a marker, behaviour is unchanged (the
existing worktree and nested-path tests still pass).

The marker lookup is resolved exactly once per getProjectTagInfo call:
the git-only fallbacks are factored into private helpers so root and
identity derive from a single ancestry walk instead of three.

Tests cover the collapse of sibling git repos onto one marker root, deep
nested resolution, the marker winning over an inner git repo and
dropping its remote, the innermost-marker-wins case, the null case, and
the backward-compatible no-marker behaviour. The marker is documented in
the README, including why the env-var/config approach was rejected.

Typical usage at the workspace root:

    touch ~/my-workspace/.opencode-mem-project

Assisted-by: opencode:glm-5.2
Signed-off-by: Chen Linxuan
---
 README.md                   | 40 ++++++++++++++++++
 src/services/tags.ts        | 66 ++++++++++++++++++++++++++---
 tests/project-scope.test.ts | 84 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 183 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index a3f4b2f..6d37d1b 100644
--- a/README.md
+++ b/README.md
@@ -109,6 +109,46 @@ Configure at `~/.config/opencode/opencode-mem.jsonc`:
 - `scope: "all-projects"`: query `search` / `list` across all project shards.
 - `memory.defaultScope` sets the default query scope when no explicit scope is provided.
 
+### Sharing One Project Memory Across Nested Repos
+
+By default a project is identified by its enclosing git repository, so every
+physical git repo gets its own isolated memory store. That is wrong for
+multi-repo workspaces (trees managed by Google [`repo`](https://gerrit.googlesource.com/git-repo/+/HEAD/Docs/manual-repo.md),
+monorepos, or any layout where several nested git repositories belong to one
+logical project) because each sub-repository would be siloed.
+
+Drop an empty **`.opencode-mem-project`** marker file at the workspace root:
+
+```
+my-workspace/
+├── .opencode-mem-project   ← workspace root
+├── kernel/                 (own git repo)
+├── userspace/              (own git repo)
+└── tools/                  (own git repo)
+```
+
+Every session started anywhere underneath the marker then resolves onto that
+root and shares one memory store, regardless of which sub-repo the working
+directory lives in:
+
+```sh
+touch ~/my-workspace/.opencode-mem-project
+```
+
+The marker is looked up by walking up from the working directory that every
+code path already passes in (the plugin's working directory, the web API's
+`process.cwd()`), so identity is **directory-driven and process-independent**.
+It does not rely on environment variables or a global config value, which
+would be unreliable here: opencode-mem runs across multiple opencode processes
+that share a single web server, and only some of those processes carry a
+given env var. With the marker, the project root is always derived from where
+the session actually runs.
+
+The marker takes precedence over git detection. When it is present, the
+sub-repo's own git remote is intentionally ignored (it would describe only one
+nested repository). Without a marker, behavior is unchanged (git-based
+identity).
+
 ### Auto-Capture AI Provider
 
 **Recommended:** Use any provider that is already authenticated in opencode (no separate API key needed in this plugin):
diff --git a/src/services/tags.ts b/src/services/tags.ts
index d9722cc..4b89368 100644
--- a/src/services/tags.ts
+++ b/src/services/tags.ts
@@ -1,13 +1,50 @@
 import { createHash } from "node:crypto";
 import { execSync } from "node:child_process";
 import { CONFIG } from "../config.js";
-import { normalize, resolve, isAbsolute, basename, dirname } from "node:path";
+import { normalize, resolve, isAbsolute, basename, dirname, join } from "node:path";
 import { realpathSync, existsSync } from "node:fs";
 
 function sha256(input: string): string {
   return createHash("sha256").update(input).digest("hex").slice(0, 16);
 }
 
+/**
+ * Marker file whose presence pins a directory as the opencode-mem project root.
+ *
+ * A multi-repo workspace (e.g. a tree managed by Google `repo`, a monorepo,
+ * or any layout where several nested git repositories should share one memory
+ * store) drops this file at the workspace root. Every session started anywhere
+ * underneath then resolves onto that root instead of onto whichever physical
+ * git repository the working directory happens to live in.
+ *
+ * Unlike an environment variable or a config-file value, the marker is found
+ * by walking up from the working directory that every code path already passes
+ * in (the plugin's `ctx.directory`, the web API's `process.cwd()`), so project
+ * identity never depends on which long-lived opencode process happens to own
+ * the shared web server.
+ */
+const PROJECT_MARKER = ".opencode-mem-project";
+
+/**
+ * Walk up from `directory` (inclusive) to the filesystem root looking for the
+ * {@link PROJECT_MARKER}. Returns the first directory that contains it, or
+ * `null` when no marker is found so the caller can fall back to git detection.
+ */
+export function findMarkerProjectRoot(directory: string): string | null {
+  let dir = resolve(directory);
+  while (true) {
+    if (existsSync(join(dir, PROJECT_MARKER))) {
+      return dir;
+    }
+    const parent = dirname(dir);
+    if (parent === dir) {
+      break;
+    }
+    dir = parent;
+  }
+  return null;
+}
+
 export interface TagInfo {
   tag: string;
   displayName: string;
@@ -94,7 +131,9 @@ export function getGitTopLevel(directory: string): string | null {
   }
 }
 
-export function getProjectRoot(directory: string): string {
+// Git-only fallbacks, kept separate so the marker-aware entry points below
+// can short-circuit on a marker and reuse these without re-running detection.
+function getGitProjectRoot(directory: string): string {
   const commonDir = getGitCommonDir(directory);
   if (commonDir && basename(commonDir) === ".git") {
     return dirname(commonDir);
@@ -108,7 +147,7 @@ export function getProjectRoot(directory: string): string {
   return directory;
 }
 
-export function getProjectIdentity(directory: string): string {
+function getGitProjectIdentity(directory: string): string {
   const commonDir = getGitCommonDir(directory);
   if (commonDir) {
     return `git-common:${commonDir}`;
@@ -122,6 +161,15 @@ export function getProjectIdentity(directory: string): string {
   return `path:${normalize(directory)}`;
 }
 
+export function getProjectRoot(directory: string): string {
+  return findMarkerProjectRoot(directory) ?? getGitProjectRoot(directory);
+}
+
+export function getProjectIdentity(directory: string): string {
+  const markerRoot = findMarkerProjectRoot(directory);
+  return markerRoot ? `path:${markerRoot}` : getGitProjectIdentity(directory);
+}
+
 export function getProjectName(directory: string): string {
   const normalized = normalize(directory).replace(/\\/g, "/");
   const parts = normalized.split("/").filter((p) => p && p !== ".");
@@ -151,10 +199,16 @@ export function getUserTagInfo(): TagInfo {
 }
 
 export function getProjectTagInfo(directory: string): TagInfo {
-  const projectRoot = getProjectRoot(directory);
+  // Resolve the marker exactly once and derive root + identity from it, so a
+  // single getProjectTagInfo call never walks the ancestry more than once.
+  const markerRoot = findMarkerProjectRoot(directory);
+  const projectRoot = markerRoot ?? getGitProjectRoot(directory);
   const projectName = getProjectName(projectRoot);
-  const gitRepoUrl = getGitRepoUrl(directory);
-  const projectIdentity = getProjectIdentity(projectRoot);
+  // When a marker pins the project root, any git remote belongs to a single
+  // nested sub-repo and would be misleading for the grouped workspace, so
+  // leave it unset.
+  const gitRepoUrl = markerRoot ? null : getGitRepoUrl(directory);
+  const projectIdentity = markerRoot ? `path:${markerRoot}` : getGitProjectIdentity(projectRoot);
 
   return {
     tag: `${CONFIG.containerTagPrefix}_project_${sha256(projectIdentity)}`,
diff --git a/tests/project-scope.test.ts b/tests/project-scope.test.ts
index f2b2a46..35410c4 100644
--- a/tests/project-scope.test.ts
+++ b/tests/project-scope.test.ts
@@ -3,7 +3,7 @@ import { mkdtempSync, rmSync, writeFileSync, mkdirSync } from "node:fs";
 import { basename, join } from "node:path";
 import { tmpdir } from "node:os";
 import { execSync } from "node:child_process";
-import { getProjectTagInfo } from "../src/services/tags.js";
+import { findMarkerProjectRoot, getProjectTagInfo } from "../src/services/tags.js";
 
 const createdDirs: string[] = [];
 
@@ -75,3 +75,85 @@ describe("project scope identity", () => {
     expect(rootTag.projectPath).toBe(nestedTag.projectPath);
   });
 });
+
+describe("project marker (.opencode-mem-project)", () => {
+  // Build a workspace containing several independent git repositories, like a
+  // tree managed by Google `repo` or a monorepo checkout. Without a marker
+  // each sub-repo is its own project; with one they all collapse onto the
+  // workspace root.
+  function createMultiRepoWorkspace(): {
+    workspaceDir: string;
+    repoA: string;
+    repoB: string;
+  } {
+    const workspaceDir = mkdtempSync(join(tmpdir(), "opencode-mem-ws-"));
+    createdDirs.push(workspaceDir);
+    const repoA = join(workspaceDir, "repo-a");
+    const repoB = join(workspaceDir, "repo-b");
+    for (const repo of [repoA, repoB]) {
+      mkdirSync(repo, { recursive: true });
+      run("git init", repo);
+      run("git config user.email test@example.com", repo);
+      run("git config user.name Test User", repo);
+    }
+    return { workspaceDir, repoA, repoB };
+  }
+
+  it("without a marker, sibling git repos get separate project tags", () => {
+    const { repoA, repoB } = createMultiRepoWorkspace();
+
+    expect(getProjectTagInfo(repoA).tag).not.toBe(getProjectTagInfo(repoB).tag);
+  });
+
+  it("collapses nested git repos onto the marker root", () => {
+    const { workspaceDir, repoA, repoB } = createMultiRepoWorkspace();
+    writeFileSync(join(workspaceDir, ".opencode-mem-project"), "");
+
+    const rootTag = getProjectTagInfo(workspaceDir);
+    const aTag = getProjectTagInfo(repoA);
+    const bTag = getProjectTagInfo(repoB);
+
+    expect(aTag.tag).toBe(rootTag.tag);
+    expect(bTag.tag).toBe(rootTag.tag);
+    expect(aTag.tag).toBe(bTag.tag);
+    expect(aTag.projectPath).toBe(workspaceDir);
+    expect(aTag.projectName).toBe(basename(workspaceDir));
+  });
+
+  it("resolves deep nested paths up to the marker root", () => {
+    const { workspaceDir, repoA } = createMultiRepoWorkspace();
+    writeFileSync(join(workspaceDir, ".opencode-mem-project"), "");
+    const deep = join(repoA, "src", "features", "memory");
+    mkdirSync(deep, { recursive: true });
+
+    const deepTag = getProjectTagInfo(deep);
+    const rootTag = getProjectTagInfo(workspaceDir);
+
+    expect(deepTag.tag).toBe(rootTag.tag);
+    expect(deepTag.projectPath).toBe(workspaceDir);
+  });
+
+  it("the marker wins over an inner git repo and drops its remote url", () => {
+    const { workspaceDir, repoA } = createMultiRepoWorkspace();
+    // Give the inner repo a remote so we can assert it is intentionally ignored
+    // once the workspace marker takes over identity.
+    run("git remote add origin https://example.com/repo-a.git", repoA);
+    writeFileSync(join(workspaceDir, ".opencode-mem-project"), "");
+
+    const tag = getProjectTagInfo(repoA);
+
+    expect(tag.projectPath).toBe(workspaceDir);
+    expect(tag.gitRepoUrl).toBeUndefined();
+  });
+
+  it("findMarkerProjectRoot returns null without a marker, the ancestor when present", () => {
+    const { workspaceDir, repoA } = createMultiRepoWorkspace();
+
+    expect(findMarkerProjectRoot(repoA)).toBeNull();
+
+    writeFileSync(join(workspaceDir, ".opencode-mem-project"), "");
+    expect(findMarkerProjectRoot(repoA)).toBe(workspaceDir);
+    // A session started exactly at the marker root still resolves to itself.
+    expect(findMarkerProjectRoot(workspaceDir)).toBe(workspaceDir);
+  });
+});
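
The commit message lists an innermost-marker-wins case, but the test hunk above does not show one. The following is a minimal standalone sketch of how that behaviour could be checked. It is not part of the patch: `findMarkerProjectRoot` is copied from the hunk in `src/services/tags.ts` so the snippet runs on its own, while `demoNestedMarkers` and the directory names (`vendor/sub-workspace`, `pkg/src`, `tools`) are hypothetical and exist only for illustration.

```typescript
import { existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { dirname, join, resolve } from "node:path";
import { tmpdir } from "node:os";

const PROJECT_MARKER = ".opencode-mem-project";

// Same upward walk as in the patch, reproduced here so the sketch is self-contained.
function findMarkerProjectRoot(directory: string): string | null {
  let dir = resolve(directory);
  while (true) {
    if (existsSync(join(dir, PROJECT_MARKER))) {
      return dir;
    }
    const parent = dirname(dir);
    if (parent === dir) {
      return null; // reached the filesystem root without finding a marker
    }
    dir = parent;
  }
}

// Hypothetical layout: an outer workspace with a marker, plus an inner
// workspace nested inside it that carries its own marker.
function demoNestedMarkers(): {
  outer: string;
  inner: string;
  fromDeep: string | null;
  fromSibling: string | null;
} {
  const outer = mkdtempSync(join(tmpdir(), "opencode-mem-nested-"));
  try {
    const inner = join(outer, "vendor", "sub-workspace");
    const deep = join(inner, "pkg", "src");
    const sibling = join(outer, "tools");
    mkdirSync(deep, { recursive: true });
    mkdirSync(sibling, { recursive: true });
    writeFileSync(join(outer, PROJECT_MARKER), "");
    writeFileSync(join(inner, PROJECT_MARKER), "");
    return {
      outer,
      inner,
      // The walk starts below the inner marker, so it stops there first.
      fromDeep: findMarkerProjectRoot(deep),
      // This path only has the outer marker above it.
      fromSibling: findMarkerProjectRoot(sibling),
    };
  } finally {
    // Remove the temporary tree whether or not the lookups succeeded.
    rmSync(outer, { recursive: true, force: true });
  }
}

const result = demoNestedMarkers();
console.log("deep path resolves to inner marker:", result.fromDeep === result.inner);
console.log("sibling path resolves to outer marker:", result.fromSibling === result.outer);
```

Because the walk returns at the first directory containing the marker, a nested marker shadows any marker further up the tree. Under that reading, a workspace nested inside another marked workspace would get its own memory store.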