Run safe-outputs MCP in the gh-aw node container by Copilot · Pull Request #39100 · github/gh-aw

Copilot · 2026-06-13T15:42:29Z

This changes safe-outputs from a separately started HTTP sidecar into a first-class MCP server entry. The generated workflow now runs safe-outputs inside the gh-aw-node container with write access to the workspace, safe-outputs runtime files, and the safe-outputs log directory.

What changed
- Replaced the generated safe-outputs HTTP server config with a containerized stdio MCP server definition.
- Added a dedicated published container constant for ghcr.io/github/gh-aw-node.
- Mounted the required writable paths directly into the MCP server:
  - workspace
  - ${RUNNER_TEMP}/gh-aw/safeoutputs
  - /tmp/gh-aw/mcp-logs/safeoutputs
Workflow generation
- Removed the dedicated “generate safe-outputs server config” / “start safe-outputs HTTP server” flow.
- Kept safe-outputs runtime file generation (config.json, tools.json) but made those files inputs to the MCP container instead of a host-side server process.
- Dropped the now-unused safe-outputs port/API-key wiring from generated env and gateway setup.
MCP server model
- Safe-outputs is now emitted alongside the rest of the MCP server list instead of being bootstrapped out-of-band.
- The generated MCP entry passes through the runtime env needed by the JS implementation (workspace, runner temp, safe-outputs config/tools paths, log dir, repo context).

Generated shape

{
  "safeoutputs": {
    "container": "ghcr.io/github/gh-aw-node",
    "mounts": [
      "${GITHUB_WORKSPACE}:${GITHUB_WORKSPACE}:rw",
      "${RUNNER_TEMP}/gh-aw/safeoutputs:${RUNNER_TEMP}/gh-aw/safeoutputs:rw",
      "/tmp/gh-aw/mcp-logs/safeoutputs:/tmp/gh-aw/mcp-logs/safeoutputs:rw"
    ],
    "args": ["-w", "${GITHUB_WORKSPACE}"]
  }
}

Related updates
- Updated renderer and setup-generation expectations to reflect stdio container transport instead of HTTP transport.
- Refreshed the affected golden outputs and focused MCP/safe-outputs workflow expectations.

Changeset\n\n- Type: patch\n- Description: Run safe-outputs as a containerized MCP server inside the runtime instead of a separate HTTP sidecar.

Generated by 📋 Changeset Generator for issue #39100 · 13.6 AIC · ⊞ 14.8K · ◷

✨ PR Review Safe Output Test - Run 27471836462

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
android.clients.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

💥 [THE END] — Illustrated by Smoke Claude · 84.9 AIC · ⌖ 7.94 AIC · ⊞ 8.1K · ◷

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

github-actions · 2026-06-13T16:05:34Z

💫 TO BE CONTINUED... Smoke Claude failed to deliver outputs! Our hero faces unexpected challenges...

github-actions · 2026-06-13T16:05:36Z

🚀 Smoke Antigravity MISSION COMPLETE! Antigravity has spoken. ✨

github-actions · 2026-06-13T16:05:37Z

❌ Design Decision Gate 🏗️ failed to deliver outputs during design decision gate check.

github-actions · 2026-06-13T16:05:40Z

🧪 Test Quality Sentinel completed test quality analysis.

github-actions · 2026-06-13T16:05:40Z

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

github-actions · 2026-06-13T16:05:40Z

✅ All tools validated successfully! Agent Container Smoke Test confirms agent container is ready.

github-actions · 2026-06-13T16:05:42Z

⚠️ Smoke Gemini failed. Gemini encountered unexpected challenges...

github-actions · 2026-06-13T16:05:54Z

🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧

github-actions · 2026-06-13T16:06:08Z

🌑 The shadows whisper... Smoke Codex failed to deliver outputs. The oracle requires further meditation...

github-actions · 2026-06-13T16:06:25Z

📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing...

github-actions · 2026-06-13T16:06:30Z

📰 BREAKING: Smoke Copilot - AOAI (Entra) is now investigating this pull request. Sources say the story is developing...

github-actions · 2026-06-13T16:06:34Z

📰 BREAKING: Smoke Copilot - AOAI (apikey) is now investigating this pull request. Sources say the story is developing...

github-actions · 2026-06-13T16:09:28Z

Agent Container Tool Check

Tool	Status	Version
bash	✅	5.2.21
sh	✅	available
git	✅	2.54.0
jq	✅	1.7
yq	✅	4.53.3
curl	✅	8.5.0
gh	✅	2.93.0
node	✅	22.22.3
python3	✅	3.11.15 (PyPy 7.3.23)
go	✅	1.24.13
java	✅	21.0.11 (Temurin)
dotnet	✅	10.0.300

Result: 12/12 tools available ✅

Overall Status: PASS

🔧 Tool validation by Agent Container Smoke Test · 40.1 AIC · ⌖ 12.6 AIC · ⊞ 28.1K · ◷

github-actions · 2026-06-13T16:10:56Z

Warning

The push_to_pull_request_branch operation failed: SECURITY: Post-apply file-protection check failed. The patch applied files that were not detected by the pre-apply parser: .changeset/patch-containerize-safe-outputs-mcp-server.md. This may indicate a patch-parser bypass attempt. Aborting push.. The code changes were not applied.

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (233 new lines across pkg/ business directories) but does not have a linked Architecture Decision Record (ADR).

📄 Draft ADR committed: docs/adr/39100-run-safe-outputs-mcp-as-containerized-stdio-server.md — review and complete it before merging.

🔒 This PR cannot merge until an ADR is linked in the PR body.

📋 What to do next

Review the draft ADR committed to your branch — it was generated from the PR diff
Complete the missing sections — resolve the [TODO: verify] note on Alternative 2, refine the decision rationale, and confirm the real alternatives you considered
Commit the finalized ADR to docs/adr/ on your branch
Reference the ADR in this PR body by adding a line such as:

ADR: ADR-39100: Run safe-outputs MCP as a containerized stdio server

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

❓ Why ADRs Matter

"AI made me procrastinate on key design decisions. Because refactoring was cheap, I could always say I will deal with this later. Deferring decisions corroded my ability to think clearly."

ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you.

📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

Context — What is the problem? What forces are at play?
Decision — What did you decide? Why?
Alternatives Considered — What else could have been done?
Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 0042-use-postgresql.md for PR #42).

🔒 This gate blocks merge until an ADR is linked in the PR body.

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · 79 AIC · ⌖ 9.78 AIC · ⊞ 13.8K · ◷

github-actions · 2026-06-13T16:13:58Z

Smoke Test: Codex - 27471836485

rejig docs #1 GitHub reads: ✅
Add workflow: githubnext/agentics/weekly-research #2 Serena symbols: ✅
Add workflow: githubnext/agentics/weekly-research #3 Playwright page title: ✅
Add workflow: githubnext/agentics/weekly-research #4 Web fetch MCP: ❌
Overall: FAIL

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
android.clients.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex · 85.6 AIC · ⊞ 10.6K · ◷

github-actions · 2026-06-13T16:13:58Z

Comment Memory

Silent caches wake
Builds and browser lights align
Smoke test leaves a trail

Note

This comment is managed by comment memory.

It stores persistent context for this thread in the code block at the top of this comment.
Edit only the text inside the backtick fences; workflow metadata and the footer are regenerated automatically.

Learn more about comment memory

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
android.clients.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex · 85.6 AIC · ⊞ 10.6K · ◷

github-actions · 2026-06-13T16:14:45Z

💥 Smoke Test: Claude — Run 27471836462

Core #1-12: ✅✅✅✅✅✅✅✅✅✅✅✅
PR Review #13-19: 13✅ 14✅ 15✅ 16✅ 17✅ 18✅ 19⚠️(skipped)

Overall: PARTIAL (all run; #19 close-PR skipped — no safe test PR). Claude engine nominal! 🚀

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
android.clients.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

💥 [THE END] — Illustrated by Smoke Claude · 84.9 AIC · ⌖ 7.94 AIC · ⊞ 8.1K · ◷

github-actions

💥 Automated smoke test review - all systems nominal!

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
android.clients.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

💥 [THE END] — Illustrated by Smoke Claude · 84.9 AIC · ⌖ 7.94 AIC · ⊞ 8.1K

github-actions · 2026-06-13T16:14:47Z

 // Using node:lts-alpine provides the latest LTS version with minimal footprint
 const DefaultNodeAlpineLTSImage = "node:lts-alpine"

+// DefaultGhAwNodeImage is the published gh-aw Node container image used for


Nice — documenting DefaultGhAwNodeImage with the rationale for needing Node.js + git is helpful for future maintainers. Consider noting the expected tag/version policy here too.

github-actions · 2026-06-13T16:14:47Z

 const DefaultWorkspaceMount = "\\${GITHUB_WORKSPACE}:\\${GITHUB_WORKSPACE}:rw"

+// DefaultSafeOutputsMount is the mount path for safe-outputs runtime files
+// (config.json, tools.json, outputs.jsonl, upload-artifacts/) in the runner temp directory.


The safe-outputs mount constants are clear. Good use of GhAwRootDirShell for consistency across mounts.

github-actions

Skills-Based Review 🧠

Applied /grill-with-docs, /improve-codebase-architecture, /tdd, and /zoom-out — commenting with improvement suggestions; no blocking correctness issues found.

📋 Key Themes & Highlights

Key Themes

Stale comments: The renderSafeOutputsTOML function comment still says "HTTP transport instead of stdio" — the exact opposite of what the PR implements.
Test name drift: TestSafeOutputsAPIKeyImmediateMasking was not renamed despite its body and doc comment being completely rewritten to test the absence of an API key.
Dual sources of truth for env vars: TOML renderer hardcodes a raw env_vars string; JSON renderer uses a typed struct slice. They agree today across 14 vars, but will silently diverge on the next addition.
Untagged container image: ghcr.io/github/gh-aw-node with no version tag contrasts with the gateway pin (v0.3.25). Worth aligning policy or documenting the intent.
Unexplained escaping asymmetry in TOML: args uses $GITHUB_WORKSPACE while mounts (via DefaultWorkspaceMount) uses \${GITHUB_WORKSPACE} — a brief comment would prevent future confusion.

Positive Highlights

✅ Clean removal of ~53 lines of HTTP server bootstrapping (port, API key, bash export block) — significant complexity reduction.
✅ RUNNER_TEMP is now correctly threaded through to the gateway docker command (golden file updates confirm this).
✅ buildSafeOutputsConfigRuntimeEnvVars / buildSafeOutputsConfigRuntimeData are still used for config.json generation — the regression concern (dynamic GH_AW_INPUT_* vars) is correctly handled by pre-resolving into config.json before the container starts.
✅ New mount constants (DefaultSafeOutputsMount, DefaultSafeOutputsLogMount) are well-named and follow existing conventions.
✅ Tests properly assert the new behaviour (container image, mounts, absence of HTTP artefacts) and the renamed TestSafeOutputsMCPContainerUsesRuntimePaths is a good improvement.

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 585.4 AIC · ⌖ 40.9 AIC · ⊞ 29.4K

github-actions · 2026-06-13T16:16:20Z

-	yaml.WriteString("          \n")
-	yaml.WriteString("          [mcp_servers." + constants.SafeOutputsMCPServerID.String() + ".headers]\n")
-	yaml.WriteString("          Authorization = \"$GH_AW_SAFE_OUTPUTS_API_KEY\"\n")
+	yaml.WriteString("          container = \"" + constants.DefaultGhAwNodeImage + "\"\n")


[/grill-with-docs] Stale function comment one line above: // Now uses HTTP transport instead of stdio, similar to mcp-scripts now says the exact opposite of what the function does.

💡 Suggested fix

Change the comment to:

// renderSafeOutputsTOML generates Safe Outputs MCP configuration in TOML format. // Uses containerized stdio transport running in the published gh-aw node image.

Left unchanged, this misleads the next reader into thinking this function configures HTTP, which is the old approach being replaced.

github-actions · 2026-06-13T16:16:20Z

-// is masked immediately after generation, before any other operations.
+// TestSafeOutputsNoLongerGeneratesHTTPAPIKey verifies that safe-outputs no longer
+// creates an HTTP server API key now that it runs as a stdio MCP container.
 func TestSafeOutputsAPIKeyImmediateMasking(t *testing.T) {


[/tdd] Test name TestSafeOutputsAPIKeyImmediateMasking no longer describes what it tests. The updated doc comment above says it now verifies the absence of an HTTP API key, but the function name still implies it is testing masking behaviour.

💡 Suggested rename

func TestSafeOutputsNoLongerGeneratesHTTPAPIKey(t *testing.T) {

Test names are the first thing a failing CI shows. A reader hitting a failure in TestSafeOutputsAPIKeyImmediateMasking will look for API-key masking logic that no longer exists.

github-actions · 2026-06-13T16:16:20Z

+	yaml.WriteString("          container = \"" + constants.DefaultGhAwNodeImage + "\"\n")
+	yaml.WriteString("          mounts = [\"" + constants.DefaultWorkspaceMount + "\", \"" + constants.DefaultSafeOutputsMount + "\", \"" + constants.DefaultSafeOutputsLogMount + "\"]\n")
+	yaml.WriteString("          args = [\"-w\", \"$GITHUB_WORKSPACE\"]\n")
+	yaml.WriteString("          env_vars = [\"DEBUG\", \"DEFAULT_BRANCH\", \"GH_AW_ASSETS_ALLOWED_EXTS\", \"GH_AW_ASSETS_BRANCH\", \"GH_AW_ASSETS_MAX_SIZE_KB\", \"GH_AW_MCP_LOG_DIR\", \"GH_AW_SAFE_OUTPUTS\", \"GH_AW_SAFE_OUTPUTS_CONFIG_PATH\", \"GH_AW_SAFE_OUTPUTS_TOOLS_PATH\", \"GITHUB_REPOSITORY\", \"GITHUB_SERVER_URL\", \"GITHUB_TOKEN\", \"GITHUB_WORKSPACE\", \"RUNNER_TEMP\"]\n")


[/improve-codebase-architecture] The TOML renderer hardcodes env_vars as a raw string literal, while the JSON renderer (renderSafeOutputsMCPConfigWithOptions) drives the same list from a typed struct slice. These two sources currently agree on 14 vars, but they will drift the moment someone adds a var to one and forgets the other.

💡 Suggested approach

Extract a shared slice in mcp_renderer_builtin.go:

var safeOutputsEnvVarNames = []string{ "DEBUG", "DEFAULT_BRANCH", "GH_AW_ASSETS_ALLOWED_EXTS", "GH_AW_ASSETS_BRANCH", "GH_AW_ASSETS_MAX_SIZE_KB", "GH_AW_MCP_LOG_DIR", "GH_AW_SAFE_OUTPUTS", "GH_AW_SAFE_OUTPUTS_CONFIG_PATH", "GH_AW_SAFE_OUTPUTS_TOOLS_PATH", "GITHUB_REPOSITORY", "GITHUB_SERVER_URL", "GITHUB_TOKEN", "GITHUB_WORKSPACE", "RUNNER_TEMP", }

The TOML renderer builds its env_vars line from this slice; the JSON renderer iterates the same slice. A future add is then a single-line change with no risk of format-specific drift.

github-actions · 2026-06-13T16:16:20Z

+// DefaultGhAwNodeImage is the published gh-aw Node container image used for
+// JavaScript-based MCP servers that need Node.js and git plus workspace-mounted
+// repository scripts at runtime.
+const DefaultGhAwNodeImage = "ghcr.io/github/gh-aw-node"


[/zoom-out] ghcr.io/github/gh-aw-node has no version tag, implying latest. The gateway image is pinned (ghcr.io/github/gh-aw-mcpg:v0.3.25). For a safety-critical MCP server that handles write operations (issues, PRs, assets), floating on latest means a silent image change could alter tool behaviour between runs.

💡 Options

Pin a version tag (preferred for production stability): "ghcr.io/github/gh-aw-node:v1.2.3" — same pattern as the gateway.

Document the intent if latest is deliberate (e.g. the image is always forward-compatible and the team controls both the image and this constant together).

Either way, a comment explaining the versioning policy would help reviewers of future PRs understand why this differs from the gateway pin.

github-actions · 2026-06-13T16:16:20Z

-	yaml.WriteString("          Authorization = \"$GH_AW_SAFE_OUTPUTS_API_KEY\"\n")
+	yaml.WriteString("          container = \"" + constants.DefaultGhAwNodeImage + "\"\n")
+	yaml.WriteString("          mounts = [\"" + constants.DefaultWorkspaceMount + "\", \"" + constants.DefaultSafeOutputsMount + "\", \"" + constants.DefaultSafeOutputsLogMount + "\"]\n")
+	yaml.WriteString("          args = [\"-w\", \"$GITHUB_WORKSPACE\"]\n")


[/grill-with-docs] The TOML args field uses unescaped $GITHUB_WORKSPACE, while the JSON args at line 241 always uses \${GITHUB_WORKSPACE} (backslash-escaped) regardless of engine. Meanwhile, TOML mounts uses the backslash-escaped DefaultWorkspaceMount constant. The asymmetry within TOML (escaped mounts, unescaped args) is not explained and could mislead the next person editing either renderer.

💡 Suggestion

Add a short inline comment documenting the intent:

// TOML: args expand shell vars directly; mounts use \${...} for gateway-side expansion. yaml.WriteString(" args = [\"-w\", \"$GITHUB_WORKSPACE\"]\n")

Or if the two renderers should use identical escaping for args, align them so the behaviour is consistent and auditable from a single place.

github-actions · 2026-06-13T16:16:54Z

Run safe-outputs MCP in the gh-aw node container
PASS — §27471858644
1✅ 2✅ 3✅ 4✅ 5✅
6✅ 7✅ 8✅ 9✅ 10✅
11✅ 12✅ 13✅ 14✅ 15✅
Author: app/copilot-swe-agent
Assignees: pelikhan, Copilot

Warning

Firewall blocked 5 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

📰 BREAKING: Report filed by Smoke Copilot · 250.4 AIC · ⌖ 23.4 AIC · ⊞ 20.4K · ◷

github-actions · 2026-06-13T16:17:36Z

🧪 Test Quality Sentinel Report

✅ Test Quality Score: 83/100 — Excellent

Analyzed 12 test(s): 12 design, 0 implementation, 0 guideline violation(s). This PR migrates safe-outputs from HTTP transport to containerized stdio MCP; all tests faithfully track the new behavioral contracts.

📊 Metrics & Test Classification (12 tests analyzed)

Metric	Value
New/modified tests analyzed	12
✅ Design tests (behavioral contracts)	12 (100%)
⚠️ Implementation tests (low value)	0 (0%)
Tests with error/edge cases	9 (75%)
Duplicate test clusters	0
Test inflation detected	Yes (context: migration/simplification PR — see notes)
🚨 Coding-guideline violations	0

Test Classification Details

Test	File	Classification	Notes
`TestSafeOutputsAPIKeyImmediateMasking`	`mcp_api_key_masking_test.go`	✅ Design	Negative: no HTTP config/server; positive: container present
`TestRenderSafeOutputsMCPConfigWithOptions`	`mcp_config_refactor_test.go`	✅ Design	Table-driven (Copilot & Claude); updated expected/unexpected strings
`TestRenderSafeOutputsMCPConfigTOML`	`mcp_config_refactor_test.go`	✅ Design	TOML rendering contract; asserts container fields, negates HTTP fields
`TestRenderSafeOutputsMCPConfigTOMLStableAcrossSandboxModes` (new)	`mcp_config_refactor_test.go`	✅ Design	3 subtests (nil, agent-enabled, agent-disabled); sandbox-invariant container config
`TestRenderSafeOutputsMCP_JSON_Copilot`	`mcp_renderer_test.go`	✅ Design	JSON contract: stdio type, container image, mounts, args, env vars
`TestRenderSafeOutputsMCP_JSON_Claude`	`mcp_renderer_test.go`	✅ Design	JSON contract: no type field, container, direct shell vars
`TestRenderSafeOutputsMCP_TOML`	`mcp_renderer_test.go`	✅ Design	TOML contract: container, mounts, args, env_vars; rejects HTTP fields
`TestSafeOutputsMCPContainerUsesRuntimePaths` (new)	`mcp_setup_generator_test.go`	✅ Design	Regression test (compile → read YAML); 9 assert/notContains with descriptive messages
`TestSafeOutputsDynamicAllowedRepos`	`safe_outputs_dynamic_allowed_repos_test.go`	✅ Design	Count threshold relaxed 2→1; env placeholder contract preserved
`TestSafeOutputsDynamicAllowedReposSecrets`	`safe_outputs_dynamic_allowed_repos_test.go`	✅ Design	Same relaxation for secrets env var propagation
`TestSafeOutputsMCPServerIntegration`	`safe_outputs_mcp_integration_test.go`	✅ Design	E2E integration: container, mounts, no dedicated HTTP server step, config file written
`TestSafeOutputsMCPServerWithCodexEngine`	`safe_outputs_mcp_integration_test.go`	✅ Design	Same contract for Codex/TOML engine variant

Language Support

Tests analyzed:

🐹 Go (*_test.go): 12 tests — unit (//go:build !integration) and integration (//go:build integration)
🟨 JavaScript (*.test.cjs, *.test.js): 0 tests changed in this PR

i️ Notes — No Action Required (2 items)

Test Inflation (mechanical penalty applied, migration context)

mcp_config_refactor_test.go (+39) vs mcp_config_builtin.go (+13) = 3.0×; mcp_setup_generator_test.go (+22) vs mcp_setup_generator.go (+5) = 4.4×. Both exceed the 2:1 threshold, triggering the −10 pt mechanical penalty. However, this is expected for a migration PR where production code is heavily simplified (mcp_setup_generator.go net: +5/−58) while tests update all assertions from HTTP to container format. Not a genuine quality concern.

`TestRenderSafeOutputsMCPConfigTOMLStableAcrossSandboxModes` — Consider a Clarifying Comment

All 3 subtests assert the same positive invariant (container present, mounts stable, no host rewriting). A brief inline comment explaining that sandbox-mode host rewriting used to vary this output would make the test's intent clearer for future readers.

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). No coding-guideline violations detected. All 12 tests enforce behavioral contracts from the user's perspective.

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

Assert on observable outputs, return values, or state changes
Cover error paths and boundary conditions
Would catch a behavioral regression if deleted
Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

Assert on internal function calls (mocking internals)
Only test the happy path with typical inputs
Break during legitimate refactoring even when behavior is correct
Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §27471832444

🧪 Test quality analysis by Test Quality Sentinel · 484.1 AIC · ⌖ 28.7 AIC · ⊞ 28.3K · ◷

github-actions

✅ Test Quality Sentinel: 83/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 12 tests enforce behavioral contracts. No coding-guideline violations detected.

github-actions · 2026-06-13T16:18:50Z

@copilot review all comments and address unresolved review feedback.
Fix the safe-outputs container env passthrough for dynamic placeholders, then rerun the smoke tests.

Generated by 👨‍🍳 PR Sous Chef · 50.2 AIC · ⌖ 1.01 AIC · ⊞ 17.4K · ◷

pelikhan · 2026-06-13T16:46:18Z

@copilot recompile all lock files

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-13T17:15:40Z

@copilot recompile all lock files

Done — recompiled all 246 lock files. Also fixed a strings.Index → strings.Cut lint issue in mcp_scripts_http_codex_test.go that was blocking the pre-push validation.

github-actions · 2026-06-13T20:52:44Z

@copilot review all comments and address unresolved review feedback.

Please resolve the open TOML escaping thread, then rebase/update from main and rerun checks.

Generated by 👨‍🍳 PR Sous Chef · 52.3 AIC · ⌖ 1.07 AIC · ⊞ 17.4K · ◷

pelikhan · 2026-06-13T22:00:30Z

@copilot merge main
Run pr-finisher skill

…e-outputs-mcp-server Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot · 2026-06-13T22:13:25Z

@copilot merge main
Run pr-finisher skill

Merged main into the branch (clean, no conflicts). All local checks pass (fmt, lint, test-unit). CI needs a human re-trigger before merge.

github-actions · 2026-06-13T22:58:35Z

``
CI needs a human re-trigger after the main merge; once it passes, this looks ready.

Generated by 👨‍🍳 PR Sous Chef · 42.1 AIC · ⌖ 1.02 AIC · ⊞ 17.4K · ◷

Containerize safe-outputs MCP server

ddb10e7

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI assigned Copilot and pelikhan Jun 13, 2026

Copilot AI changed the title ~~Configure safe-outputs MCP in gh-aw node container~~ Run safe-outputs MCP in the gh-aw node container Jun 13, 2026

Copilot created this pull request from a session on behalf of pelikhan June 13, 2026 15:42 View session

Copilot AI requested a review from pelikhan June 13, 2026 15:42

Copilot finished work on behalf of pelikhan June 13, 2026 15:42

pelikhan marked this pull request as ready for review June 13, 2026 16:04

Copilot AI review requested due to automatic review settings June 13, 2026 16:04

pelikhan added the smoke label Jun 13, 2026

Copilot started reviewing on behalf of pelikhan June 13, 2026 16:04 View session

github-actions Bot temporarily deployed to aoai-model June 13, 2026 16:05 Inactive

github-actions Bot temporarily deployed to aoai-model June 13, 2026 16:06 Inactive

This was referenced Jun 13, 2026

[aw] Smoke Antigravity produced no safe outputs #39104

Open

[aw] Smoke Pi produced no safe outputs #39105

Open

[aw] Smoke Gemini failed #39106

Open

Add changeset

0f12637

github-actions Bot mentioned this pull request Jun 13, 2026

[aw] Design Decision Gate 🏗️ failed #39108

Open

github-actions Bot mentioned this pull request Jun 13, 2026

Smoke Test: Claude - 27471836462 #39110

Closed

github-actions Bot added the smoke-claude label Jun 13, 2026

github-actions Bot reviewed Jun 13, 2026

View reviewed changes

This was referenced Jun 13, 2026

[aw] Smoke Codex has cache-memory miss misconfiguration #39112

Open

[aw] Smoke Copilot - AOAI (Entra) produced no safe outputs #39113

Open

github-actions Bot reviewed Jun 13, 2026

View reviewed changes

github-actions Bot mentioned this pull request Jun 13, 2026

Smoke Test: Copilot - 27471858644 #39115

Closed

github-actions Bot approved these changes Jun 13, 2026

View reviewed changes

Copilot started work on behalf of pelikhan June 13, 2026 17:05 View session

Recompile all lock files and fix strings.Cut lint issue

cc13039

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan June 13, 2026 17:15

github-actions Bot mentioned this pull request Jun 13, 2026

[PR Triage Report] PR Triage Report — 2026-06-13 (Run #27475627987) #39136

Open

github-actions Bot mentioned this pull request Jun 13, 2026

[Contribution Check Report] Contribution Check — 2026-06-13 #38997

Open

Copilot started work on behalf of pelikhan June 13, 2026 22:01 View session

Merge remote-tracking branch 'origin/main' into copilot/configure-saf…

f502a89

…e-outputs-mcp-server Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot finished work on behalf of pelikhan June 13, 2026 22:13

Conversation

Copilot AI commented Jun 13, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changeset\n\n- Type: patch\n- Description: Run safe-outputs as a containerized MCP server inside the runtime instead of a separate HTTP sidecar.

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 13, 2026

Uh oh!

github-actions Bot commented Jun 13, 2026

Uh oh!

github-actions Bot commented Jun 13, 2026

Uh oh!

github-actions Bot commented Jun 13, 2026

Agent Container Tool Check

Uh oh!

github-actions Bot commented Jun 13, 2026

🏗️ Design Decision Gate — ADR Required

Uh oh!

github-actions Bot commented Jun 13, 2026

Uh oh!

github-actions Bot commented Jun 13, 2026

Comment Memory

Uh oh!

github-actions Bot commented Jun 13, 2026

💥 Smoke Test: Claude — Run 27471836462

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Skills-Based Review 🧠

Key Themes

Positive Highlights

Uh oh!

github-actions Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 13, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Jun 13, 2026

Copilot AI commented Jun 13, 2026 •

edited by github-actions Bot

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

github-actions Bot commented Jun 13, 2026 •

edited

Loading

`TestRenderSafeOutputsMCPConfigTOMLStableAcrossSandboxModes` — Consider a Clarifying Comment