MCP stdio server respawned in an unbounded tight loop (no backoff / no max-retry) in 1.0.61

## Summary

After updating to **1.0.61**, a stdio MCP server began being spawned repeatedly in a tight loop — hundreds to thousands of child processes — without the CLI awaiting the prior process''s `initialize` handshake. There is no backoff, no max-retry cap, and no configuration knob to bound the behavior. This did not occur on the prior version.

## Environment

- Copilot CLI: **1.0.61** (Windows, WinGet `GitHub.Copilot`)
- OS: Windows 11
- Affected server type: **stdio** MCP server with a slow cold start

## Reproduction

1. Register a stdio MCP server whose startup is slow (multi-second cold start). In our case the launch shape was:
   ```json
   { "command": "cmd", "args": ["/c", "npx", "-y", "@scope/some-mcp-server"] }
   ```
   (`cmd -> npx -> node resolver -> node server`, plus an npm "resolve latest" network round-trip on every spawn.)
2. Start a session that exercises this server early/often.
3. Observe the CLI spawn a new server process roughly every ~60 ms without waiting for the previous one to complete its handshake.

## Observed

- ~8,300 spawn events in a single session; log file grew to **72 MB**.
- Deep process trees accumulate because each attempt is itself a multi-process tree that never gets a chance to finish initializing before the next attempt fires.
- Began immediately after the 1.0.61 update (binary timestamped the evening before the loop first appeared); not present on the prior release.

## Suspected root cause

A retry-on-MCP-startup behavior appears to have been introduced or changed in 1.0.61 that:

- does **not** await the in-flight process''s `initialize` response before retrying,
- has **no exponential backoff**,
- has **no maximum retry count**,
- exposes **no configuration** to tune any of the above (nothing in the changelog or MCP config docs).

When a server''s spawn latency exceeds the (apparently very short / zero) retry interval, retries pile up unbounded.

## Suggested fix

1. Await the prior spawn''s handshake (or a per-attempt timeout) before issuing a retry.
2. Add exponential backoff with a sane cap.
3. Add a max-retry ceiling, then surface a clear error instead of looping.
4. Optionally expose `maxRetries` / `retryDelayMs` / `backoff` in the MCP server config so operators can tune it.

## Workaround

Eliminated the slow spawn so the handshake wins the race: globally install + pin the package and launch its entry point directly (`node <path>/bin/server.js`) instead of `cmd /c npx -y ...`. Drops the launch from a 4-process tree + network check to a single process and stops the explosion — but this only masks the underlying unbounded-retry defect.

## Evidence available

72 MB session log with ~8,300 timestamped spawn events (redacted excerpt available on request).


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MCP stdio server respawned in an unbounded tight loop (no backoff / no max-retry) in 1.0.61 #3782

Summary

Environment

Reproduction

Observed

Suspected root cause

Suggested fix

Workaround

Evidence available

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

MCP stdio server respawned in an unbounded tight loop (no backoff / no max-retry) in 1.0.61 #3782

Description

Summary

Environment

Reproduction

Observed

Suspected root cause

Suggested fix

Workaround

Evidence available

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions