feat(multi-env): manifest provider + per-env tool family + system prompt block#5

Open
imryao wants to merge 33 commits into main from feature/multi-environment
@imryao imryao commented May 7, 2026

Summary

End-to-end multi-environment support in our agentserver codex fork. 33 commits ahead of main. Lets a single codex session run shell/file ops against multiple execution environments declared via a JSON manifest, with per-environment auth tokens and an LLM-facing tool family that explicitly carries environment_id.

What landed (4 layers)

P1 — manifest loader (exec-server crate)

  • ManifestEnvironmentProvider reads CODEX_EXEC_SERVERS_JSON, validates entries (no dupe ids, default present), resolves per-entry bearer tokens from env vars
  • Environment::remote_with_auth(...) constructor; Environment.description: Option<String>
  • LazyRemoteExecServerClient::with_auth(...) injects Authorization: Bearer … on connect
  • Legacy CODEX_EXEC_SERVER_URL path preserved byte-identically; manifest wins when both set (with tracing::warn!)
  • Integration tests in exec-server/tests/manifest_provider.rs
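The validation rules above (no duplicate ids, default present, per-entry token resolved from an env var) can be sketched roughly as follows. This is a minimal illustration, not the code in this branch: the `ManifestEntry` fields, the `ManifestError` variants, and the choice to treat an unset token variable as an error are all assumptions, and the real JSON schema of `CODEX_EXEC_SERVERS_JSON` is not shown here.

```rust
use std::collections::HashSet;

/// Hypothetical manifest entry; field names are assumptions, not the real schema.
#[derive(Debug, Clone, PartialEq)]
pub struct ManifestEntry {
    pub id: String,
    pub url: String,
    /// Name of the env var that holds this entry's bearer token, if any.
    pub token_env: Option<String>,
}

/// Hypothetical error type for the three checks described above.
#[derive(Debug, PartialEq)]
pub enum ManifestError {
    DuplicateId(String),
    MissingDefault(String),
    MissingToken { id: String, var: String },
}

/// Entry after token resolution.
#[derive(Debug, PartialEq)]
pub struct ResolvedEntry {
    pub id: String,
    pub url: String,
    pub bearer_token: Option<String>,
}

/// Validates entries and resolves tokens through `lookup`.
/// In production `lookup` would be `|v| std::env::var(v).ok()`;
/// it is injected here so the sketch is testable without touching the process env.
pub fn validate_manifest(
    entries: &[ManifestEntry],
    default_id: &str,
    lookup: impl Fn(&str) -> Option<String>,
) -> Result<Vec<ResolvedEntry>, ManifestError> {
    // Reject duplicate ids.
    let mut seen: HashSet<&str> = HashSet::new();
    for entry in entries {
        if !seen.insert(entry.id.as_str()) {
            return Err(ManifestError::DuplicateId(entry.id.clone()));
        }
    }
    // The declared default must be one of the entries.
    if !seen.contains(default_id) {
        return Err(ManifestError::MissingDefault(default_id.to_string()));
    }
    // Resolve each entry's token (assumption: an unset variable is an error).
    let mut resolved = Vec::with_capacity(entries.len());
    for entry in entries {
        let bearer_token = match &entry.token_env {
            None => None,
            Some(var) => match lookup(var) {
                Some(token) => Some(token),
                None => {
                    return Err(ManifestError::MissingToken {
                        id: entry.id.clone(),
                        var: var.clone(),
                    })
                }
            },
        };
        resolved.push(ResolvedEntry {
            id: entry.id.clone(),
            url: entry.url.clone(),
            bearer_token,
        });
    }
    Ok(resolved)
}
```

A resolved `bearer_token` is what a client would then send as `Authorization: Bearer <token>` on connect.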

P2 — env selection in core

  • TurnContext::select_environment(Option<&str>) -> Option<&TurnEnvironment> (None → primary)
  • UnifiedExecRequest.environment_id and ApplyPatchRequest.environment_id fields
  • Runtime call sites switched from primary_environment() → select_environment(env_id)
  • intercept_apply_patch(env_id) lets shell→apply_patch chains carry env
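The selection rule (`None` → primary, `Some(id)` → lookup by id) might look like the sketch below. The struct layouts are stand-ins, and two behaviours are assumptions rather than facts from this PR: that the primary environment is the first element of the list, and that an unknown id yields `None` instead of falling back to primary.

```rust
/// Stand-in for the real TurnEnvironment; only the id matters for this sketch.
#[derive(Debug, PartialEq)]
pub struct TurnEnvironment {
    pub environment_id: String,
}

/// Stand-in for the real TurnContext.
pub struct TurnContext {
    /// Assumption: the first element is the primary environment.
    pub environments: Vec<TurnEnvironment>,
}

impl TurnContext {
    pub fn primary_environment(&self) -> Option<&TurnEnvironment> {
        self.environments.first()
    }

    /// `None` selects the primary environment; `Some(id)` selects by id.
    /// Assumption: an unknown id returns `None` rather than silently using primary.
    pub fn select_environment(&self, env_id: Option<&str>) -> Option<&TurnEnvironment> {
        match env_id {
            None => self.primary_environment(),
            Some(id) => self
                .environments
                .iter()
                .find(|env| env.environment_id == id),
        }
    }
}
```

Returning `None` for an unknown id keeps a mistyped `environment_id` from quietly running a command in the wrong place; the caller can turn it into a tool error.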

P3 — schema honesty (REVERSAL from original plan)

Initial implementation added environment_id to native shell / apply_patch schemas. Reverted in commits 0d820d5961 / bbba7c6ac2 / ddeb163d17 / aefa1f7d95 to preserve upstream training compatibility — native tools must keep schemas the model has seen during pretraining. See `fix(core): shell schema honesty …` for rationale.

Replacement: a parallel `*_in_environment` tool family (gated on `env_count >= 2`):

  • `exec_command_in_environment`
  • `apply_patch_in_environment` (JSON only — Lark grammar can't express the field)
  • `list_environments` (read-only catalog)
  • `list_dir_in_environment`
  • `read_file_in_environment` / `write_file_in_environment`
  • `view_image_in_environment`

All file-system tools route through `Environment::get_filesystem()`, no direct `tokio::fs`.
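As a rough illustration of the gate, the sketch below advertises the seven tools only when an environment exists and at least two are configured. The field names `has_environment` and `multi_environment_count` come from the Pa.7 commit message further down; the struct shape and the `advertised_env_tools` helper are hypothetical.

```rust
/// The env-aware tool family named in this PR.
pub const ENV_TOOLS: [&str; 7] = [
    "exec_command_in_environment",
    "apply_patch_in_environment",
    "list_environments",
    "list_dir_in_environment",
    "read_file_in_environment",
    "write_file_in_environment",
    "view_image_in_environment",
];

/// Stand-in for the relevant slice of ToolsConfig.
pub struct ToolsConfig {
    pub has_environment: bool,
    pub multi_environment_count: usize,
}

/// Hypothetical helper: returns the env-aware tools to advertise.
/// With zero or one environment the list is empty, so the tool set
/// the model sees is unchanged.
pub fn advertised_env_tools(cfg: &ToolsConfig) -> &'static [&'static str] {
    if cfg.has_environment && cfg.multi_environment_count >= 2 {
        &ENV_TOOLS
    } else {
        &[]
    }
}
```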

P4 — system prompt block

  • New `AvailableEnvironmentsInstructions` fragment in `core/src/context/`
  • Renders a markdown table (`id | description | default`) plus an inline list of the new tool family so the LLM knows when and how to use them
  • Wired into `session/mod.rs` developer_sections, gated on `environments.len() >= 2`
  • Single-env sessions emit no block (zero prompt change)
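A minimal sketch of the rendering and its gate, assuming a hypothetical `EnvRow` input type and a `yes`/blank marker for the default column (the exact wording and cell format of the real fragment are not shown in this PR, and the inline tool list is omitted here):

```rust
/// Hypothetical row type; the real fragment reads from the session's environments.
pub struct EnvRow {
    pub id: String,
    pub description: Option<String>,
    pub is_default: bool,
}

/// Returns `None` for fewer than two environments, so single-env
/// sessions get no extra prompt text at all.
pub fn render_available_environments(envs: &[EnvRow]) -> Option<String> {
    if envs.len() < 2 {
        return None;
    }
    let mut out = String::from("| id | description | default |\n| --- | --- | --- |\n");
    for env in envs {
        let description = env.description.as_deref().unwrap_or("");
        // Assumption: the default column is marked with "yes" and left blank otherwise.
        let default = if env.is_default { "yes" } else { "" };
        out.push_str(&format!("| {} | {} | {} |\n", env.id, description, default));
    }
    Some(out)
}
```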

Backwards compatibility

  • Native `shell` / `exec_command` / `apply_patch` schemas are byte-identical to upstream (training compat)
  • All new fields are `Option` with `#[serde(default)]`
  • `EnvironmentProvider::default_environment_id` has a default impl returning `None`
  • `Environment::remote_inner` retained as compat shim
  • Single-env users see no behavior change
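The default-impl point can be illustrated with a simplified trait. The real `EnvironmentProvider` signature is not shown in this PR, so the method set and return types below are assumptions, as are the `LegacyProvider` and `ManifestProvider` stand-ins; the sketch only shows why existing implementors keep compiling unchanged.

```rust
/// Simplified stand-in for the real trait.
pub trait EnvironmentProvider {
    fn environment_ids(&self) -> Vec<String>;

    /// Default impl: providers that predate multi-env need no changes.
    fn default_environment_id(&self) -> Option<String> {
        None
    }
}

/// Hypothetical legacy provider: does not override the new method.
pub struct LegacyProvider;

impl EnvironmentProvider for LegacyProvider {
    fn environment_ids(&self) -> Vec<String> {
        vec!["local".to_string()]
    }
}

/// Hypothetical manifest-backed provider: overrides the default.
pub struct ManifestProvider {
    pub ids: Vec<String>,
    pub default_id: String,
}

impl EnvironmentProvider for ManifestProvider {
    fn environment_ids(&self) -> Vec<String> {
        self.ids.clone()
    }

    fn default_environment_id(&self) -> Option<String> {
        Some(self.default_id.clone())
    }
}
```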

Out of scope (known limitations)

  • `apply_patch` Lark/freeform variant has no `environment_id` (grammar limitation)
  • `multi_agents` / `agent_jobs` sub-agent dispatch still resolves to primary env only
  • MCP runtime environment selection is fixed to primary
  • Native `view_image` / `list_dir` etc. still target primary; cross-env access goes through the new family

Test Plan

Local verification (run from `/root/codex-multi-env/codex-rs` on this branch):

  • `cargo check -p codex-exec-server -p codex-core -p codex-tools -p codex-protocol` — clean (one pre-existing dead_code warning on `LazyRemoteExecServerClient::new`)
  • `cargo test -p codex-exec-server -p codex-tools -p codex-protocol` — 152 passed, 0 failed (includes manifest_provider integration, all new `*_in_environment` tool spec tests, required-field validation)
  • `RUST_MIN_STACK=16777216 cargo test -p codex-core --lib` for multi-env-touched modules: `tools::handlers::`, `session::turn_context`, `session::mcp`, `context::available_environments_instructions`, `tools::runtimes::`, `environment_selection`, `unified_exec` — all passing
  • `intercept_apply_patch_routes_by_environment_id` — passes (P3 reversal regression test)
  • `multi-env tool family end-to-end integration` — passes
  • CI to verify: 12 `codex-core --lib` test failures observed locally are all in cwd/git-context-sensitive modules (`turn_diff_tracker`, `config::config_loader_tests`, `git_info_tests::resolve_root_git_project_for_trust_returns_none_outside_repo`, `realtime_context::*`) — none in multi-env paths. Strongly suspected to be artifacts of running tests inside a nested git worktree (`/root/codex-multi-env`); need CI on a clean checkout to confirm.
  • CI: full `cargo test --workspace`
  • CI: clippy + fmt

Notes for reviewers

  • Commit history includes the P3 reverts intentionally: they document why we chose the parallel-tool approach over adding a schema field. Worth preserving rather than squashing.
  • Upstream `origin/main` already has one multi-env commit (`Surface multi-environment choices in environment context openai/codex#20646`); this branch builds on top of it (already in our `main`).

🤖 Generated with Claude Code

mryao and others added 30 commits May 5, 2026 19:55
…wing

Pa.5 mirrors view_image but routes the image read to a non-default env's
ExecutorFileSystem::read_file. Schema requires environment_id + path;
no detail=original knob (re-plumbing the model capability gate would
risk silent divergence — added in a future spec when needed).

Output mirrors view_image: emits ImageView turn item, returns an
InputImage content block carrying the resized data URL with default
detail. Module is dead code until Pa.7 wires registration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pa.7. Registers the seven env-aware tools (exec_command_in_environment,
apply_patch_in_environment, list_environments, list_dir_in_environment,
view_image_in_environment, read_file_in_environment,
write_file_in_environment) so the LLM actually sees them.

Gating: ToolsConfig gains multi_environment_count (default 1). The new
tools are advertised only when has_environment is true and the count is
>= 2. With a single environment, env routing is meaningless and the
native tools cover the surface byte-identically to upstream.

Plumbing: TurnContext supplies the count from its TurnEnvironment list
(turn_context.rs and review.rs). EnvironmentManager gains a public
environments_count() accessor for callers that want to drive the gate
from the registry directly.

Dispatch: ToolHandlerKind grows seven new variants and spec.rs maps
each to its handler. Removes the #![allow(dead_code)] umbrella from
the seven handler modules and the #[allow(unused_imports)] from the
re-exports in tools/src/lib.rs and core/src/tools/handlers/mod.rs;
TOOL_NAME constants get per-item allows since runtime registration
uses string literals.