feat: add openai-compatible executor by davidhonig · Pull Request #315 · microsoft/waza

davidhonig · 2026-06-12T15:30:41Z

Summary

Adds an openai-compatible executor for running Waza evals against OpenAI-compatible Chat Completions APIs such as local LM Studio. The executor supports endpoint normalization, optional per-eval api_key, OPENAI_API_KEY fallback, workspace capture, usage mapping, and skill-context injection. .waza.yaml can now provide defaults.engine: openai-compatible and defaults.endpoint, so eval specs can stay minimal while using a local endpoint.

Related issue

Related to #66

Agent handoff

Scope: Add OpenAI-compatible executor support and project-level endpoint defaults.
Key files changed: internal/execution/openai_compatible.go, cmd/waza/cmd_run.go, internal/models/spec.go, internal/projectconfig/config.go, schemas/eval.schema.json, schemas/config.schema.json, README and site/ docs.
Important decisions: API keys are not stored in .waza.yaml; use OPENAI_API_KEY for auth. model is optional for openai-compatible and defaults to local-model. Skill context is injected using the existing skill-system-message path.
Follow-ups or known gaps: No streaming/tool-call support for OpenAI-compatible endpoints yet.

Type of change

Validation

go test ./...
make lint or golangci-lint run
Docs site checked, if docs changed
Web/dashboard checks run, if web/ changed
Manual validation completed: local LM Studio-compatible path verified with http://127.0.0.1:1234
Not applicable; reason:

Documentation

README updated, if user-facing behavior changed
site/ docs updated, if CLI, YAML, dashboard, or validator behavior changed
Examples updated, if relevant
Not applicable

Risk and rollback

Risk level: Medium
Rollback plan: Revert bb0fd568 to remove the new executor, schema fields, project defaults, and docs. Existing mock and copilot-sdk executor paths should remain unaffected.

Notes for reviewers

Please focus on internal/execution/openai_compatible.go request/response handling, skill-context injection, and .waza.yaml default application in cmd/waza/cmd_run.go. The intended local config is defaults.engine: openai-compatible plus defaults.endpoint: http://127.0.0.1:1234; auth should come from OPENAI_API_KEY.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an openai-compatible execution path so evaluations can run against OpenAI-compatible Chat Completions endpoints (e.g., local LM Studio), and documents/configures the new options.

Changes:

Introduces openai-compatible executor with endpoint normalization and API key handling (OPENAI_API_KEY fallback).
Extends JSON schemas + project defaults to support endpoint and api_key, and updates docs accordingly.
Adds unit/integration-style tests for schema validation, spec loading, engine execution, and waza run behavior.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
site/src/content/docs/reference/waza-yaml.mdx	Documents `defaults.endpoint` in `.waza.yaml`.
site/src/content/docs/reference/schema.mdx	Documents new `openai-compatible` executor and `endpoint`/`api_key`.
site/src/content/docs/reference/cli.mdx	Documents `OPENAI_API_KEY` env var behavior for the new executor.
site/src/content/docs/guides/eval-yaml.mdx	Updates eval YAML guide with `openai-compatible` options and semantics.
schemas/eval.schema.json	Extends eval schema for `openai-compatible`, makes `model` conditional.
schemas/config.schema.json	Extends project config schema with new engine and `defaults.endpoint`.
internal/validation/schema_test.go	Adds schema-validation test cases for `openai-compatible` configs.
internal/projectconfig/config_test.go	Validates new `defaults.endpoint` parsing/merging.
internal/projectconfig/config.go	Adds `Defaults.Endpoint` and merges it.
internal/models/spec_test.go	Adds tests for parsing `endpoint`/`api_key` into the spec model.
internal/models/spec.go	Adds `Config.Endpoint` and `Config.APIKey`.
internal/execution/openai_compatible_test.go	New tests covering endpoint normalization, auth, workspace cleanup, and skill-context injection.
internal/execution/openai_compatible.go	Implements the `OpenAICompatibleEngine`.
internal/execution/copilot.go	Refactors skill-dir selection into reusable helper for new engine.
cmd/waza/cmd_run_test.go	Adds `waza run` tests for `openai-compatible` and `.waza.yaml` defaults.
cmd/waza/cmd_run.go	Applies `.waza.yaml` defaults to eval specs and wires in `openai-compatible` engine creation.
README.md	Adds docs for `openai-compatible` executor and related env vars.

+	if cfg, err := projectconfig.Load(filepath.Dir(specPath)); err == nil && cfg != nil {
+		applyProjectDefaultsToSpec(spec, cfg)
+	}


+		workspaceDir, err = os.MkdirTemp("", "waza-openai-*")
+		if err != nil {
+			return nil, fmt.Errorf("failed to create openai-compatible workspace: %w", err)
+		}
+		if err := setupWorkspaceResources(workspaceDir, req.Resources); err != nil {
+			_ = os.RemoveAll(workspaceDir)
+			return nil, fmt.Errorf("failed to setup openai-compatible workspace resources: %w", err)
+		}
+		e.mu.Lock()
+		e.workspaces = append(e.workspaces, workspaceDir)
+		e.mu.Unlock()
+	}
+	if _, err := ResolveWorkDir(workspaceDir, req.WorkDir); err != nil {
+		return nil, err
+	}


+	case "openai-compatible":
+		engine, err = execution.NewOpenAICompatibleEngine(spec.Config.Endpoint, spec.Config.ModelID, spec.Config.APIKey)
+		if err != nil {
+			return nil, err
+		}


+	defaultedEndpoint := `name: test-eval
+skill: test-skill
+version: "1.0"
+config:
+  trials_per_task: 1
+  timeout_seconds: 60
+  executor: openai-compatible
+metrics:
+  - name: accuracy
+    weight: 1.0
+    threshold: 0.8
+tasks:
+  - "tasks/*.yaml"
+`
+	require.Empty(t, ValidateEvalBytes([]byte(defaultedEndpoint)), "openai-compatible should allow endpoint from .waza.yaml defaults")


feat: add openai-compatible executor

bb0fd56

Copilot AI review requested due to automatic review settings June 12, 2026 15:30

davidhonig requested a review from spboyer as a code owner June 12, 2026 15:30

github-actions Bot enabled auto-merge (squash) June 12, 2026 15:30

Copilot AI reviewed Jun 12, 2026

View reviewed changes

davidhonig mentioned this pull request Jun 12, 2026

Bring Your Own Model — Ollama, OpenAI, Anthropic, OpenCode engines #11

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add openai-compatible executor#315

feat: add openai-compatible executor#315
davidhonig wants to merge 1 commit into
microsoft:mainfrom
davidhonig:dh/add-openai-compatible-engine

davidhonig commented Jun 12, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

davidhonig commented Jun 12, 2026

Summary

Summary

Related issue

Agent handoff

Type of change

Validation

Documentation

Risk and rollback

Notes for reviewers

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants