
Authentication & Opik

NASDE auto-detects the required credentials based on the variant’s agent type.

For Claude Code variants, NASDE checks for auth tokens in this order:

  1. ANTHROPIC_API_KEY environment variable
  2. CLAUDE_CODE_OAUTH_TOKEN environment variable
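
The lookup order above can be sketched in a few lines of Python. This is a minimal illustration, not NASDE's actual code: the function name `resolve_claude_credential` is hypothetical, and treating an empty value as unset is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple

# Order copied from the list above; everything else here is illustrative.
CLAUDE_TOKEN_VARS = ("ANTHROPIC_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN")


def resolve_claude_credential(
    env: Optional[Mapping[str, str]] = None,
) -> Optional[Tuple[str, str]]:
    """Return (variable name, value) for the first variable that is set."""
    env = os.environ if env is None else env
    for name in CLAUDE_TOKEN_VARS:
        value = env.get(name, "").strip()
        if value:  # assumption: an empty value counts as unset
            return name, value
    return None
```

Calling `resolve_claude_credential()` with no arguments reads the real environment; passing a plain dict makes the order easy to check without touching your shell.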

On macOS, you can extract the OAuth token from your Claude Code keychain entry (created when you log in via the claude CLI):

Terminal window
source scripts/export_oauth_token.sh
# ✓ CLAUDE_CODE_OAUTH_TOKEN exported (sk-ant-oat01-...)

This lets you use your Claude Pro/Max subscription instead of an API key.

Codex variants support two authentication methods:

Option 1: ChatGPT subscription (OAuth) — uses your ChatGPT Plus/Pro/Business plan credits, not API billing.

Terminal window
codex login # authenticate via ChatGPT (one-time)
source scripts/export_codex_oauth_token.sh # validate tokens are present
uv run nasde run --variant codex-vanilla -C my-benchmark

When no API key is set, NASDE detects ~/.codex/auth.json (created by codex login) and opts into uploading it to the sandbox by setting CODEX_FORCE_AUTH_JSON=true; Harbor performs the actual upload. No env vars are needed.

Option 2: API key — billed per-token through your OpenAI Platform account.

Terminal window
export CODEX_API_KEY=sk-... # preferred
# or: export OPENAI_API_KEY=sk-...

An API key always takes priority over OAuth when both are present.
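
The precedence for Codex variants can be sketched as follows. This is an illustration under stated assumptions, not NASDE's implementation: `pick_codex_auth` is a hypothetical helper, and checking CODEX_API_KEY before OPENAI_API_KEY is inferred from the "preferred" note above.

```python
from pathlib import Path
from typing import Mapping, Optional

# Variable names and the auth.json location come from the text above.
# Assumption: CODEX_API_KEY is checked first because it is marked "preferred".
CODEX_KEY_VARS = ("CODEX_API_KEY", "OPENAI_API_KEY")


def pick_codex_auth(env: Mapping[str, str], home: Path) -> Optional[str]:
    """Return a label for the credential source that would win, or None."""
    for name in CODEX_KEY_VARS:
        if env.get(name):
            return f"api-key ({name})"  # an API key beats OAuth
    if (home / ".codex" / "auth.json").is_file():
        return "oauth (~/.codex/auth.json)"
    return None
```

For example, `pick_codex_auth(os.environ, Path.home())` (after `import os`) shows which billing path a run on your machine would likely use, which is worth checking before a long benchmark if you intend to spend subscription credits rather than API credits.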

Gemini CLI variants support three authentication methods:

Option 1: API key (Google AI Studio) — billed per-token through your Google AI Studio account.

Terminal window
export GEMINI_API_KEY=your-key

Option 2: Google Cloud / Vertex AI — uses your Google Cloud project billing. Set either an API key or a service-account credentials file:

Terminal window
export GOOGLE_API_KEY=your-key
# or, for a service account:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

These are the env vars NASDE checks for the API-key path (alongside GEMINI_API_KEY).

Option 3: OAuth (Google account) — uses your Gemini subscription credits.

Terminal window
gemini login # authenticate via Google account (one-time)
source scripts/export_gemini_oauth_token.sh # validate tokens are present
uv run nasde run --variant gemini-baseline -C my-benchmark

NASDE auto-detects ~/.gemini/oauth_creds.json and injects the credentials into the sandbox. No env vars needed.

API key env vars (GEMINI_API_KEY, GOOGLE_API_KEY, GOOGLE_APPLICATION_CREDENTIALS) always take priority over OAuth when present.
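
Because any of the three env vars overrides OAuth, a stale export can silently switch a run from subscription credits to per-token billing. The sketch below lists every credential source it finds alongside the mode that would apply. `gemini_auth_report` is a hypothetical helper; the text above does not say how NASDE orders the three env vars among themselves, so the sketch does not rank them.

```python
from pathlib import Path
from typing import List, Mapping, Tuple

# Names taken from the text above.
GEMINI_KEY_VARS = ("GEMINI_API_KEY", "GOOGLE_API_KEY", "GOOGLE_APPLICATION_CREDENTIALS")
OAUTH_LABEL = "~/.gemini/oauth_creds.json"


def gemini_auth_report(env: Mapping[str, str], home: Path) -> Tuple[str, List[str]]:
    """Return (mode, sources found); mode is 'api-key', 'oauth', or 'none'."""
    key_vars = [name for name in GEMINI_KEY_VARS if env.get(name)]
    has_oauth = (home / ".gemini" / "oauth_creds.json").is_file()
    found = key_vars + ([OAUTH_LABEL] if has_oauth else [])
    if key_vars:
        return "api-key", found  # env vars take priority over OAuth
    if has_oauth:
        return "oauth", found
    return "none", found
```

If the report shows mode "api-key" while you expected OAuth, unset the listed variables before running.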

For Opik tracing, set credentials in a .env file (in the project directory or parent directory):

OPIK_API_KEY=...
OPIK_WORKSPACE=...

The Opik project name is automatically set to the benchmark name (from nasde.toml [project] name).
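
Before a traced run, it can help to confirm that both Opik settings are actually present in a .env file. The following is a rough sketch only: the helper names are hypothetical, the search checks the project directory and then its immediate parent (NASDE's real lookup may differ), and the parser handles only simple KEY=VALUE lines.

```python
from pathlib import Path
from typing import Dict, List, Optional

REQUIRED = ("OPIK_API_KEY", "OPIK_WORKSPACE")


def find_env_file(project_dir: Path) -> Optional[Path]:
    """Look for .env in project_dir, then in its immediate parent (assumption)."""
    for folder in (project_dir, project_dir.parent):
        candidate = folder / ".env"
        if candidate.is_file():
            return candidate
    return None


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_opik_settings(project_dir: Path) -> List[str]:
    """Return the required Opik settings that are absent or empty."""
    env_file = find_env_file(project_dir)
    values = read_env_file(env_file) if env_file else {}
    return [name for name in REQUIRED if not values.get(name)]
```

An empty list from `missing_opik_settings(Path("my-benchmark"))` means both settings were found.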

After a --with-opik run, confirm the feedback scores landed:

import json
import urllib.request
# Replace <PROJECT>, <OPIK_API_KEY>, and <WORKSPACE> with your own values.
req = urllib.request.Request(
    "https://www.comet.com/opik/api/v1/private/traces?project_name=<PROJECT>&limit=1",
    headers={
        "authorization": "<OPIK_API_KEY>",
        "Comet-Workspace": "<WORKSPACE>",
    },
)
resp = json.loads(urllib.request.urlopen(req).read())
# Feedback scores of the single trace returned (limit=1); may be absent or null.
scores = resp["content"][0].get("feedback_scores") or []
for s in sorted(scores, key=lambda x: x["name"]):
    print(f"  {s['name']}: {s['value']}")

Expected feedback scores after a full run with --with-opik:

  • arch_<dimension> (e.g. arch_domain_modeling) — normalized 0.0-1.0, plus arch_<dimension>_std
  • arch_total — overall architecture score, plus arch_total_std
  • eval_n — how many judge evaluations the mean is over
  • reward — Harbor rough-test result (0.0 or 1.0)
  • duration_sec — trial duration
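
The list above can be turned into an automated completeness check. This is a sketch, assuming each feedback score is a dict with "name" and "value" keys, as in the verification snippet; `check_feedback_scores` is a hypothetical helper, not part of NASDE or Opik.

```python
from typing import Dict, List

# Score names that do not depend on the benchmark's dimensions.
FIXED_NAMES = ("arch_total", "arch_total_std", "eval_n", "reward", "duration_sec")


def check_feedback_scores(scores: List[dict]) -> List[str]:
    """Return a list of problems; an empty list means the scores look complete."""
    by_name: Dict[str, float] = {s["name"]: s["value"] for s in scores}
    problems = [f"missing {name}" for name in FIXED_NAMES if name not in by_name]

    # Per-dimension scores: arch_<dimension>, excluding arch_total and *_std.
    dims = [
        n for n in by_name
        if n.startswith("arch_") and not n.endswith("_std") and n != "arch_total"
    ]
    if not dims:
        problems.append("no arch_<dimension> scores")
    for name in dims:
        if f"{name}_std" not in by_name:
            problems.append(f"missing {name}_std")
        if not 0.0 <= by_name[name] <= 1.0:
            problems.append(f"{name} outside 0.0-1.0")

    # reward is documented as 0.0 or 1.0; a missing reward is reported above.
    if by_name.get("reward") not in (None, 0.0, 1.0):
        problems.append("reward is not 0.0 or 1.0")
    return problems
```

Passing the `scores` list from the verification snippet to `check_feedback_scores` gives a quick pass/fail signal that is easier to script than reading the printed values by eye.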