adversarial-review

Adversarial AI code/plan review. Codex reviews, Claude fixes, iterative loop until approved. Auto-detects plan/code/code-vs-plan mode.
Adversarial Code Review
Platform: Claude Code only. This skill orchestrates Claude ↔ Codex interaction, where Claude is the executor and Codex is the external reviewer. Running this skill from Codex CLI itself creates a recursive loop — Codex would try to launch itself. If you are Codex — do NOT invoke this skill; perform the review directly.
Sends current work for adversarial review through an external AI model (OpenAI Codex by default). Auto-detects what to review: plan or code. Claude fixes issues based on reviewer feedback and resubmits until approved. Maximum 5 rounds.
When to invoke
- `/adversarial-review` — auto-detect what to review
- `/adversarial-review plan` — force plan review
- `/adversarial-review code` — force code review
- `/adversarial-review <file-path>` — review a specific file (argument contains `/` or `.`)
- Override reasoning: `/adversarial-review xhigh` or `/adversarial-review medium` (one of: `medium`, `high`, `xhigh`)
- Override model: `/adversarial-review model:gpt-5.3-codex` (argument with `model:` prefix)
Instructions
Placeholders:
`${REVIEW_ID}`, `${ATTEMPT_ID}`, `${CODEX_SESSION_ID}`, `${REPO_ROOT}`, and `${BASE_BRANCH}` in the steps below are template placeholders, NOT shell variables. Substitute literal values directly into each tool call. In particular:
- `${REPO_ROOT}` is ALWAYS an absolute path captured at Step 2; never replace it with `$(pwd)`.
- `${REVIEW_ID}` is stable for the entire review (used in file paths).
- `${ATTEMPT_ID}` is a fresh 6-digit random integer generated per launch — a new value for the initial exec, for any retry of that exec, for every resume in Step 7, and for any fresh-exec fallback. The combined marker `${REVIEW_ID}-${ATTEMPT_ID}` is embedded in the prompt (HTML comment) so the filesystem session-id fallback identifies exactly THIS launch's rollout. Do NOT reuse a prior launch's ATTEMPT_ID — that would make multiple rollouts match and reintroduce silent session drift.
Step 1: Determine review mode
Determine what to review. Check in priority order:
1. Explicit argument (plan, code, file path) → use it.
   - For `plan` → skip all git checks, proceed to step 2 (REVIEW_ID only).
2. Claude Code Plan Mode — if context contains the system message "Plan mode is active" → mode = plan, skip git. In Plan Mode code is not edited, so code/code-vs-plan are impossible.
3. Auto-detect (no explicit argument, not in Plan Mode):
- Check for code changes (any non-empty output means changes exist):
  - `git diff --name-only` — unstaged
  - `git diff --cached --name-only` — staged
  - `git diff --name-only ${BASE_BRANCH}...HEAD` — branch commits
- Check if a plan exists in the current conversation context (from plan mode, tasks, or discussion).
| Code changes? | Plan in context? | Mode |
|---|---|---|
| No | Yes | plan — review the plan |
| Yes | Yes | code-vs-plan — review implementation against plan |
| Yes | No | code — review code changes |
| No | No | Ask the user what to review |
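The decision table above can be sketched as follows (a minimal illustration; whether a plan exists in context is a judgment the agent makes, modeled here as a boolean, and the function names are mine):

```python
import subprocess

def has_output(*cmd: str) -> bool:
    # Any non-empty output means changes exist.
    res = subprocess.run(cmd, capture_output=True, text=True)
    return res.returncode == 0 and res.stdout.strip() != ""

def decide(code_changes: bool, plan_in_context: bool) -> str:
    # Mirrors the mode table above.
    if code_changes and plan_in_context:
        return "code-vs-plan"
    if code_changes:
        return "code"
    if plan_in_context:
        return "plan"
    return "ask-user"

def detect_mode(base_branch: str, plan_in_context: bool) -> str:
    code_changes = (
        has_output("git", "diff", "--name-only")                              # unstaged
        or has_output("git", "diff", "--cached", "--name-only")               # staged
        or has_output("git", "diff", "--name-only", f"{base_branch}...HEAD")  # branch commits
    )
    return decide(code_changes, plan_in_context)
```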
Step 2: Generate Session ID, capture REPO_ROOT, determine base branch
REVIEW_ID: generate yourself, format {unix_timestamp}-{random_8digit_number}.
Example: 1711872000-48217593. Do NOT use bash — substitute the value directly into commands in the following steps. 8-digit random makes collisions negligible (1 in 10^8 per same-second invocation).
Capture REPO_ROOT:
git rev-parse --show-toplevel
- Exit 0, non-empty output → absolute path. Save literally as `REPO_ROOT` (a template placeholder — substitute verbatim into codex commands; do NOT use `$(pwd)` anywhere).
- Exit 128 (bare repo, or not in a work tree) → tell the user: `Cannot run adversarial review — current directory is not inside a git working tree.` Abort the skill.
- Path contains single quote, double quote, `$`, backtick, or newline → tell the user: `REPO_ROOT path contains shell-special characters; cannot safely construct codex commands.` Abort.
Submodule warning: after capturing REPO_ROOT, run:
git rev-parse --show-superproject-working-tree
If this returns non-empty, the user is inside a git submodule. Tell the user: You are inside a submodule. The review will be scoped to this submodule (${REPO_ROOT}), not the parent repo. If you meant to review the parent, invoke from there. Proceed — this is a warning, not an abort.
Determining base branch (only for code and code-vs-plan modes):
For plan mode — skip base branch detection, proceed to step 3.
For other modes, determine the repository's base branch:
git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's|refs/remotes/origin/||'
If the command returns empty (remote HEAD not configured), use fallback:
git rev-parse --verify --quiet main >/dev/null && echo main || echo master
Save the result as BASE_BRANCH — used in git diff ${BASE_BRANCH}...HEAD below.
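The two-stage detection can be sketched in Python (illustrative; the skill itself runs the raw git commands above, and the helper names are mine):

```python
import subprocess

def git_out(*cmd: str) -> str:
    res = subprocess.run(cmd, capture_output=True, text=True)
    return res.stdout.strip() if res.returncode == 0 else ""

def strip_remote_head(ref: str) -> str:
    # "refs/remotes/origin/main" -> "main" (the sed step above)
    return ref.removeprefix("refs/remotes/origin/")

def detect_base_branch() -> str:
    head = git_out("git", "symbolic-ref", "refs/remotes/origin/HEAD")
    if head:
        return strip_remote_head(head)
    # Fallback: prefer a local "main", else assume "master"
    if git_out("git", "rev-parse", "--verify", "--quiet", "main"):
        return "main"
    return "master"
```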
Step 3: Prepare review material
Plan review:
- If the plan already exists as a file (in `project/`, plan file from Plan Mode, memory, or somewhere in the repo) — use the path directly. Do NOT copy. In Claude Code Plan Mode the plan is always a file.
- If the plan is only in the conversation context (outside Plan Mode) — write via Write tool to `/tmp/codex-plan-${REVIEW_ID}.md`.
- Always print the plan file path for the user so they can open it in their IDE: `Plan for review: <file-path>`
Code review:
Collect the list of changed files:
- `git diff --name-only` — unstaged changes
- `git diff --cached --name-only` — staged changes
Merge unstaged + staged (unique paths). If both are empty:
- `git diff --name-only ${BASE_BRANCH}...HEAD` — branch commits (fallback)
Branch diff is used ONLY when there are no local changes — otherwise context bloats.
For branch diff, include the command git diff ${BASE_BRANCH}...HEAD (full diff) in the prompt.
The reviewer has access to the repo and will read full diffs and files on its own. In the prompt (step 4), pass the file list and which git diff commands to run.
Many files (> 50): if the combined list exceeds 50 paths, pass only git commands without the file list — the reviewer will figure it out.
If all sources are empty — no changes to review, inform the user.
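The collection logic above can be sketched as (a minimal sketch; function names are mine, not part of the skill):

```python
import subprocess

def name_only(*cmd: str) -> list[str]:
    res = subprocess.run(cmd, capture_output=True, text=True)
    return [p for p in res.stdout.splitlines() if p.strip()] if res.returncode == 0 else []

def merge_unique(unstaged: list[str], staged: list[str]) -> list[str]:
    # Merge unstaged + staged into a unique, stable-ordered path list.
    return sorted(set(unstaged) | set(staged))

def collect_changed_files(base_branch: str) -> tuple[list[str], str]:
    local = merge_unique(
        name_only("git", "diff", "--name-only"),
        name_only("git", "diff", "--cached", "--name-only"),
    )
    if local:
        return local, "local"
    # Branch diff is used ONLY when there are no local changes.
    branch = name_only("git", "diff", "--name-only", f"{base_branch}...HEAD")
    return (branch, "branch") if branch else ([], "empty")
```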
Code-vs-plan review: prepare the plan path AND collect the list of changed files (as above).
Step 4: Build the prompt body, dispatch the runner subagent
Main thread composes the prompt BODY (without the session marker — the subagent adds it). Select the right template from below based on review mode.
Prompt body for plan review:
<role>
You are a senior adversarial reviewer of implementation plans.
Your job is to break confidence in the plan, not to validate it.
</role>
<operating_stance>
Default to skepticism. Assume the plan has gaps until the evidence says otherwise.
Do not give credit for good intent or likely follow-up work.
If something only works on the happy path, treat that as a real weakness.
</operating_stance>
<task>
Review the implementation plan in <plan-path>.
</task>
<attack_surface>
Check each area. Skip if not applicable:
- Feasibility — will this approach actually work given the codebase and constraints?
- Missing steps — what is forgotten or assumed but not stated?
- Risk areas — what could go wrong during implementation? Data loss? Downtime?
- Sequencing — are steps in the right order? Are there hidden dependencies?
- Alternatives — is there a simpler or more robust approach?
- Rollback — can this be safely reverted if it fails halfway?
- Security — auth, data exposure, injection, unsafe operations
</attack_surface>
<finding_bar>
Each finding MUST answer:
1. What can go wrong? (concrete scenario, not hypothetical)
2. Why is this plan vulnerable? (cite specific section)
3. Impact — what breaks and how badly?
4. Recommendation — specific change to the plan
</finding_bar>
<scope_exclusions>
DO NOT comment on: formatting, wording style, speculative issues without concrete trigger scenario.
</scope_exclusions>
<calibration>
Prefer one strong finding over several weak ones.
If the plan is solid, say so clearly — false positives erode trust.
</calibration>
<output_format>
Use markdown headers for sections: Summary, Findings, Verdict.
Summary: one paragraph — what this plan does and your overall assessment.
Findings: for each finding, use a sub-header with [severity: critical|high|medium] and title.
Include these fields per finding:
- **Section:** which part of the plan
- **What can go wrong:** ...
- **Why vulnerable:** ...
- **Impact:** ...
- **Recommendation:** ...
If no findings: "No actionable findings."
Verdict rules: approve if no findings or all low severity; revise if any high/critical.
Choose exactly one. The LAST line of your response must be one of:
VERDICT: APPROVED
VERDICT: REVISE
</output_format>
Prompt body for code review (≤ 50 files):
<role>
You are a senior adversarial code reviewer.
Your job is to break confidence in the change, not to validate it.
</role>
<operating_stance>
Default to skepticism. Assume the change can fail in subtle, high-cost,
or user-visible ways until the evidence says otherwise.
Do not give credit for good intent, partial fixes, or likely follow-up work.
If something only works on the happy path, treat that as a real weakness.
</operating_stance>
<task>
Review the code changes in this repo. Changed files:
<file list from --name-only>
Changes include: <unstaged changes / staged changes / unstaged + staged changes / branch changes vs ${BASE_BRANCH}>.
Run <git diff commands> to see the full diffs.
</task>
<attack_surface>
Check each area. Skip if not applicable to this change:
- Auth & permissions: bypasses, privilege escalation, missing checks
- Data integrity: loss, corruption, partial writes, constraint violations
- Race conditions: TOCTOU, concurrent access, deadlocks
- Rollback safety: can this change be safely reverted?
- Schema drift: migrations, backward compatibility, data format changes
- Error handling: swallowed errors, missing retries, cascading failures
- Observability: will operators know when this breaks?
</attack_surface>
<finding_bar>
Each finding MUST answer:
1. What can go wrong? (concrete scenario, not hypothetical)
2. Why is this code vulnerable? (cite specific file and lines)
3. Impact — what breaks and how badly? (data loss > downtime > degraded UX)
4. Recommendation — specific fix with code reference
</finding_bar>
<scope_exclusions>
DO NOT comment on: code style, formatting, naming conventions,
speculative issues without concrete trigger scenario,
"nice to have" improvements unrelated to correctness or safety.
</scope_exclusions>
<calibration>
Prefer one strong finding over several weak ones.
Severity: critical (data loss/security) > high (bug in prod) > medium (edge case).
If the change is solid, say so clearly — false positives erode trust.
</calibration>
<output_format>
Use markdown headers for sections: Summary, Findings, Verdict.
Summary: one paragraph — what this change does and your overall assessment.
Findings: for each finding, use a sub-header with [severity: critical|high|medium] and title.
Include these fields per finding:
- **File:** path/to/file.ext lines N-M
- **What can go wrong:** ...
- **Why vulnerable:** ...
- **Impact:** ...
- **Recommendation:** ...
If no findings: "No actionable findings."
Verdict rules: approve if no findings or all low severity; revise if any high/critical.
Choose exactly one. The LAST line of your response must be one of:
VERDICT: APPROVED
VERDICT: REVISE
</output_format>
Prompt body for code review (> 50 files):
Same as ≤ 50 files above, but the <task> section is replaced with:
<task>
Review the code changes in this repo.
Changes include: <unstaged changes / staged changes / ...>.
Run <git diff commands> to see changed files and full diffs.
</task>
Prompt body for code-vs-plan review:
Same as code review (≤ 50 or > 50 variant depending on file count), but:
- `<task>` is extended to reference the plan: Review the code changes in this repo against the implementation plan in <plan-path>.
- `<attack_surface>` appends these three items:
  - Completeness: does the implementation cover all plan steps?
  - Deviations: where does the code differ from the plan? Are deviations justified?
  - Missing: what from the plan is not yet implemented?
Substitute template placeholders BEFORE writing to disk:
The inlined prompt bodies above contain template placeholders that main must resolve with real captured values before the Write. Placeholders per mode:
| Placeholder | Value source | Applies to |
|---|---|---|
| `${BASE_BRANCH}` | captured at Step 2 (code & code-vs-plan only) | code, code-vs-plan |
| `<plan-path>` | captured at Step 3 | plan, code-vs-plan |
| `<file list from --name-only>` | result of `git diff --name-only` + `git diff --cached --name-only` (or branch diff) from Step 3 | code, code-vs-plan (≤50 files only) |
| `<unstaged changes / staged changes / ...>` | human-readable description derived from which diff commands had content | code, code-vs-plan |
| `<git diff commands>` | the exact commands main determined at Step 3 (e.g. `git diff`, `git diff --cached`, `git diff ${BASE_BRANCH}...HEAD`) | code, code-vs-plan |
Substitute ${BASE_BRANCH} first (it appears nested inside <unstaged changes / staged changes / ...>), then compute the outer human-readable description based on which diffs have content. Main writes the substituted string to the Write tool — no template placeholders should remain in the body file sent to the runner.
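The ordering rule can be sketched as (a simplified illustration that covers four of the placeholders; the human-readable changes description is computed separately, and the function name is mine):

```python
def substitute(body: str, values: dict[str, str]) -> str:
    # ${BASE_BRANCH} goes first: it can appear nested inside values
    # computed later (e.g. the git diff commands and changes description).
    ordered = [
        "${BASE_BRANCH}",
        "<plan-path>",
        "<file list from --name-only>",
        "<git diff commands>",
    ]
    for placeholder in ordered:
        if placeholder in values:
            body = body.replace(placeholder, values[placeholder])
    # Contract: no template placeholders remain in the body sent to the runner.
    for placeholder in ordered:
        assert placeholder not in body, f"unresolved placeholder: {placeholder}"
    return body
```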
Capture user overrides for CODEX_MODEL / CODEX_REASONING at Step 1:
The skill supports overrides like /adversarial-review xhigh, /adversarial-review medium, /adversarial-review model:gpt-5.3-codex. At Step 1, capture:
- `CODEX_MODEL` — default `gpt-5.4`. Overridden by any argument matching `^model:(.+)$`; use the capture group.
- `CODEX_REASONING` — default `high`. Overridden by any argument exactly matching `low`, `medium`, `high`, or `xhigh`.
These are passed into the runner YAML input block below.
Write the prompt body to disk via Write tool:
Write /tmp/codex-body-${REVIEW_ID}.md containing the substituted body text (no session marker — the runner adds it).
Plan Mode note: Writing to `/tmp` via Write tool may trigger a permission prompt or exit Plan Mode. This is a known Claude Code limitation. Additionally, dispatching a subagent under Plan Mode may inherit the restriction — empirical behavior documented in DESIGN.md §12.7.
Resolve the runner spec path:
The runner spec lives at references/runner.md within the skill's install directory. Main cannot reliably introspect Claude Code's skill-invocation header from inside its own context (there is no tool for reading one's own system prompt — any attempt would be a hallucination risk). Therefore the discovery uses only concrete filesystem checks, in this priority order:
- User-scoped install (primary): check `~/.claude/skills/adversarial-review/references/runner.md`:
  `ls ~/.claude/skills/adversarial-review/references/runner.md 2>/dev/null`
  If exit 0, set RUNNER_SPEC_PATH to the expanded absolute path and proceed.
- Plugin-marketplace install (secondary): Claude Code's plugin system installs skills at paths like `~/.claude/plugins/cache/<marketplace>/<plugin>/<version>/skills/adversarial-review/`. Glob to find it:
  `ls ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md 2>/dev/null | head -1`
  If the Glob returns one or more paths, take the first and set RUNNER_SPEC_PATH.
- Dev checkout (tertiary): if neither above, try `$(git rev-parse --show-toplevel)/references/runner.md`:
  `REPO=$(git rev-parse --show-toplevel 2>/dev/null) && ls "$REPO/references/runner.md" 2>/dev/null`
- Abort: if no path yields a readable file, tell the user: `Could not locate references/runner.md. Expected locations: (1) ~/.claude/skills/adversarial-review/references/runner.md, (2) ~/.claude/plugins/cache/*/*/*/skills/adversarial-review/references/runner.md, (3) $(git rev-parse --show-toplevel)/references/runner.md. Re-install the skill.` Abort the skill.
Save the resolved absolute path as RUNNER_SPEC_PATH. Do NOT attempt to extract the path from any "Base directory for this skill:" line in the conversation — that line is a system injection Claude cannot reliably read from inside its own context.
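The three-tier filesystem discovery can be sketched as (illustrative; the skill uses `ls` and Glob directly, and the function name is mine):

```python
import subprocess
from pathlib import Path

def resolve_runner_spec():
    # (1) user-scoped install
    user = Path.home() / ".claude/skills/adversarial-review/references/runner.md"
    if user.is_file():
        return user
    # (2) plugin-marketplace install: first glob hit
    cache = Path.home() / ".claude/plugins/cache"
    hits = sorted(cache.glob("*/*/*/skills/adversarial-review/references/runner.md"))
    if hits:
        return hits[0]
    # (3) dev checkout: repo root of the current working tree
    res = subprocess.run(["git", "rev-parse", "--show-toplevel"],
                         capture_output=True, text=True)
    if res.returncode == 0:
        dev = Path(res.stdout.strip()) / "references/runner.md"
        if dev.is_file():
            return dev
    return None  # (4) caller aborts with the installation error
```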
Dispatch the runner subagent via Agent tool:
Do NOT Read ${RUNNER_SPEC_PATH} in main. Pass the path to the subagent; it reads the spec itself. This keeps runner.md (~12K) out of main's context — both the Read result AND the Agent prompt duplication. Saves ~12K per round × up to 5 rounds per review.
Invoke the Agent tool with:
- `subagent_type: "general-purpose"`
- `model: "sonnet"`
- `description: "Adversarial-review runner, round N"` (N is the current round number)
- `prompt:` a short bootstrap instruction + YAML input block (no inlined runner.md):
Read your full instruction spec at ${RUNNER_SPEC_PATH} and follow the steps there using this input:
---
REVIEW_ID: 1711872000-48217593
REPO_ROOT: /home/dementev/sources/myproject
OPERATION: initial
CODEX_MODEL: gpt-5.4
CODEX_REASONING: high
PROMPT_BODY_PATH: /tmp/codex-body-1711872000-48217593.md
RESULT_PATH: /tmp/codex-runner-result-1711872000-48217593.json
---
Substitute the actual resolved ${RUNNER_SPEC_PATH} (absolute path) and real values for every other placeholder. RESULT_PATH always follows the pattern /tmp/codex-runner-result-${REVIEW_ID}.json.
Do NOT run the Agent tool call in background. Wait for the subagent to return. (Runner's own codex exec is also synchronous per runner Step R3.)
Parse the subagent's response — two-channel protocol:
Apply the regex RUNNER_RESULT_AT:\s+(\S+) (UNANCHORED — matches anywhere in the Agent tool's result text, tolerant of markdown fences and preamble). Take the first match's capture group as the result-file path.
If the regex finds NO match in the subagent's response, fall back to a Glob for the deterministic path /tmp/codex-runner-result-${REVIEW_ID}.json — REVIEW_ID is already known to main. If Glob also returns nothing, treat as infra_error with errors: "runner did not write result file at deterministic path and did not emit RUNNER_RESULT_AT line" and abort.
Read the file at the resolved path. Parse as JSON. Extract result, verdict, review_file, codex_session_id, errors, user_warning, archived_stdout, archived_stderr.
If user_warning is non-null, surface it as a SEPARATE short user-visible message BEFORE the Step 5 verbatim-review message. Format:
⚠ <user_warning contents>
Emit this on its own turn — do NOT concatenate into the Step 5 ## Adversarial Review — Round N header message (that message's body must remain the review's verbatim content, nothing else). Emit the warning FIRST, then the Step 5 message. This preserves both the pre-refactor §2.4.4 "no-op refresh" diagnostic AND the Step 5 verbatim-display contract.
Dispatch based on result:
| result value | Main thread action |
|---|---|
| success | Save codex_session_id (keep prior if null per §2.4.4). Surface user_warning if set. Proceed to Step 5. |
| timeout | TERMINAL — do NOT re-dispatch. Runner already attempted twice internally (R4.1 + R5 retry = 2 × 10min). Tell user: "Reviewer timed out after two attempts (20 minutes total)." Abort the skill. User can re-invoke /adversarial-review to start a fresh review. |
| launch_failure | TERMINAL — do NOT re-dispatch. The runner already retried once internally (Step R5). Show errors to user, abort the skill. This keeps the total-attempts-per-round invariant at 2 (matches pre-refactor: 1 initial + 1 retry). |
| infra_error | Show errors to user (infrastructure: /tmp not writable, stderr file missing, RUNNER_RESULT_AT line absent). Abort. |
| input_error | Bug in orchestration. Show errors to user. Abort. |
Round-level attempt invariant: exactly ONE runner dispatch per round. Every failure result is terminal at main. The runner owns the full retry budget (≤2 attempts per dispatch, internal) regardless of failure type. Total codex invocations per round ≤ 2.
CRITICAL — main thread does NOT read stdout/stderr/JSONL/rollout files BY CONTENT. Those live and die inside the subagent. Main reads: the runner result JSON at `RESULT_PATH`, the review file at `review_file`, and nothing else from `/tmp/codex-*`. Archival `mv` (on resume failure) is done by the runner, not main — main never references `/tmp/codex-stdout-*` or `/tmp/codex-stderr-*` in any Bash argv.
Step 5: Read the review, show it, then check the verdict
1. Read the review file. Read /tmp/codex-review-${REVIEW_ID}.md.
2. Semantic sanity checks. The file MUST pass all of these:
   - Exists and is non-empty.
   - Contains a line exactly matching `^VERDICT: (APPROVED|REVISE)$`.
   - If verdict is `REVISE` → the file must also contain at least one line matching `\[severity:\s*(critical|high|medium)` (i.e. at least one structured finding).
If any check fails → this is a launch failure (model produced no actionable review):
- Show the user the `/tmp/codex-stderr-${REVIEW_ID}.txt` contents (if any) AND the raw review file.
- Offer ONE retry of Step 4 (re-launch the same round). Retry does NOT consume the round counter — the round counter advances only when a valid review is produced.
- Track the retry counter in your current round's reasoning only. The counter resets at the start of every new round.
- After a failed retry → hard abort the skill. Do NOT route to the Step 7 fresh-exec fallback (that path is for resume failures in rounds 2+, and depends on prior-round content).
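The sanity checks above can be sketched as (a minimal sketch; the function name is mine):

```python
import re

def sanity_check(review_text: str):
    """Return (ok, verdict) per the three semantic checks above."""
    if not review_text.strip():
        return False, None                          # missing or empty file
    m = re.search(r"^VERDICT: (APPROVED|REVISE)$", review_text, flags=re.MULTILINE)
    if not m:
        return False, None                          # no exact verdict line
    verdict = m.group(1)
    if verdict == "REVISE" and not re.search(
        r"\[severity:\s*(critical|high|medium)", review_text
    ):
        return False, verdict                       # REVISE without a structured finding
    return True, verdict
```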
3. Show the review to the user. This is mandatory and blocking.
Your next user-visible message that is NOT a one-line `⚠ <warning>` diagnostic must begin with the header below, followed by the file contents verbatim. Not "I've received the review", not "The reviewer said:", not a summary — the literal file content.
Do NOT wrap the review in a code fence (the review is already markdown, and an outer fence would break on inner fences).
Do NOT call any Edit, Write, or fix-applying tool in the same message as the review. The review output is a standalone user-visible message.
Message format:
## Adversarial Review — Round N (mode: <plan|code|code-vs-plan>, model: gpt-5.4)
<verbatim contents of /tmp/codex-review-${REVIEW_ID}.md>
4. Only AFTER the review message has been sent — parse the VERDICT line and dispatch:
- `VERDICT: APPROVED` → Step 8 (Done).
- `VERDICT: REVISE` → Step 6 (Fixes).
- Maximum rounds reached (5 rounds) → Step 8 with the max-rounds note.
Step 6: Apply fixes
Precondition gate (check first). Before calling any Edit, Write, or other fix-applying tool: confirm that you have already sent a user-visible message in THIS round whose body contains the verbatim review text (short `⚠ <user_warning>` diagnostic messages do NOT count). If you have not — STOP. Go back to Step 5 and send the review message now. This is the same rule that protects the "user sees the review" contract; a literal reader may otherwise slip past it.
Based on the reviewer's findings:
For plan review: fix the plan — address each finding. Update the plan file (or temp file). Show the user:
### Fixes (Round N)
- [What was changed and why, one item per finding]
For code review: fix the code directly — edit files, run tests if applicable. Show the user:
### Fixes (Round N)
- [What was fixed and why, one item per finding]
Skip a fix if it contradicts the user's explicit requirements — note this for the user.
Step 7: Resubmit to Codex (Rounds 2-5)
Resume is the primary path. Saves tokens and preserves session context. A fresh codex exec without resume is an emergency fallback when resume itself fails.
Step 7.1: Write the resume prompt body to disk.
Write /tmp/codex-resume-body-${REVIEW_ID}.md containing:
I've revised based on your feedback.
Here's what I changed:
[List of fixes from Step 6]
Re-review with the same adversarial stance. Focus on:
1. Whether my fixes actually resolve the reported issues
2. Any NEW issues introduced by the fixes
End with VERDICT: APPROVED or VERDICT: REVISE
Substitute the fixes list from Step 6 (one bullet per finding addressed). Do NOT include the session marker — the subagent adds it.
Step 7.2: Dispatch the runner subagent for resume.
Same Agent tool invocation as Step 4 (bootstrap instruction with ${RUNNER_SPEC_PATH} + YAML input block; subagent Reads the spec itself). Reuse the RUNNER_SPEC_PATH resolved in Step 4 (do not re-resolve). Input block:
---
REVIEW_ID: <same as initial round>
REPO_ROOT: <same>
OPERATION: resume
CODEX_MODEL: <same>
CODEX_REASONING: <same>
PROMPT_BODY_PATH: /tmp/codex-resume-body-<REVIEW_ID>.md
RESULT_PATH: /tmp/codex-runner-result-<REVIEW_ID>.json
CODEX_SESSION_ID: <uuid from previous round's runner result>
---
Step 7.3: Parse the two-channel result.
Extract RUNNER_RESULT_AT: line (same tolerant regex + Glob fallback as Step 4), read the JSON file, extract fields. If user_warning is non-null, emit it as its own ⚠ <user_warning> message BEFORE any other action (including before the Step 5 verbatim review) — see Step 4's user_warning rule.
| result value | Main thread action |
|---|---|
| success, verdict APPROVED | Read review_file, go to Step 5 (it will dispatch to Step 8 on APPROVED). |
| success, verdict REVISE | Save new codex_session_id. If the subagent returned null (zero-find resume), keep the prior id per §2.4.4 — user_warning will already have been surfaced. Go to Step 5. |
| timeout | TERMINAL for this round — runner already attempted twice. Route to fallback below. (Fresh-exec is a NEW round from the 5-round counter — its own ≤2-attempts budget applies.) No user-offered retry; that would compound. |
| launch_failure | TERMINAL for this round — runner already retried once internally. Route to fallback below (runner already archived stdout/stderr to -failed-resume.* — paths in archived_stdout / archived_stderr). |
| infra_error | Show errors to user, abort. |
Round-level attempt invariant: exactly ONE runner dispatch per resume round. Every failure result routes to fallback (not re-dispatch within the same round). Fallback's fresh-exec dispatch consumes a NEW round from the 5-round counter, which has its own independent 2-attempts-per-round budget. Total codex invocations per round ≤ 2 regardless of failure type — matches pre-refactor; closes Round-2 finding #1.
Step 7.4: Fallback chain — triggered by launch_failure or repeated timeout from the runner.
Severity classification: parse the PREVIOUS round's review (kept in conversation history from Step 5.3's verbatim display) for the highest [severity: level. Default to critical if zero matches (format drift).
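The classification can be sketched as (a minimal sketch; the function name is mine):

```python
import re

SEVERITY_RANK = {"medium": 1, "high": 2, "critical": 3}

def max_severity(previous_review: str) -> str:
    levels = re.findall(r"\[severity:\s*(critical|high|medium)", previous_review)
    if not levels:
        return "critical"  # zero matches = format drift, default to the worst case
    return max(levels, key=SEVERITY_RANK.__getitem__)
```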
Interactive mode (direct user message earlier in this session): ask the user:
Resume failed — the reviewer's re-review did not produce a usable result.
Last round's maximum severity: <level>.
Options:
(a) Run a fresh `codex exec` with full previous-rounds context (higher token cost, new session)
(b) Conclude the review — show current findings as NOT VERIFIED
Non-interactive mode:
- Max severity `critical` or `high` → fresh exec automatically.
- Max severity `medium` only → Step 8 with the not-verified terminal state.
Fresh-exec dispatch: build a new PROMPT_BODY that is the original Step 4 prompt for the current mode, followed by sections ## Previous review rounds (verbatim round-1..N reviews + fixes from conversation history) and ## Current state of the artifact. Write to /tmp/codex-body-${REVIEW_ID}.md (overwriting the original).
Archival note: if the fallback was triggered by launch_failure, the runner already archived failed-resume stdout/stderr to -failed-resume.* paths during Step R5 — main does NOT need to mv anything. If triggered by repeated timeout, no archival happened (no second codex invocation produced useful diagnostics); main can proceed directly. Either way, main never touches /tmp/codex-stdout-* or /tmp/codex-stderr-* itself.
Dispatch the runner subagent with OPERATION=fresh-exec (same input schema, new PROMPT_BODY_PATH pointing at the rebuilt prompt). The fresh-exec consumes one round from the 5-round counter. Return to Step 5 with the new review.
Step 8: Final result
Approved:
## Adversarial Review — Summary (mode: <mode>, model: gpt-5.4)
**Status:** Approved after N round(s)
[Final review]
---
**Reviewed and approved by the reviewer. Awaiting your decision.**
Maximum rounds reached:
## Adversarial Review — Summary (mode: <mode>, model: gpt-5.4)
**Status:** Maximum reached (5 rounds) — not fully approved
**Remaining findings:**
[Unresolved issues]
---
**The reviewer still has findings. Please review them and decide how to proceed.**
Not verified (resume failed and the operator chose to conclude, or headless with only medium severity):
## Adversarial Review — Summary (mode: <mode>, model: gpt-5.4)
**Status:** NOT VERIFIED — fixes applied, reviewer did not re-verify
**Last round's findings:**
[Verbatim findings from the last successful round]
**Applied fixes:**
[List of fixes per finding]
---
**WARNING: This is NOT an approval. Fixes were applied but never verified by the reviewer. Manual review is required before merging.**
Step 9: Cleanup
Conditional on terminal state:
| Terminal state | Cleanup behavior |
|---|---|
| Approved | Remove all temp files |
| Maximum rounds reached | Remove all temp files |
| Not verified (fallback conclude) | Remove all temp files |
| Aborted (launch failure, redirect failure, infrastructure error) | LEAVE files in place for diagnostics |
In Claude Code Plan Mode: skip all cleanup (including deferred). rm will trigger a permission prompt. Files will be cleaned up on the next invocation outside Plan Mode.
Outside Plan Mode, on a cleanup-eligible terminal state:
rm -f /tmp/codex-plan-${REVIEW_ID}.md \
/tmp/codex-prompt-${REVIEW_ID}.md \
/tmp/codex-resume-prompt-${REVIEW_ID}.md \
/tmp/codex-review-${REVIEW_ID}.md \
/tmp/codex-stdout-${REVIEW_ID}.jsonl \
/tmp/codex-stderr-${REVIEW_ID}.txt \
/tmp/codex-stdout-${REVIEW_ID}-failed-resume.jsonl \
/tmp/codex-stderr-${REVIEW_ID}-failed-resume.txt \
/tmp/codex-body-${REVIEW_ID}.md \
/tmp/codex-runner-result-${REVIEW_ID}.json
If the user declined rm — continue without error.
Do NOT delete plan files that existed before the review (only temp files created by this skill). On abort paths, old temp files remain for diagnostics and will be cleaned up by the OS on reboot, or overwritten by the next invocation using the same REVIEW_ID (collision probability is ~10⁻⁸ per same-second run).
Rules
- Claude actively fixes issues based on reviewer feedback — this is NOT just message forwarding.
- Reviewer findings are shown verbatim — do not rephrase or shorten. The Step 5 "YOUR NEXT MESSAGE" instruction is blocking.
- Auto-detect review mode from context; user arguments take priority.
- With explicit `plan` argument or in Claude Code Plan Mode: skip git checks and base branch detection.
- `REPO_ROOT` is captured at Step 2 and passed as an absolute literal to every runner dispatch.
- `RUNNER_SPEC_PATH` is resolved at Step 4 (once per review) with priority: (1) `~/.claude/skills/adversarial-review/references/runner.md` (user-scoped install), (2) Glob `~/.claude/plugins/cache/**/skills/adversarial-review/references/runner.md` and take the first hit (plugin-marketplace install), (3) `$(git rev-parse --show-toplevel)/references/runner.md` (dev checkout), (4) abort with an installation error. Main never attempts to read Claude's own system prompt / skill-invocation header — that path is hallucination-prone and is explicitly disallowed.
- Codex-exec mechanics live in the runner subagent (`references/runner.md`): ATTEMPT_ID generation, prompt-with-marker writing (with a repeated Write call for mtime freshness — NOT Bash `touch`, which may be gated by inherited Plan Mode), synchronous launch, strict checks, two-tier session-id capture with positive content-bind, ONE internal retry on ANY failure type (launch_failure, timeout, stderr-infra), archival mv on resume failure. Main thread never reads codex stdout/stderr/rollout file CONTENT, and never references those paths in its own Bash argv.
- Two-channel result protocol. Runner writes structured JSON to `/tmp/codex-runner-result-${REVIEW_ID}.json` (authoritative) AND returns a single `RUNNER_RESULT_AT: <path>` line as its final message. Main extracts the path via regex (tolerant to markdown fences / minor wrapping), reads the JSON, and never relies on raw-JSON-in-message parsing.
- Main thread reads only: the runner result JSON at `RESULT_PATH` and the review file at `review_file`. Main does NOT Read `references/runner.md` — the runner spec is passed by path to the subagent, which Reads it itself. No other `/tmp/codex-*` reads.
- Runner is dispatched via Agent tool with `subagent_type: general-purpose`, `model: sonnet`. The Agent tool call is synchronous (not `run_in_background`).
- ALL runner failure results are TERMINAL at main (`launch_failure`, `timeout`, `infra_error`, `input_error`). Runner retries once internally on ANY failure. Main does NOT re-dispatch and does NOT offer the user a retry — those lanes would compound retries across layers. Total codex invocations per round ≤ 2 (matches pre-refactor invariant: 1 initial + 1 retry). Fresh-exec fallback is a NEW round with its own independent 2-attempts budget.
- `user_warning` from the runner must be surfaced to the user on a single line BEFORE any other action. This preserves the pre-refactor §2.4.4 "both tiers empty, continuing with previous ID" diagnostic.
- `CODEX_MODEL` / `CODEX_REASONING` in the runner input schema refer to the model codex CLI launches (e.g. `gpt-5.4`). The runner's OWN model is Sonnet, set via the Agent tool's `model: "sonnet"`. Do NOT conflate the two.
- Resume is the primary path for rounds 2-5. Fresh-exec fallback consumes one round from the 5-round counter.
- Step 9 cleanup `rm` glob is UNCHANGED from pre-refactor. It still covers `/tmp/codex-plan-${REVIEW_ID}.md`, `/tmp/codex-prompt-${REVIEW_ID}.md`, `/tmp/codex-resume-prompt-${REVIEW_ID}.md`, `/tmp/codex-review-${REVIEW_ID}.md`, `/tmp/codex-stdout-${REVIEW_ID}.jsonl`, `/tmp/codex-stderr-${REVIEW_ID}.txt`, `/tmp/codex-stdout-${REVIEW_ID}-failed-resume.jsonl`, `/tmp/codex-stderr-${REVIEW_ID}-failed-resume.txt`. ADD the two new paths introduced by the refactor: `/tmp/codex-body-${REVIEW_ID}.md` and `/tmp/codex-runner-result-${REVIEW_ID}.json`.
- Cleanup is conditional on terminal state: remove temp files on approved/max-reached/not-verified; LEAVE them on abort (diagnostic value). Skip all cleanup in Plan Mode.
- Always read-only sandbox — reviewer never writes files.
- Maximum 5 rounds to protect against infinite loops.
- Show the user reviews and fixes for each round.
- If Codex CLI is not installed or crashed — tell the user: `npm install -g @openai/codex`.
- If a fix contradicts the user's explicit requirements — skip and explain why.
License: Apache-2.0 (full text reproduced here under its permissive license)

Author: dementev-dev · Last updated: 2026/4/25
Source: https://github.com/dementev-dev/adversarial-review