ClosedLoop.AI Student Workbook

Claude Code One-Day Intensive Expanded Student Workbook

A hands-on follow-along workbook with repo-backed examples, file locations, commands, exercises, and desk-reference patterns for the full five-hour Claude Code intensive.

Format: Five-hour live intensive
Audience: IB Global Engineering
Output: Reusable operating artifacts

How to Use This Workbook

This workbook turns the one-day agenda and instructor presentation guide into a readable companion. It gives you more than slide bullets: a practical operating model, concrete artifact templates, and enough explanation to keep applying the ideas after today’s session ends.

This session assumes mandatory setup pre-work. You should arrive with Claude Code working, the repository cloned, demo commands verified, editor and terminal ready, and your baseline tool-permission posture understood. Do not spend live time repairing local setup; spend it practicing primitive design, planning, investigation, review, and workflow design.

Course outcomes: By the end of today, you should have a Claude Code primitive kit, Implementation Plan, short Explore findings, Request Changes, Review, Workflow, and one Workflow retro.
Why this pays off: None of these artifacts is paperwork; each one is leverage. A primitive kit stops the team re-solving solved problems. A compact plan lets the model land a change in one pass instead of five. A short Explore findings memo catches the expensive regression before it ships. Token discipline lowers your cost-to-serve on every run. The return is concrete: fewer wasted sessions, faster time-to-shipped, and reusable assets that compound each time another engineer picks them up.
One running example: A single scenario threads the whole day — an evidence-first triage and fix of a refresh-token rotation bug. The artifacts build on each other by pointing at the ones before them: the Implementation Plan (Module 2) frames the Explore findings and Request Changes (Module 3), which feed the Review (Module 4), which the workflow (Module 5) ties into one repeatable loop. By the end you have walked a complete evidence-first bug fix, end to end.

Course Map

One-day timeboxes

| Module | Time | Hands-on center | Must leave with |
| --- | --- | --- | --- |
| 1. Build Claude Code primitives | 70 min | Create tool posture, command, skill, subagent, plugin note | Claude Code primitive kit |
| 2. Planning and context management | 55 min | Convert messy intake into Implementation Plan | Implementation Plan and context map |
| 3. Intent recovery and dynamic evidence | 60 min | Recover why from git/issues/logs and build Request Changes | Explore findings and Request Changes |
| 4. Review, test, and verify | 60 min | Review diff against brief and produce Review | Findings and Review |
| 5. Workflow and improvement loop | 45 min | Design named workflow with handoffs, gates, stop conditions | Workflow and improvement note |

The recurring critique pattern

Every module uses the same critique pattern because the course is artifact-first. Do not ask only whether an artifact looks polished. Ask whether it can serve as a downstream input.

What would you improve about this artifact? What would make it better as a downstream input? Could another operator use this without reopening the whole problem?

This pattern is intentionally simple. It works for primitive kits, Implementation Plans, Explore findings, Request Changes, Reviews, and Workflows. The goal is to train you to judge artifacts by operational usefulness rather than by surface completeness.

Start Here: Student Follow-Along Guide

Your goal: Leave with seven small artifacts you can reuse at work: a primitive kit, Implementation Plan, Explore findings, Request Changes, Review, Workflow, and one Workflow retro.

This workbook is the student version of the one-day intensive. It removes instructor-only delivery notes and turns the course into a practical follow-along reference. Keep the GitHub repository open while you read, because the examples, anti-patterns, pre-work, and commands are part of the exercise flow.

How to use each module

  1. Read the framing and vocabulary before the live block starts.
  2. Open the linked example and anti-pattern artifacts from the repo.
  3. Build the named artifact for that block.
  4. Use the critique prompt: What would make this better as a downstream input?
  5. Keep your final artifacts short enough for another operator to use without reopening the whole conversation.

Module 1 · Block 1 · 70 minutes

Build Claude Code Primitives

Claude Code becomes useful at team scale when engineers stop treating it as a single chat box and start treating it as a primitive build lab: a collection of tools, commands, skills, agents, subagents, plugins, execution modes, and model choices. This module teaches the primitive design judgment that keeps work small, explicit, and repeatable.

Module outcome: You leave Module 1 with a Claude Code primitive kit: a safe tool posture, one custom command, one skill skeleton, one subagent, and one plugin decision note. The concept recap comes after the build, not before it.

Hands-on: build the Claude Code primitive kit

  1. Create a four-line tool posture: allowed reads, allowed shell commands, writes requiring approval, and dangerous actions that stay blocked.
  2. Create a custom command at .claude/commands/summarize-failing-test.md.
  3. Create a skill skeleton at .claude/skills/flaky-test-investigation/SKILL.md using the course template.
  4. Create a subagent at .claude/agents/history-investigator.md or .claude/agents/security-reviewer.md.
  5. Write a plugin decision note: not yet, team-local, or package later.
  6. Peer review: could another operator use this kit without reopening the whole problem?
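
Before step 1 is translated into real permission settings, the four-line tool posture can be written down as plain data and exercised against a few sample actions. The sketch below is illustrative only: the `POSTURE` patterns and the `classify` helper are hypothetical, and the pattern syntax is Python's `fnmatch`, not Claude Code's permission syntax.

```python
from fnmatch import fnmatch

# Hypothetical four-line posture. The patterns are examples only.
POSTURE = {
    "allowed_reads": ["src/*", "tests/*", "docs/*"],
    "allowed_shell": ["pnpm test*", "git log*", "git diff*"],
    "writes_need_approval": ["src/*", "tests/*"],
    "blocked": ["rm -rf*", "git push*", ".env*"],
}


def classify(kind: str, target: str) -> str:
    """Return 'allow', 'ask', or 'deny' for a read, shell, or write action."""
    # Blocked patterns win over every other line of the posture.
    if any(fnmatch(target, p) for p in POSTURE["blocked"]):
        return "deny"
    if kind == "read" and any(fnmatch(target, p) for p in POSTURE["allowed_reads"]):
        return "allow"
    if kind == "shell" and any(fnmatch(target, p) for p in POSTURE["allowed_shell"]):
        return "allow"
    if kind == "write" and any(fnmatch(target, p) for p in POSTURE["writes_need_approval"]):
        return "ask"
    # Anything the posture does not mention stays blocked by default.
    return "deny"
```

Writing the posture as data first makes the peer review in step 6 easier: a reviewer can ask "what happens to `git push`?" and get an answer without reading prose.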

Token-efficient operating habit

Build the smallest useful abstraction. A direct prompt is cheaper than a skill, a skill is cheaper than an always-loaded rule, and a subagent is cheaper than polluting the main thread with broad exploration.

Why primitive design is the first skill

Most failed AI-assisted engineering sessions do not fail because the model is incapable. They fail because the work is routed to the wrong primitive. A developer asks for a broad implementation when the real need is investigation. A team writes a permanent rule into a one-off prompt. A repeated checklist stays trapped in someone’s memory instead of becoming an executable skill. A heavyweight model is used for cheap file discovery, while a complex design decision is handed to a fast model with insufficient reasoning depth.

The first day of a Claude Code operating model therefore begins with vocabulary. Not vocabulary for its own sake, but vocabulary that gives the team a shared way to decide where work should live. Once the team can say “this is a command,” “this is a skill,” “this belongs in memory,” “this should be a subagent,” “this is a hook,” or “this needs a headless run,” the tool stops being mysterious and starts becoming an engineering system.

The primitive build lab

Think of Claude Code as a workbench with several surfaces. The interactive session is where a human and model negotiate a task. Tools are the model’s hands: reading files, running commands, searching repositories, editing code, and interacting with configured integrations. Commands are named entry points that standardize common activities. Skills package reusable procedures. Agents and subagents isolate specialized work. Plugins and marketplace content extend what is available across projects or organizations. Headless execution turns the same operating model into automation.

The practical question is not “which feature is coolest?” The practical question is: where should this work be represented so that another engineer can reuse it without rediscovering it?

| Primitive | Best use | Team-scale signal | Common mistake |
| --- | --- | --- | --- |
| Prompt | One-off instruction, exploration, or clarification | Useful when the work is new, ambiguous, or conversational | Using prompts repeatedly for stable procedures |
| Command | A named local action or workflow entry point | Useful when the same operation should start the same way every time | Packing too much reasoning policy into a command name |
| Skill | A reusable procedure, checklist, transformation, or review pattern | Useful when a workflow should be invoked on demand and updated centrally | Putting always-needed facts in a skill instead of memory or project instructions |
| Agent | A specialized role with its own instructions and tool boundaries | Useful when a class of work needs a consistent expert lens | Creating broad agents with vague responsibilities |
| Subagent | Isolated investigation or parallel work unit | Useful when exploration would pollute the main context window | Letting broad file reads accumulate in the main conversation |
| Tool | A capability the model calls: a built-in, a CLI binary run via Bash, or an MCP server’s tool | Useful when work needs real actions: run, read, search, call an API | Writing a “tool definition” for a CLI Claude could just run under permissions |
| Hook | A shell command wired to a lifecycle event (PreToolUse, PostToolUse, Stop, and others) | Useful when a rule must run deterministically, not only when the model remembers | Using a hook for soft guidance that belongs in CLAUDE.md, or vice versa |
| Plugin / marketplace package | Reusable bundle of the above (commands, skills, subagents, hooks, tools) distributed beyond one repo | Useful when teams need a shared, versioned extension point | Packaging before the workflow has stabilized |
| Headless run | Non-interactive execution in CI, scripts, or automation | Useful when the work has clear inputs, outputs, and stop conditions | Automating work that still requires human judgment |

Interactive versus headless work

Interactive mode is for discovery, negotiation, and judgment. A human can interrupt, correct assumptions, ask for alternatives, and decide whether the model’s next step is safe. Headless mode — Claude Code’s non-interactive print mode, invoked with claude -p (--print) — is for work that has already been bounded. It needs clear inputs, allowed tools, expected outputs, and stop conditions. If the task still requires a human to decide what the task is, it is not ready for headless execution.

A good rule is this: interactive sessions produce artifacts; headless sessions consume artifacts. During a live session you might create an Implementation Plan, Explore findings, or Review. Once those artifacts are stable, a headless run can implement a bounded change, run a review, or generate a report from known inputs.

Checkpoint: Before automating a Claude Code workflow, ask whether another operator could execute it from the artifact alone. If the answer is no, the workflow is still too implicit.
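
The checkpoint above can be turned into a small gate that refuses to build a headless invocation until the task is bounded. This is a minimal sketch, not a definitive wrapper: the `task` dictionary shape and both helper names are hypothetical, and `--allowedTools` is assumed here as the flag for passing a tool allowlist, so confirm it against your installed CLI before relying on it.

```python
REQUIRED = ("goal", "inputs", "allowed_tools", "expected_output", "stop_condition")


def missing_bounds(task: dict) -> list:
    """List the bounds a task still lacks before it can run headless."""
    return [key for key in REQUIRED if not task.get(key)]


def headless_argv(task: dict) -> list:
    """Build the argument list for a print-mode run; refuse unbounded tasks."""
    gaps = missing_bounds(task)
    if gaps:
        raise ValueError(f"not ready for headless execution, missing: {gaps}")
    prompt = (
        f"{task['goal']}\n"
        f"Inputs: {', '.join(task['inputs'])}\n"
        f"Expected output: {task['expected_output']}\n"
        f"Stop when: {task['stop_condition']}"
    )
    # `claude -p` is the print mode described above. The allowlist flag is an
    # assumption in this sketch; verify it locally.
    return ["claude", "-p", prompt, "--allowedTools", ",".join(task["allowed_tools"])]
```

The function only builds the argument list; it does not run anything. That keeps the decision "is this ready?" separate from the act of launching automation.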

Goal mode and debate mode

Two conversational modes matter today. Goal mode is useful when you want the model to drive toward a defined outcome. Debate mode is useful when you want the model to challenge a plan before code is written. Goal mode helps with forward motion; debate mode helps with error prevention. Both are most useful when paired with artifacts.

For example, a developer might ask Claude to draft an Implementation Plan in goal mode. Once the brief exists, the developer can switch into a debate posture: “Challenge this brief. Identify hidden assumptions, underspecified acceptance criteria, and likely regression risks.” The output of the debate should not be a wandering conversation. It should be a better brief.

Model selection for the job

Model selection is part of primitive design. Cheap, fast models are appropriate for bounded lookup, file discovery, and summarizing known material. Balanced models are appropriate for routine implementation and review. The strongest reasoning models are appropriate for architecture, tricky debugging, multi-file refactors, and decisions where a bad plan is more expensive than slower planning.

| Job | Preferred primitive design | Why |
| --- | --- | --- |
| Find relevant files | Explore subagent on a fast model | Broad search stays out of the main context window |
| Design a refactor | Strong reasoning model for planning, then balanced model for execution | The plan is the expensive part to get wrong |
| Apply a known checklist | Skill or command | The procedure should be stable and repeatable |
| Review a security-sensitive change | Specialized review agent with high effort | The lens and depth matter more than speed |
| Generate a one-time explanation | Interactive prompt | The work is conversational and may not need persistence |
| Run a recurring report | Headless workflow over a stable spec | The inputs, outputs, and schedule are known |
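
The routing guidance in this section can be captured as a lookup so the choice is made the same way each time. The sketch below is one possible encoding under stated assumptions: the tier names `fast`, `balanced`, and `strong` are placeholders rather than real model identifiers, the job-kind keys are hypothetical labels, and the fallback to `balanced` for unknown kinds is a choice made for this example.

```python
# Tier names are placeholders for whatever models your team has configured.
TIER_BY_KIND = {
    "lookup": "fast",
    "file_discovery": "fast",
    "summarize_known_material": "fast",
    "routine_implementation": "balanced",
    "review": "balanced",
    "architecture": "strong",
    "tricky_debugging": "strong",
    "multi_file_refactor": "strong",
}


def pick_tier(kind: str, high_risk: bool = False) -> str:
    """Pick a model tier for a job; unknown kinds fall back to 'balanced'."""
    tier = TIER_BY_KIND.get(kind, "balanced")
    # A bad plan on risky work costs more than slower planning, so routine
    # work escalates when the change is high risk. Cheap lookups stay cheap.
    if high_risk and tier == "balanced":
        return "strong"
    return tier
```

For example, `pick_tier("review")` returns `balanced`, while `pick_tier("review", high_risk=True)` returns `strong`, which matches the idea that a security-sensitive review deserves more depth than a routine one.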

Building the Claude Code primitive kit

The primitive kit is the first durable artifact of the day. It is the set of primitives you have actually crafted — a tool posture, a command, a skill, a subagent, and a plugin decision — each written down with explicit boundaries. Give every primitive a compact spec so another engineer can pick it up without guessing: what it does, which mode and model it runs in, what it needs as input, what it produces, and when it should stop. Keep the kit small enough to live in a repo, onboarding guide, or team operating doc, and concrete enough to describe the actual next version of the team’s workflow rather than an aspiration.

The example for this module is a real, working code-review plugin that bundles every primitive in one place: the plugin manifest that ties them together, a command, a skill, a subagent, a hook, and a bundled tool (a CLI script).

It is one connected plugin, not six unrelated files: the /review-pr command runs the pr-review skill and delegates to the pr-reviewer subagent, which runs the check-diff tool and calls the github MCP server. That is how primitives stitch into a workflow. The layout mirrors a real plugin: closedloop-ai/claude-plugins/plugins/code.

A note on tools: you rarely “define” a tool. Claude already uses its built-in tools and any CLI binary on your PATH — run via Bash under your permission rules — so it is usually smart enough to just use them. The two ways to extend the set are to ship a helper script in the plugin (like check-diff.sh) or to add an MCP server, which exposes entirely new tools.

Task: Review a PR for auth regressions
Primitive: security-reviewer agent + /code-review command
Mode: interactive for local development; headless only after the rule set is stable
Model: balanced model for normal review, stronger reasoning for high-risk auth changes
Inputs: diff, Implementation Plan, REVIEW.md, relevant auth rules
Output: findings-first review with severity, evidence, and recommended fix
Stop condition: no important findings or explicit residual risk accepted by human
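
Because the compact spec above is a list of `Field: value` lines, a few lines of code can check that no field has been left out before the spec goes into the kit. This is a rough sketch: the helper names are hypothetical, and the required-field list simply mirrors the seven fields shown in the example.

```python
REQUIRED_FIELDS = ["Task", "Primitive", "Mode", "Model", "Inputs", "Output", "Stop condition"]


def parse_spec(text: str) -> dict:
    """Parse 'Field: value' lines into a dict, skipping lines without a colon."""
    spec = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        field, value = line.split(":", 1)
        spec[field.strip()] = value.strip()
    return spec


def missing_fields(spec: dict) -> list:
    """Return required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not spec.get(name)]


# An incomplete draft spec, used to show what the check reports.
DRAFT = """\
Task: Review a PR for auth regressions
Primitive: security-reviewer agent + /code-review command
Stop condition: no important findings or explicit residual risk accepted by human
"""

gaps = missing_fields(parse_spec(DRAFT))
```

Here `gaps` lists `Mode`, `Model`, `Inputs`, and `Output`, which are exactly the questions another engineer would otherwise have to ask.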

What good looks like

A good primitive kit has three qualities. First, it is specific. “Use Claude for coding” is not useful; “use an explorer subagent to identify files before reading them into the main session” is useful. Second, it is bounded. Each entry says what the primitive should and should not do. Third, it is teachable. A new engineer should be able to read the kit and make the same primitive design decision as a senior engineer most of the time.

Exercise: Pick five recurring engineering activities from your team. For each, decide whether it belongs as a prompt, command, skill, agent, subagent, plugin, or headless workflow. Then add the model choice, required input artifact, expected output artifact, and stop condition.

Anti-patterns

From primitives to workflows

Crafting primitives is only half of Module 1. The other half is noticing that primitives are meant to be stitched together. A real task rarely uses one primitive in isolation: an explorer subagent finds the files, a command kicks off the change, a review agent checks the diff, a skill packages the verification. The sequence that connects them is a workflow — and Module 5 is where you design one deliberately.

That immediately raises the question this course spends the rest of the day answering: once you can stitch primitives into a workflow, how do you tell that workflow what to accomplish? A workflow with no clear goal, no boundaries, and no definition of done will wander no matter how good its primitives are. That is exactly the problem Module 2 picks up.

Module recap

The foundation of effective Claude Code use is not prompt cleverness. It is primitive design discipline. Teams that craft the right primitives — and know how those primitives stitch into workflows — can preserve knowledge, reduce repeated prompting, avoid context-window waste, and turn successful sessions into reusable operating assets.

Module 2 · Block 2 · 55 minutes

Planning and Context Management

Module 1 ended on a question: once you can stitch primitives into a workflow, how do you tell it what to accomplish? Planning is the answer. In Claude Code, planning is not a ceremonial step before coding — it is the act of shaping context so the next operator (human, model, agent, or headless workflow) can act without reopening the entire problem.

Module outcome: You create an Implementation Plan with facts, assumptions, open questions, context map, bounded work packages, and acceptance criteria.

Hands-on: write an Implementation Plan

  1. Choose a task.
  2. Separate facts, assumptions, open questions, constraints, non-goals, and acceptance criteria.
  3. Add a context map with file pointers and evidence commands.
  4. Write a guided /compact focusing on... prompt.

Token-efficient planning habit

Every Implementation Plan should include a context budget: what must load, what can defer, what should delegate, what should not load, and what must survive compaction.
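
One way to make the context budget concrete is to record the five buckets as data and estimate the size of the must-load set. The sketch below is an illustration under assumptions: the class and function names are hypothetical, the character counts you pass in are your own measurements, and the four-characters-per-token ratio is only a rough rule of thumb, not a measured value.

```python
from dataclasses import dataclass, field


@dataclass
class ContextBudget:
    must_load: list = field(default_factory=list)
    can_defer: list = field(default_factory=list)
    delegate: list = field(default_factory=list)
    do_not_load: list = field(default_factory=list)
    must_survive_compaction: list = field(default_factory=list)

    def conflicts(self):
        """Items that appear in both must_load and do_not_load."""
        return sorted(set(self.must_load) & set(self.do_not_load))


def must_load_tokens(budget, file_chars):
    """Rough size of the must-load set, assuming about 4 characters per token."""
    return sum(file_chars.get(path, 0) // 4 for path in budget.must_load)
```

If the estimate is already a large share of the session before any work starts, that is a signal to move files from must-load to defer or delegate.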

The plan is a compression artifact

Claude Code sessions can accumulate enormous context: pasted requirements, file contents, terminal output, attempted fixes, test failures, screenshots, and corrections from the human. Without deliberate compression, the session becomes expensive and fragile. The model is forced to infer what still matters from a long transcript. Humans are forced to remember why earlier decisions were made. Downstream operators inherit noise instead of a plan.

A useful plan is not a transcript. It is a lossily compressed representation of the work. It preserves the facts, decisions, constraints, risks, and next actions that matter. It discards the conversational path that produced them. That is why this session treats planning as context management.

Separate facts, assumptions, and open questions

The simplest improvement to most AI coding sessions is to stop blending known facts with guesses. Models are very good at continuing a confident narrative. If the prompt says “the auth middleware probably owns refresh-token invalidation,” the model may proceed as if that is true. A disciplined brief separates evidence from inference.

| Category | Definition | Example | How to handle it |
| --- | --- | --- | --- |
| Fact | A statement backed by direct evidence | `src/auth/middleware.ts` validates JWTs before route handlers run | Can be used directly in the plan |
| Assumption | A plausible statement not yet proven | Refresh token invalidation is probably handled in the session store | Must be tested or called out as risk |
| Open question | A decision or unknown that blocks confident execution | Should expired refresh tokens be deleted or retained for audit? | Resolve before implementation or explicitly defer |
| Constraint | A boundary the solution must respect | Do not change public API response shape | Use as acceptance criteria and review rule |

Checkpoint: Find the single most load-bearing assumption in your plan. If it turns out to be wrong, does the whole approach collapse? If so, verify it before writing code, not after.
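
The separation of facts from assumptions, and the load-bearing check in the checkpoint, can be expressed as a small data model. This is a minimal sketch with hypothetical names (`Statement`, `blockers`); it treats a fact with no evidence pointer as a problem, and an unverified load-bearing assumption as a reason not to start coding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statement:
    kind: str                       # "fact", "assumption", "open_question", or "constraint"
    text: str
    evidence: Optional[str] = None  # command, file pointer, or commit that backs it
    load_bearing: bool = False      # would the approach collapse if this were wrong?


def blockers(statements: list) -> list:
    """Statements that should be resolved before any code is written."""
    found = []
    for s in statements:
        if s.kind == "fact" and not s.evidence:
            found.append(f"fact without evidence: {s.text}")
        elif s.kind == "assumption" and s.load_bearing and not s.evidence:
            found.append(f"unverified load-bearing assumption: {s.text}")
    return found
```

An empty result does not prove the plan is right. It only shows that every claim written as a fact has a pointer and that nothing load-bearing is still a guess.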

The Implementation Plan

The Implementation Plan is the central planning artifact. It should fit on one or two pages, but it should be complete enough for another operator to execute. The brief is not just a summary. It is an instruction-bearing artifact with a clear contract: here is the problem, here is the known context, here are the boundaries, here is the proposed path, and here is how we will know whether the work is done.

# Implementation Plan

## Goal
Implement refresh token rotation without changing the existing login response contract.

## Known facts
- JWT validation happens in src/auth/middleware.ts.
- Session persistence is implemented in src/auth/session-store.ts.
- Existing tests cover login success and expired access tokens.

## Assumptions to verify
- Old refresh tokens are not currently invalidated after rotation.
- Token reuse should be treated as suspicious but not immediately lock the account.

## Open questions
- Should reuse detection emit an audit event?
- Is token family tracking already present in the database schema?

## Context map
- Auth middleware: request validation and session lookup
- Session store: token persistence and expiration
- Test suite: integration tests under tests/auth/

## Work packages
1. Verify current token rotation behavior.
2. Add invalidation logic or token-family tracking.
3. Extend integration tests for old-token reuse.
4. Produce Review.

## Acceptance criteria
- New refresh token is issued on rotation.
- Previous refresh token cannot be reused.
- Existing login response shape is unchanged.
- Tests demonstrate success, expiration, and reuse behavior.
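
Since the plan above is plain markdown with `##` sections, a short check can confirm that a draft has every section filled in before it is handed to another operator. The sketch below assumes the section names used in this example plan; `plan_sections` and `lint_plan` are hypothetical helper names.

```python
REQUIRED_SECTIONS = [
    "Goal", "Known facts", "Assumptions to verify", "Open questions",
    "Context map", "Work packages", "Acceptance criteria",
]


def plan_sections(markdown: str) -> dict:
    """Map each '## ' heading to the non-empty lines beneath it."""
    sections, current = {}, None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current and line.strip():
            sections[current].append(line.strip())
    return sections


def lint_plan(markdown: str) -> list:
    """Return the required sections that are missing or empty."""
    sections = plan_sections(markdown)
    return [name for name in REQUIRED_SECTIONS if not sections.get(name)]
```

A check like this catches structural gaps only. Whether the acceptance criteria would actually catch a regression is still a judgment call for the debate review.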

Context maps

A context map tells the model where to look and why. It is not a full dump of file contents. It is a pointer layer: directories, files, functions, commands, external systems, and documents that are likely relevant. Good context maps reduce token use because Claude can read the right files in the right order instead of scanning the repository blindly.

A context map should include both primary and secondary context. Primary context is required to make the change. Secondary context helps review risk, verify behavior, or understand why the system is shaped the way it is.

## Context map
Primary:
- src/auth/middleware.ts — request authentication boundary
- src/auth/session-store.ts — refresh token persistence
- db/schema.sql — session and token tables
- tests/auth/refresh-token.test.ts — integration behavior

Secondary:
- docs/security/auth-model.md — intended auth posture
- .claude/rules/api-security.md — project-specific security rules
- recent PRs touching auth middleware — intent and regression context

Debate review before coding

Before code is written, ask Claude to attack the plan. The goal is not to win the debate; the goal is to improve the artifact. A debate review should look for ambiguous goals, missing constraints, unsupported assumptions, hidden coupling, risky files, weak acceptance criteria, and likely regression paths.

Review this Implementation Plan as a skeptical senior engineer. Identify unsupported assumptions, missing context, and acceptance criteria that would fail to catch a regression. Do not implement. Return a revised brief outline and a list of questions that must be answered before coding.

The output of debate review should be folded back into the brief. If the debate produces useful insights that remain trapped in conversation history, the next operator still cannot use them. Artifact-first planning means the artifact is the durable memory.

What makes a plan reusable downstream?

A reusable plan has clear boundaries. It names the exact goal, the non-goals, the files likely involved, the evidence already gathered, the assumptions still open, and the stop condition. It also includes enough review criteria to prevent the model from declaring success too early.

Goal: What outcome should exist after the work is complete?
Non-goals: What tempting adjacent work should not be done?
Facts: What has been directly observed?
Assumptions: What might be true, but needs verification?
Context map: Where should the next operator look first?
Work packages: What are the smallest implementation units?
Acceptance criteria: What evidence will prove the work is complete?
Review focus: What risks should review emphasize?

Exercise: Take a messy intake request from your team and compress it into an Implementation Plan. Then ask another participant whether they could execute it without reopening the original discussion. Any question they ask is either an open question or a missing fact.

Module recap

Planning in Claude Code is not about slowing down. It is about preserving momentum by producing a compact artifact that can survive compaction, handoff, review, and automation. The better the brief, the less the model has to infer and the easier it is for humans to hold the work accountable.

Module 3 · Block 3 · 60 minutes

Intent Recovery and Dynamic Evidence

A plan tells the model what to do; this module makes sure the plan is built on why the code looks the way it does. When a codebase is old enough, the current code is rarely the whole story. You will recover intent from Git history, issues, PRs, logs, tests, command traces, and screenshots before asking Claude to change behavior.

Module outcome: You create a short Explore findings and Request Changes that distinguish evidence from inference and make the next model turn more accurate.

Hands-on: produce an Explore findings and Request Changes

  1. Use at least two evidence types.
  2. Label evidence, inference, and unknowns.
  3. Rewrite weak feedback into expected vs actual, command output, file pointer, and next ask.

Token-efficient investigation habit

Do not paste the whole investigation trail. Compress it into evidence, inference, unknowns, and pointers to files, lines, commits, commands, or screenshots.

Find the why before the what

Claude can usually explain what code does from the current files. The harder question is why it does that. Was a strange branch added for a customer-specific edge case? Did a test encode a production incident? Was a confusing abstraction introduced to support a migration that has since finished? Current code often hides the reason for its own shape.

Intent recovery is the discipline of gathering enough historical and runtime evidence to avoid undoing deliberate behavior. It is especially important when the requested change appears simple. Simple changes are dangerous when they cut across hidden intent.

Static code is not enough

Reading the current file gives one kind of evidence. Git history gives another. Tests reveal expected behavior. Issues and PRs reveal tradeoffs. Logs reveal runtime reality. Screenshots reveal UI states that code alone may not make obvious. CLI traces reveal exact failure modes. Documentation and MCP-backed systems can provide external context that is not stored in the repository.

| Evidence source | Question it answers | Risk if omitted |
| --- | --- | --- |
| Current code | What does the system do now? | The model may miss hidden coupling outside the local file |
| Git blame and commits | Why was this line introduced or changed? | The model may remove a deliberate workaround |
| PR discussion | What tradeoffs were accepted? | The model may re-litigate settled decisions |
| Issues / tickets | What user or incident motivated the behavior? | The model may solve the wrong problem |
| Tests | What behavior is currently protected? | The model may pass local reasoning but break expected behavior |
| Logs and traces | What happens in real executions? | The model may optimize for imagined behavior |
| Screenshots | What does the user actually see? | The model may miss visual or state-machine issues |
| Docs and runbooks | What standards should govern the change? | The model may violate team conventions |

Evidence versus inference

A strong Explore findings labels evidence. It does not say “the bug is caused by stale cache” unless there is direct evidence. It says “the failure appears after the cache read path; logs show cache hit with outdated value; no write-through event appears in the trace; inference: stale cache is likely.” This distinction matters because the next model turn will use the memo as context. If guesses are written like facts, the model will build on them.

# Explore Findings
# Builds on the Module 2 Implementation Plan: "refresh token rotation".

## Question
Why is an old refresh token still accepted after rotation issues a new one?

## Evidence gathered
- Reproduces: pnpm test tests/auth/refresh-token.test.ts — old-token reuse returns 200, expected 401.
- Server logs show the session store resolves the previous token after rotation.
- Git history shows rotation added in PR #184 with no invalidation step.
- src/auth/middleware.ts validates tokens but never deletes the prior record.

## Inferences
- Rotation issues a new token but does not invalidate the previous one.
- Invalidation likely belongs in the session store, not the middleware.

## Open questions
- Where should the previous token be invalidated — session store or middleware?
- Should reuse of an old token emit an audit event?

## Recommended next step
Trace refresh-token writes and invalidation in src/auth/session-store.ts before changing code.

Checkpoint: Read your memo back and underline every sentence stated as fact. For each one, can you point to the command, line, or commit that proves it? Anything you cannot point to is an inference wearing a fact’s clothing. Label it as such before the model builds on it.
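
The "can you point to it?" test in the checkpoint can be approximated mechanically. The sketch below flags evidence lines that contain no recognizable pointer. It is a heuristic under assumptions: the pattern list (file path, `#` number, commit hash, a few command prefixes) is illustrative, and `unsupported` is a hypothetical helper name. It will produce occasional false positives and negatives.

```python
import re

# A pointer is anything a reader can re-run or open. The alternatives below
# are an assumed, incomplete list chosen for this example.
POINTER = re.compile(
    r"(\S+/\S+\.\w+)"            # file path such as src/auth/middleware.ts
    r"|(#\d+)"                   # PR or issue number such as #184
    r"|(\b[0-9a-f]{7,40}\b)"     # abbreviated or full commit hash
    r"|(\b(?:pnpm|npm|git)\s)"   # a command the reader could re-run
)


def unsupported(evidence_lines):
    """Return the evidence lines that carry no pointer at all."""
    return [line for line in evidence_lines if not POINTER.search(line)]


EVIDENCE = [
    "Reproduces: pnpm test tests/auth/refresh-token.test.ts",
    "Server logs show the session store resolves the previous token after rotation.",
    "Git history shows rotation added in PR #184 with no invalidation step.",
    "src/auth/middleware.ts validates tokens but never deletes the prior record.",
]

needs_pointer = unsupported(EVIDENCE)
```

Run against the evidence lines from the memo above, the check flags only the server-log line, which names no log file, command, or timestamp. That line may well be true, but a reader cannot yet verify it.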

Dynamic evidence sources

Dynamic evidence is information that changes with execution: logs, test runs, database state, browser behavior, CLI output, screenshots, observability traces, and external tool responses. It is powerful because it grounds the model in reality. It is also noisy. A good operator does not paste raw dynamic output indiscriminately. They capture the relevant excerpt, label how it was produced, and explain why it matters.

When collecting dynamic evidence, include command provenance. The model should know not only the output, but the command, environment, timestamp or branch, and whether the result was reproducible.

Command: pnpm test tests/auth/refresh-token.test.ts --runInBand
Branch: refresh-token-rotation
Result: failed, 1 test
Relevant output:
  expected old refresh token reuse to return 401
  received 200
Interpretation:
  Existing implementation issues a new refresh token but does not invalidate the previous token.
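
A provenance record like the one above can be captured at the moment the command runs instead of being reconstructed afterwards. The following is a minimal sketch: `capture` is a hypothetical helper, the branch name is passed in by the caller rather than detected, and the demo command is a stand-in that imitates a failing test so the example runs anywhere.

```python
import subprocess
import sys
from datetime import datetime, timezone


def capture(argv, branch="unknown", max_lines=5):
    """Run a command and return a compact provenance record."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    # stdout and stderr are concatenated, not interleaved in time order.
    lines = (proc.stdout + proc.stderr).strip().splitlines()
    return {
        "command": " ".join(argv),
        "branch": branch,
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "result": "passed" if proc.returncode == 0 else f"failed ({proc.returncode})",
        "relevant_output": lines[-max_lines:],  # keep only the tail of the output
    }


# Stand-in for the real test command: prints two lines and exits non-zero.
record = capture(
    [sys.executable, "-c", "print('expected 401'); print('received 200'); raise SystemExit(1)"],
    branch="refresh-token-rotation",
)
```

The record deliberately has no interpretation field. Interpretation is inference, so it is better written by the operator after reading the excerpt than generated alongside it.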

Why “that didn’t work” fails as feedback

The phrase “that didn’t work” is almost useless to the model. It omits what was attempted, what was expected, what actually happened, what evidence was observed, and what changed between attempts. Good feedback is a bundle: action, expectation, observation, evidence, hypothesis, and next constraint.

| Weak feedback | Better feedback |
| --- | --- |
| That did not work. | After applying the patch, `pnpm test tests/auth/refresh-token.test.ts` still fails. Expected old token reuse to return 401; actual response is 200. Relevant log shows session lookup succeeds for the old token. Focus next on invalidation in session-store, not middleware. |
| The UI is broken. | Clicking Save leaves the modal open. Browser console shows no error. Network tab shows PATCH /settings returns 204. The likely issue is local modal state not closing after success. |
| Try again. | Revise the approach without changing the public API response shape. Preserve existing success tests and add one regression test for duplicate submission. |
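
The six-part bundle described in this section (action, expectation, observation, evidence, hypothesis, next constraint) can be modelled directly, which makes the gaps in weak feedback visible. This is a sketch with a hypothetical `Feedback` class, not a required format.

```python
from dataclasses import dataclass, fields


@dataclass
class Feedback:
    action: str = ""
    expectation: str = ""
    observation: str = ""
    evidence: str = ""
    hypothesis: str = ""
    next_constraint: str = ""

    def gaps(self):
        """Names of the parts that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self):
        """One labelled line per part that has content."""
        return "\n".join(
            f"{f.name.replace('_', ' ').capitalize()}: {getattr(self, f.name)}"
            for f in fields(self)
            if getattr(self, f.name).strip()
        )
```

For instance, `Feedback(observation="That did not work.").gaps()` reports five missing parts, which is a compact way to see why that sentence gives the model so little to work with.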

The Request Changes pattern

A Request Changes is a structured correction that improves the next model turn. It should be short, but it must contain enough evidence for the model to update its plan. Use it whenever Claude’s first implementation fails, when a test result contradicts an assumption, or when a human reviewer spots a gap.

# Request Changes

## Attempted change
Added token rotation logic in src/auth/middleware.ts.

## Expected result
Old refresh token reuse should return 401.

## Actual result
Old refresh token reuse returns 200.

## Evidence
Test: pnpm test tests/auth/refresh-token.test.ts
Failure: expected 401, received 200
Trace: old token still resolves in session-store lookup.

## Updated hypothesis
Middleware is not the right layer for invalidation. The session store accepts both token records.

## Next instruction
Inspect session-store token persistence and invalidation. Do not change response shape.
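
To make the updated hypothesis concrete, here is a small in-memory illustration of the behavior the acceptance criteria ask for: rotation issues a new token and the previous one stops resolving. This is not the course repository's code. `SessionStore` is a hypothetical stand-in written only to show where invalidation would live if the hypothesis holds, and the hypothesis itself still has to be confirmed against the real session store.

```python
import secrets


class SessionStore:
    """In-memory stand-in for a session store (illustration only)."""

    def __init__(self):
        self._active = {}  # refresh token -> user id

    def issue(self, user_id):
        """Create and record a new refresh token for the user."""
        token = secrets.token_urlsafe(16)
        self._active[token] = user_id
        return token

    def rotate(self, old_token):
        """Swap an active token for a new one and invalidate the old one."""
        # pop() removes the old record, so the old token stops resolving.
        user_id = self._active.pop(old_token, None)
        if user_id is None:
            return None  # unknown or already-rotated token; the caller answers 401
        return self.issue(user_id)

    def resolve(self, token):
        """Return the user id for an active token, or None."""
        return self._active.get(token)
```

In this sketch the bug from the Request Changes would correspond to `rotate` calling `issue` without removing the old record, which leaves both tokens resolvable.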

Module recap

Intent recovery prevents well-intentioned regressions. Dynamic evidence prevents hallucinated debugging. Request Changes artifacts convert failure into useful context. Together, they turn Claude Code from a code generator into an evidence-driven collaborator.

Exercise: Choose a bug or confusing behavior from the demo app. Produce a one-page Explore findings with at least three evidence sources and a Request Changes that would help Claude recover from a failed first attempt.

Module 4 · Block 4 · 60 minutes

Review, Test, and Verify

The work is not done when code changes. It is done when the diff has been reviewed against the plan, the verification evidence is explicit, and the residual risk is clear enough for a human to accept or reject.

Module outcome: You produce a Review that makes findings, evidence, scope drift, regression risk, and PR-readiness explicit.

Hands-on: build a Review

  1. Review a diff against the plan.
  2. Name scope drift and regression risk.
  3. List verification evidence and what was not verified.
  4. End with residual risk and PR handoff.

Token-efficient review habit

Review against the plan and the diff first. Expand only when a finding needs more context, and keep verification output to command, result, excerpt, and residual risk.

Review the diff against the plan

A Claude-assisted review should not ask “does this code look good?” That question is too broad and too subjective. The stronger question is: “does this diff satisfy the Implementation Plan without violating constraints or introducing unacceptable risk?” The Implementation Plan is the brief for the change, and it becomes the review contract.

Findings-first review means the reviewer leads with issues, not narrative. Each finding should state severity, evidence, affected file or behavior, why it matters, and a recommended fix. If there are no blocking findings, the review should still describe what was checked and what residual risk remains.

# Review Finding

Severity: Important
Area: Refresh token invalidation
Evidence: tests/auth/refresh-token.test.ts covers successful rotation but not reuse of the old token.
Why it matters: The acceptance criteria require old refresh tokens to be rejected.
Recommended fix: Add a regression test that attempts reuse of the previous token after rotation and expects 401.
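
As a sketch of what findings-first ordering can look like when findings are collected programmatically: the field names below mirror the template above, while the three-level severity scale and the function names are assumptions for illustration (the workbook's example only shows "Important").

```typescript
// Hypothetical severity scale; adjust to whatever your REVIEW.md defines.
type Severity = "Blocking" | "Important" | "Minor";

interface ReviewFinding {
  severity: Severity;
  area: string;
  evidence: string;
  whyItMatters: string;
  recommendedFix: string;
}

const severityRank: Record<Severity, number> = { Blocking: 0, Important: 1, Minor: 2 };

// Findings-first ordering: most severe first, without mutating the input array.
function orderFindings(findings: ReviewFinding[]): ReviewFinding[] {
  return [...findings].sort((a, b) => severityRank[a.severity] - severityRank[b.severity]);
}

// Renders one finding in the same shape as the template.
function renderFinding(f: ReviewFinding): string {
  return [
    "# Review Finding",
    "",
    `Severity: ${f.severity}`,
    `Area: ${f.area}`,
    `Evidence: ${f.evidence}`,
    `Why it matters: ${f.whyItMatters}`,
    `Recommended fix: ${f.recommendedFix}`,
  ].join("\n");
}

const findings: ReviewFinding[] = [
  {
    severity: "Minor",
    area: "Naming", // hypothetical second finding, added only to show ordering
    evidence: "Local variable renamed in the diff.",
    whyItMatters: "Cosmetic only.",
    recommendedFix: "Accept as-is.",
  },
  {
    severity: "Important",
    area: "Refresh token invalidation",
    evidence: "tests/auth/refresh-token.test.ts covers successful rotation but not reuse of the old token.",
    whyItMatters: "The acceptance criteria require old refresh tokens to be rejected.",
    recommendedFix: "Add a regression test that attempts reuse of the previous token after rotation and expects 401.",
  },
];

const ordered = orderFindings(findings);
```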

Scope drift

Scope drift is any change that is not required by the brief. Some drift is harmless cleanup. Some drift is dangerous because it changes behavior the team did not intend to change. Claude can drift when it sees adjacent improvements, especially if the prompt rewards broad helpfulness. Review must therefore compare the diff against explicit goals and non-goals.

Drift type | Example | Review response
Benign cleanup | Renaming a local variable for clarity | Accept if low risk and local
Adjacent refactor | Changing session-store interfaces while adding one token behavior | Challenge unless required by the brief
Behavior expansion | Adding account lockout on token reuse when not requested | Reject or move to follow-up
Contract change | Changing login response shape while implementing rotation | Block
Test-only expansion | Adding regression tests for directly related edge cases | Usually accept
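
A first pass at spotting drift can be mechanical: compare the changed file list against the paths the plan names. The sketch below assumes a hypothetical `PlanScope` shape with path prefixes for in-scope work and declared non-goals; two of the file paths are invented for illustration. It only flags where drift may be hiding. Whether a flagged change is benign cleanup or a contract change is still a human judgment.

```typescript
// Hypothetical plan shape: path prefixes the plan expects to change, and prefixes it declares off-limits.
interface PlanScope {
  inScope: string[];
  nonGoals: string[];
}

type DriftVerdict = "in-scope" | "non-goal" | "unplanned";

// Non-goals are checked first so an explicit exclusion always wins over a broad in-scope prefix.
function classifyChange(file: string, scope: PlanScope): DriftVerdict {
  if (scope.nonGoals.some((prefix) => file.startsWith(prefix))) return "non-goal";
  if (scope.inScope.some((prefix) => file.startsWith(prefix))) return "in-scope";
  return "unplanned";
}

// Groups every changed file by verdict so the reviewer sees drift candidates at a glance.
function driftReport(changedFiles: string[], scope: PlanScope): Record<DriftVerdict, string[]> {
  const report: Record<DriftVerdict, string[]> = { "in-scope": [], "non-goal": [], unplanned: [] };
  for (const file of changedFiles) {
    report[classifyChange(file, scope)].push(file);
  }
  return report;
}

const scope: PlanScope = {
  inScope: ["src/auth/session-store.ts", "tests/auth/"],
  nonGoals: ["src/auth/login"], // hypothetical: the login response contract must not change
};

const report = driftReport(
  [
    "src/auth/session-store.ts",
    "tests/auth/refresh-token.test.ts",
    "src/auth/login-response.ts", // hypothetical file touching a declared non-goal
    "src/utils/format.ts", // hypothetical file the plan never mentioned
  ],
  scope,
);
```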

Testing is one gate, not the only gate

Passing tests are necessary evidence, but they are not proof of correctness. Tests only cover what they assert. A Review should include test results, manual checks when relevant, static review, diff review, command outputs, and residual risk. The point is not to create paperwork. The point is to prevent the phrase “tests pass” from hiding an unreviewed assumption.

For LLM-assisted work, verification should also include provenance: what files changed, what commands were run, what evidence was observed, and what the model did not check. This gives the human reviewer a clear map of confidence and uncertainty.

Checkpoint: When the model says “tests pass,” ask what the tests do not cover. A green suite proves the asserted behavior, not the absent assertion. Name one thing that could break that no current test would catch.
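
To see how a green check can hide the absent assertion, here is a minimal sketch using the running example. The in-memory `SessionStore` is hypothetical (the real src/auth/session-store.ts is not shown in this workbook), and the `invalidateOnRotate` switch exists only to model the buggy and fixed behavior side by side. A `null` return stands in for the 401 response.

```typescript
// Hypothetical in-memory store used only to illustrate the point.
class SessionStore {
  private readonly active = new Map<string, string>(); // refresh token -> user id
  private readonly invalidateOnRotate: boolean;
  private counter = 0;

  constructor(invalidateOnRotate: boolean) {
    this.invalidateOnRotate = invalidateOnRotate;
  }

  issue(userId: string): string {
    this.counter += 1;
    const token = `rt-${this.counter}`;
    this.active.set(token, userId);
    return token;
  }

  // Returns a new token, or null when the presented token is unknown (the 401 case).
  rotate(oldToken: string): string | null {
    const userId = this.active.get(oldToken);
    if (userId === undefined) return null;
    if (this.invalidateOnRotate) this.active.delete(oldToken);
    return this.issue(userId);
  }
}

// Happy-path check: rotation yields a new, different token.
function rotationSucceeds(store: SessionStore): boolean {
  const first = store.issue("user-1");
  const second = store.rotate(first);
  return second !== null && second !== first;
}

// Regression check: after rotation, the old token must stop working.
function oldTokenRejected(store: SessionStore): boolean {
  const first = store.issue("user-1");
  store.rotate(first);
  return store.rotate(first) === null;
}

const buggy = new SessionStore(false);
const fixed = new SessionStore(true);

const results = {
  buggyHappyPath: rotationSucceeds(buggy), // passes even though the bug is present
  buggyReuse: oldTokenRejected(buggy), // fails: only this check exposes the bug
  fixedHappyPath: rotationSucceeds(fixed),
  fixedReuse: oldTokenRejected(fixed),
};
```

The buggy store passes the happy-path check, so a suite containing only that check would be green. The bug becomes visible only once the reuse assertion exists.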

The Review

# Review

## Plan alignment
Goal: Refresh token rotation rejects old token reuse.
Status: Implemented and tested.

## Changed files
- src/auth/session-store.ts — invalidates previous refresh token on rotation
- tests/auth/refresh-token.test.ts — adds old-token reuse regression test

## Evidence
- pnpm test tests/auth/refresh-token.test.ts — passed
- pnpm test tests/auth/login.test.ts — passed
- Manual API check: old refresh token returns 401 after rotation

## Scope review
No public response contract changes observed.
No unrelated auth routes modified.

## Residual risk
Database cleanup of invalidated token records is not addressed. Existing retention behavior remains unchanged.

## PR handoff note
Reviewer should focus on token-store concurrency and whether invalidated token retention meets audit expectations.

Regression risk

Regression risk is not just the probability that something breaks. It is the product of likelihood, blast radius, and how hard a failure is to detect. A small likelihood with a huge blast radius still deserves attention, and so does a failure that would go unnoticed in production. A likely bug with easy rollback may be acceptable if the release path is safe. Claude can help enumerate risks, but the team must decide what risk is acceptable.
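
A small worked sketch makes the multiplication concrete. The 1-to-5 scales and the names below are assumptions for illustration; the workbook does not prescribe a scoring scheme. Detectability is entered as detection difficulty so that a harder-to-notice failure raises the score.

```typescript
// Hypothetical 1-5 scales, chosen only to make the arithmetic visible.
interface RiskEstimate {
  likelihood: number; // 1 = rare, 5 = expected
  blastRadius: number; // 1 = one narrow path, 5 = every user
  detectionDifficulty: number; // 1 = alarms immediately, 5 = silent
}

// Multiplies the three factors after checking each is an integer from 1 to 5.
function riskScore(r: RiskEstimate): number {
  for (const value of [r.likelihood, r.blastRadius, r.detectionDifficulty]) {
    if (!Number.isInteger(value) || value < 1 || value > 5) {
      throw new RangeError("each factor must be an integer from 1 to 5");
    }
  }
  return r.likelihood * r.blastRadius * r.detectionDifficulty;
}

// Rare, but hits everyone and is hard to notice: 1 * 5 * 4 = 20
const rareButWide = riskScore({ likelihood: 1, blastRadius: 5, detectionDifficulty: 4 });

// Likely, but contained and obvious: 4 * 1 * 1 = 4
const likelyButContained = riskScore({ likelihood: 4, blastRadius: 1, detectionDifficulty: 1 });
```

The rare case scores five times higher than the likely one, which is the point of treating risk as a product rather than as likelihood alone. The number is a conversation aid, not a decision.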

Risk question | Why it matters
What user-visible behavior changed? | Identifies blast radius
What existing tests protect this path? | Identifies current safety net
What did we not test? | Prevents false confidence
What external systems depend on this behavior? | Finds hidden contracts
How would we detect failure in production? | Separates known risk from invisible risk
How would we roll back? | Determines operational readiness

What makes a handoff PR-ready?

A PR-ready handoff gives the reviewer the shortest path to an informed decision. It should contain the problem statement, brief link or summary, changed files, review focus, verification evidence, known non-goals, and residual risk. The reviewer should not need to reconstruct the story from chat history.

PR-ready handoff formula

Problem → Approach → Changed files → Verification → Review focus → Residual risk. If any of those pieces is missing, the PR is not fully handoff-ready.

Module recap

Review and verification discipline turns model output into engineering evidence. The goal is not to make Claude “sound confident.” The goal is to make the work auditable: what changed, why it changed, how it was checked, and what remains uncertain.

Exercise: Review a provided diff or demo artifact against its Implementation Plan. Produce three findings if problems exist; otherwise produce a Review and residual-risk note that would be acceptable in a PR description.

Module 5 · Block 5 · 45 minutes

Workflow Design and Minimal Improvement Loop

Module 1 introduced workflows as the way to stitch primitives together; this is where you design one deliberately. The session closes by turning the day’s isolated practices into a single named workflow — with handoffs, gates, stop conditions, and one lightweight Workflow retro — so the next run can improve without expanding into a full operating-system redesign.

Module outcome: You leave with a Workflow and one credible next-run improvement checklist or metric.

Hands-on: design one workflow

  1. Name the workflow trigger and roles.
  2. Define artifact handoffs, gates, and stop conditions.
  3. Choose which parts are commands, skills, agents, or human review.
  4. Add one next-run improvement.

Token-efficient workflow habit

Each agent handoff should define what context crosses the boundary, what stays behind, and where compaction happens before the next phase.

From good sessions to repeatable workflows

A strong Claude Code session can still be a dead end if the team cannot repeat it. Workflow design captures the sequence that made the session successful: how the work was framed, what artifacts were produced, which agents or skills were used, where human review occurred, and what evidence counted as done. The workflow does not need to be elaborate. It needs to be named, bounded, and reusable. A named workflow is an asset: you run it again next week, apply it to the next project, and hand it to a new teammate without re-teaching the whole system. That is leverage that compounds.

This module is intentionally compact. The goal is not to design a full engineering operating system — it is to leave with one workflow you can try next week and improve after one run.

A compact multi-agent workflow

Multi-agent workflow design should begin with work boundaries, not agent names. Each agent or subagent should own a distinct lens or phase. If two agents need the same broad context and produce overlapping output, the workflow is probably not decomposed well.

# Workflow

Name: Evidence-first bug fix

Trigger:
A bug report has enough detail to reproduce or investigate.

Artifacts:
1. Explore findings
2. Compact Implementation Plan
3. Implementation diff
4. Review

Roles:
- Explorer subagent: identify relevant files, history, and evidence sources
- Planner: compress evidence into Implementation Plan
- Implementer: make bounded code changes from the brief
- Reviewer: compare diff against brief and produce findings

Gates:
- Do not implement until facts, assumptions, and open questions are separated.
- Do not review until acceptance criteria are explicit.
- Do not hand off until verification evidence and residual risk are written.

Stop condition:
The change is PR-ready or blocked by a named open question.

Handoffs

A handoff is where one operator’s output becomes another operator’s input. In Claude Code workflows, handoffs should be artifact-based. The explorer hands off an Explore findings. The planner hands off an Implementation Plan. The implementer hands off a diff plus notes. The reviewer hands off findings and verification evidence. If a handoff requires the next operator to read the entire chat transcript, the handoff failed.

Handoff | Input | Output | Quality bar
Investigation → Planning | Evidence, traces, history, open questions | Implementation Plan | Facts and assumptions are separated
Planning → Implementation | Implementation Plan and context map | Bounded diff | Non-goals and acceptance criteria are respected
Implementation → Review | Diff and brief | Findings or approval with residual risk | Review is evidence-based
Review → Handoff | Findings, fixes, verification commands | PR-ready packet | Reviewer can decide without chat history

Gates and stop conditions

Gates prevent premature motion. Stop conditions prevent infinite motion. A gate says what must be true before the workflow can advance. A stop condition says when the workflow is complete, blocked, or unsafe to continue. Claude workflows need both because models tend to continue helping unless told what “done” means.

Good gates are observable. “Make sure the plan is good” is not a gate. “The brief includes goal, non-goals, facts, assumptions, open questions, context map, work packages, and acceptance criteria” is a gate. Good stop conditions are explicit. “Continue until fixed” is vague. “Stop when the Review shows the acceptance criteria pass, or when an open question blocks safe implementation” is actionable.
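
One way to test whether a gate is observable is to ask whether it could be written as a function. The sketch below does that for the two examples in the paragraph above. The section names come from the gate wording; the `Brief` shape, the function names, and the three decision labels are hypothetical.

```typescript
// Section names are taken from the gate wording above; the Brief shape is a hypothetical illustration.
const requiredSections = [
  "goal",
  "nonGoals",
  "facts",
  "assumptions",
  "openQuestions",
  "contextMap",
  "workPackages",
  "acceptanceCriteria",
] as const;

type Section = (typeof requiredSections)[number];
type Brief = Partial<Record<Section, string[]>>;

// Gate: passes only when every required section has at least one entry.
// An explicit "none" entry counts as present; an absent section does not.
function planGate(brief: Brief): { passed: boolean; missing: Section[] } {
  const missing = requiredSections.filter((section) => (brief[section] ?? []).length === 0);
  return { passed: missing.length === 0, missing };
}

type Decision = "continue" | "stop: PR-ready" | "stop: blocked";

// Stop condition: a blocking open question wins over everything else.
function stopDecision(acceptanceCriteriaPass: boolean, blockingQuestions: string[]): Decision {
  if (blockingQuestions.length > 0) return "stop: blocked";
  return acceptanceCriteriaPass ? "stop: PR-ready" : "continue";
}

const draft: Brief = {
  goal: ["Reject reuse of a rotated refresh token"],
  facts: ["Old token still resolves in session-store lookup"],
  acceptanceCriteria: ["Old refresh token reuse returns 401"],
};

const gate = planGate(draft);
// gate.passed is false; gate.missing names the five sections still to write.
```

If a gate cannot be expressed this way, that is a hint it depends on someone's judgment and needs rewording.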

Checkpoint: Look at each gate in your workflow and ask whether a new teammate could tell, without you, whether it has been met. If a gate needs your judgment to evaluate, it is a preference, not a gate — rewrite it as something observable.

One lightweight Workflow retro

The improvement loop should be small enough that the team actually uses it. Pick one Workflow retro that can be completed after a workflow run in five minutes. The Workflow retro should measure the workflow, not the model’s personality. It should ask whether artifacts were reusable, whether evidence was sufficient, whether context was controlled, whether review caught issues, and what should change next run.

# Workflow Retro

Workflow name:
Date:
Task:

1. Was the Implementation Plan usable without reopening the original conversation? 0 / 1 / 2
2. Did the Explore findings separate evidence from inference? 0 / 1 / 2
3. Did the implementation stay inside the stated scope? 0 / 1 / 2
4. Did verification include more than passing tests? 0 / 1 / 2
5. Was residual risk explicit? 0 / 1 / 2

One thing to keep:
One thing to change next run:
One artifact or rule to update:

What to improve on the next run

Do not try to improve everything after the first run. Choose one improvement. Maybe the context map was too vague. Maybe the review agent needs a narrower rubric. Maybe the Implementation Plan omitted non-goals. Maybe verification evidence was too thin. The Workflow retro turns that observation into a small change: update a template, add a rule, refine a skill, or adjust a gate.
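
If you record retro scores over several runs, picking the single improvement can be made mechanical. This is a sketch under assumptions: the key names are hypothetical shorthand for the five retro questions, and "lowest score wins, earliest question breaks ties" is one reasonable rule, not a prescribed one.

```typescript
type RetroScore = 0 | 1 | 2;

// Hypothetical shorthand keys for the five Workflow retro questions, in template order.
interface RetroScores {
  planUsable: RetroScore;
  evidenceSeparated: RetroScore;
  stayedInScope: RetroScore;
  verificationBeyondTests: RetroScore;
  residualRiskExplicit: RetroScore;
}

const questionOrder: (keyof RetroScores)[] = [
  "planUsable",
  "evidenceSeparated",
  "stayedInScope",
  "verificationBeyondTests",
  "residualRiskExplicit",
];

// Sum of all five answers; the maximum is 10.
function totalScore(scores: RetroScores): number {
  return questionOrder.reduce<number>((sum, key) => sum + scores[key], 0);
}

// Picks exactly one question to improve: the lowest score, ties going to the earliest question.
function nextImprovement(scores: RetroScores): keyof RetroScores {
  let weakest: keyof RetroScores = "planUsable";
  for (const key of questionOrder) {
    if (scores[key] < scores[weakest]) weakest = key;
  }
  return weakest;
}

// Hypothetical scores from one run.
const run: RetroScores = {
  planUsable: 2,
  evidenceSeparated: 1,
  stayedInScope: 2,
  verificationBeyondTests: 0,
  residualRiskExplicit: 1,
};

const total = totalScore(run); // 2 + 1 + 2 + 0 + 1 = 6
const focus = nextImprovement(run); // "verificationBeyondTests"
```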

Closing synthesis

The five modules form one loop. Craft the right primitives. Compress the plan. Recover intent with evidence. Review against the plan. Preserve the workflow that worked. That loop is small enough to teach in a day and strong enough to become the foundation for team-scale Claude Code adoption. The payoff is not abstract: less rework, faster shipping, and a library of reusable assets that make every future run cheaper and more reliable.

Exercise: Name one workflow your team will run in the next week. Fill out the Workflow, define at least three gates, and choose one Workflow retro question that will determine what you improve after the first run.

Appendix: Student Desk Reference and Repo Links

Use this as your one-page desk reference: durable facts go in CLAUDE.md; repeatable procedures become skills or workflows; broad exploration goes to subagents; PR handoffs require evidence.

Core commands and settings

Area | Use this | When it matters
Setup and health | /doctor, claude --safe-mode, CLAUDE_CODE_SAFE_MODE=1 | Validate install health or troubleshoot by disabling customizations.
Memory and context | CLAUDE.md, @docs/file.md, .claude/rules/*.md, /memory, /compact focusing on ..., /clear | Make important context durable, modular, scoped, inspectable, and cheap to carry forward.
Model selection | /model, /model opusplan, /effort low|medium|high|xhigh|max, /fast, fallbackModel, --fallback-model | Use deeper reasoning where mistakes are expensive; use faster/cheaper paths for bounded work.
Delegation | /agents, claude agents, project .claude/agents/, background subagents | Keep broad exploration isolated and return concise findings to the main session.
Review and cleanup | /code-review high, /code-review --fix, /simplify, REVIEW.md | Review against the plan, make risk explicit, and clean up before handoff.
Governance | availableModels, enforceAvailableModels, requiredMinimumVersion, Tool(specifier) permission rules, disableBundledSkills | Keep teams on approved models, versions, tools, and extension surfaces.

Concrete Examples, File Locations, and Repo Links

Use this section while practicing: each reusable Claude Code concept below includes the location where you would store it in a project and a repo link that backs the concept here.

Tools and permissions

Tools are the model action surface: reading, searching, editing, running commands, fetching context, and calling integrations. The course examples emphasize matching tool access to task risk rather than allowing everything by default.

Course backing doc

docs/TOOL-PERMISSIONS-EXAMPLES.md

Safe exploration, controlled implementation, shared repo guardrails, and workflow-specific permission posture.

Where this lives

settings.json, managed settings, permission dialogs, MCP/plugin policies, and task-specific approval choices.

Permission rules use the Tool(specifier) form, for example Bash(npm run test:*) or WebFetch(domain:example.com).
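
When auditing a long list of rules, it can help to split each one into its tool and specifier. The sketch below parses only the textual shape `Tool` or `Tool(specifier)`; it is an illustration with hypothetical names and does not reproduce how Claude Code itself matches or evaluates rules.

```typescript
interface PermissionRule {
  tool: string;
  specifier: string | null; // null when the rule names a tool with no specifier
}

// Parses the textual shape only. It makes no claim about wildcard or matching semantics.
function parsePermissionRule(rule: string): PermissionRule | null {
  const match = /^([A-Za-z][A-Za-z0-9_]*)(?:\((.*)\))?$/.exec(rule.trim());
  if (match === null) return null;
  const [, tool, specifier] = match;
  if (tool === undefined) return null;
  return { tool, specifier: specifier ?? null };
}

const bashRule = parsePermissionRule("Bash(npm run test:*)");
// { tool: "Bash", specifier: "npm run test:*" }

const fetchRule = parsePermissionRule("WebFetch(domain:example.com)");
// { tool: "WebFetch", specifier: "domain:example.com" }

const malformed = parsePermissionRule("not a rule(");
// null
```

Grouping parsed rules by `tool` gives a quick view of how wide each tool's allowance has become.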

Permission prompt pattern:
What must Claude read?
What may Claude write?
What requires approval?
What would be dangerous if Claude guessed?
What evidence is required before widening permissions?

Try it

  1. Choose one live task.
  2. Write allowed reads, allowed writes, and approval-required actions.
  3. Compare your posture to the safe exploration and controlled implementation examples.

Commands

Commands are for short, prompt-shaped, directly invoked operations. Use them when the behavior repeats but does not need a full method, bundled assets, or a specialist role.

Course backing doc

docs/PLUGINS-SKILLS-COMMANDS-AND-MODELS.md

Includes command examples such as /review-pr-risk, /summarize-failing-test, and /draft-pr-body.

Where this lives

Common project pattern: .claude/commands/<name>.md.

Built-ins appear in the slash menu, such as /model, /agents, /mcp, /permissions, and /compact.

# .claude/commands/summarize-failing-test.md
Summarize the failing test evidence in this shape:
1. Command run
2. First failing assertion or error line
3. Relevant file and line pointer
4. Likely failure category
5. Next narrow read or command

Do not propose a fix until the failure category is grounded in evidence.

Skills

Skills are for repeatable methods with structure: required inputs, context gathering, workflow, output artifact, verification checklist, and safety rules.

Skill starter template

templates/skill-template.md

The template defines the minimum sections students should fill in for a first-pass skill.

Where this lives

Common project pattern: .claude/skills/<skill-name>/SKILL.md.

Use skills for procedures and reusable artifact production, not always-on project facts.

# .claude/skills/flaky-test-investigation/SKILL.md
# Skill: flaky-test-investigation

## Purpose
Investigate a flaky test using evidence before proposing a fix.

## Required inputs
- failing command
- test name or file
- relevant CI/local output

## Workflow
1. Capture exact command and failure excerpt
2. Classify the failure mode
3. Identify dynamic evidence needed
4. Produce an Explore findings
5. Propose the next narrow action

## Outputs
A short Explore findings with evidence, inference, and next step.

Agents and subagents

Agents and subagents are bounded workers with a mission, explicit tools, output shape, and stop condition. Use them when you need separate context or specialist review.

Course backing doc

docs/PLUGINS-SKILLS-COMMANDS-AND-MODELS.md

Defines agent and subagent usage, including context isolation and bounded missions.

Where this lives

Common project pattern: .claude/agents/<agent-name>.md.

Use /agents or claude agents to manage sessions where supported.

# .claude/agents/security-reviewer.md
---
name: security-reviewer
description: Review code changes for security vulnerabilities. Use proactively.
tools: Read, Grep, Glob, Bash
model: sonnet
maxTurns: 10
---

You are a security specialist. For every code change:
1. Check for injection vulnerabilities
2. Verify input validation at system boundaries
3. Check for exposed secrets or API keys
4. Verify authentication and authorization checks

Report findings by severity. Do not edit files unless explicitly asked.

Plugins and marketplace evaluation

Plugins are a distribution abstraction. A plugin can package multiple reusable units such as commands, subagents, MCP servers, hooks, skills, or workflow assets. Evaluate a plugin as dependency surface area, not as a shortcut: everything it bundles becomes something your team has to trust, update, and review.

Course backing doc

Marketplace evaluation checklist

Use the checklist before installing or promoting a package.

Where this appears

Use /plugin flows and /plugin list where available.

Prefer local commands or skills when the behavior is still small or unstable.

Token Efficiency Throughout the Course

Student goal: Learn to spend context where it changes the outcome, not where it merely makes the session feel busy.

Token efficiency is not a separate optimization topic. It is the connective tissue across primitive design, planning, investigation, review, and workflow design — and it is money: tokens are cost and context is speed, so disciplined context means lower cost-to-serve and more work shipped per hour. The habits below should show up in every exercise.

Route narrowly

Start with the smallest useful abstraction: direct prompt, command, skill, subagent, agent, then plugin only when the reuse unit truly deserves packaging.

Search before read

Use file lists, grep results, line ranges, and exact excerpts before asking Claude to ingest full files or logs.

Delegate exploration

Broad search belongs in a subagent with isolated context and bounded output: finding, pointer, confidence, and next action.

Compact between phases

Use guided /compact after planning, investigation, or review so decisions survive while exploration noise falls away.

Module-by-module token habits

Module | Token-efficient behavior | Student artifact
Primitive Build Lab | Choose the smallest primitive that controls context and reuse. | Primitive kit with primitive, permissions, model, output, stop condition.
Planning | Add a context budget: must load, can defer, delegate, do not load, preserve. | Compact Implementation Plan with context map.
Investigation | Compress evidence into claims with source pointers instead of transcripts. | Explore findings and Request Changes.
Review | Review the diff against the plan before expanding context. | Findings-first Review.
Workflow | Put context gates between roles and agents. | Workflow with handoff limits and compaction points.

Token-efficient prompt patterns

Search first. Return only matching file paths and line numbers. Do not read full files yet.

Read only <file> lines <start>-<end>. Summarize the relevance in 3 bullets.

Delegate to a subagent. Return only: finding, evidence pointer, confidence, next action. Limit to 10 bullets.

/compact focusing on decisions made, evidence pointers, files touched, unresolved questions, and residual risk.

Review this diff against the compact plan. Return findings only with severity, file pointer, and suggested next action.

Hands-on: token budget your current task

  1. Write what Claude must load.
  2. Write what can be deferred.
  3. Write what should be delegated to a subagent.
  4. Write what should not be loaded.
  5. Write what must survive compaction.

Alignment With the 15-Minute Modules

Tone and stance: The short reader modules are practical, operational, and opinionated. They do not teach Claude Code as feature trivia. They teach reusable engineering leverage: better context, lower token burn, and better first-pass code.

Four ideas worth carrying through the whole day

Give Claude better context

Persistent memory, targeted rules, subagent delegation, and guided compaction reduce repeated instructions and keep the session focused.

Burn fewer tokens for more impact

Use RTK, pointers, search-before-read, skills, and compaction so the team spends context where it changes the outcome.

Land better code the first time

Route by model, effort, speed mode, and review mechanism. Treat /code-review, /simplify, and REVIEW.md as quality controls.

Standardize the patterns

The organizational win comes from encoding good behavior in CLAUDE.md, REVIEW.md, commands, skills, agents, hooks, and workflow artifacts.

Coverage checklist

Reader point | Student-facing follow-along action | Concrete backing
Memory layers | Decide what belongs in managed, user, project, and local memory. | CLAUDE.md, ~/.claude/CLAUDE.md, CLAUDE.local.md
Auto-memory | Write one durable correction Claude should not need to be told again. | /memory and readable markdown memory files
Path-scoped rules | Draft one rule that should load only for a file path pattern. | .claude/rules/*.md
@ imports | Plan one shared standard that should be imported instead of pasted. | @docs/coding-standards.md
Subagents | Delegate a broad search and return only findings, pointers, confidence, and next action. | .claude/agents/*.md, /agents
Custom agents | Review a named specialist with frontmatter, tools, model, max turns, and severity output. | Security reviewer example in the deck
Guided compaction | Write a /compact focusing on... prompt after a phase boundary. | /compact vs /clear
RTK and measurement | Name noisy commands worth compressing and how savings would be measured. | rtk gain, rtk gain --history, rtk discover
Pointers over full text | Replace one long file/log dump with path, line range, exact excerpt, and next command. | Search-before-read exercise
Skills | Promote one repeated prompt into a skill with inputs, workflow, outputs, verification, safety. | templates/skill-template.md
Model selection | Choose model and effort by job, not habit. | /model, /model opusplan, /effort
Fast mode | Name one situation where latency is worth the per-token tradeoff. | /fast
Review and simplify | Run correctness review before cleanup, then verify residual risk. | /code-review, /code-review --fix, /simplify, REVIEW.md
Multi-model workflow | Plan, implement, explore, review, simplify, and fast-iterate using the right control for each phase. | Workflow design module and desk reference

Token Savings Field Guide

Student goal: Use token efficiency as an engineering operating habit: search first, point precisely, delegate noisy work, compact at phase boundaries, and turn repeated work into reusable artifacts.

Token savings are not about making Claude think less. They are about keeping the context window focused on the material that changes the outcome. The tips below should show up during every module, not only during the token-efficiency portion of the day.

1. Enable RTK for noisy command output

RTK is useful when shell output is large, repetitive, or noisy. The impact compounds because command output does not just cost tokens once; it remains in the session and can be re-read on later turns.

rtk gain              # cumulative token savings this session
rtk gain --history    # per-command breakdown with savings
rtk discover          # find missed compression opportunities

Good RTK targets:
- test output
- build logs
- git status and diff noise
- package manager output
- long formatter or type-checker traces

Practice habit: If a command regularly emits hundreds or thousands of lines, compress it, script it, or summarize it before it enters the main Claude context.

2. Move repeated tasks to executable scripts

Repeated prompts are expensive because they require the user to re-describe a workflow and require Claude to re-infer what should happen. A stable script is cheaper and more reliable.

Repeated prompt pattern | Token-efficient replacement
“Run the usual checks.” | scripts/check-pr-ready.sh or npm run check:pr
“Do the full release validation we always do.” | scripts/release-validate.sh plus a skill that explains when to use it.
“Look for the standard security issues.” | A security-reviewer subagent plus a compact severity output contract.
“Please remember our migration checklist.” | A skill or command checked into the repo, with verification and safety rules.
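
#!/usr/bin/env bash
# Example: scripts/check-pr-ready.sh
# Runs the usual pre-PR checks in order and stops at the first failure.
set -euo pipefail
npm run lint
npm run typecheck
npm test
# List uncommitted or untracked files so nothing is left out of the PR.
git status --short

# Prompt Claude:
# Run scripts/check-pr-ready.sh and summarize only failures,
# evidence pointers, and the next command to run.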

3. Do not say “explore” without intention

Broad words such as “explore,” “look around,” “understand this,” or “check the repo” often cause broad reads. That may be appropriate for onboarding, but it should be intentional.

Token-heavy prompt | Better prompt
Explore the auth system. | Find the entry points for JWT validation. Return only file paths, line numbers, and a one-sentence role for each. Do not read full files yet.
Look through the tests. | Search for tests that mention refresh tokens. Return matching files and test names only.
Understand why this broke. | Use git history, failing test output, and the changed files. Return claims with evidence pointers and confidence.
Review this whole PR. | Review the diff against the Implementation Plan. Return blocking findings first, each with severity and file pointer.

4. Use permission posture as a productivity lever, not a safety shortcut

Permissions shape token efficiency because unnecessary approval prompts interrupt loops, but broad permissions can create risk. Start with read-heavy, write-light permissions, then widen only when the task and verification path are clear.

{
  "permissions": {
    "defaultMode": "bypassPermissions"
  }
}

Important: Treat bypassPermissions as a sandbox-only accelerator for trusted, disposable training repos or isolated environments. Do not use it as the default posture in production-adjacent repos, repos with secrets, deployment scripts, CI mutation, migrations, or broad network access. Prefer auto mode or scoped allow rules when available.

5. Enable and curate memory

Memory saves tokens when it prevents repeated corrections. It wastes tokens when it becomes stale, vague, or bloated.

6. Share context and investigation across tasks

The cheapest context is the context already summarized well. Do not make the next task rediscover what the last task proved.

Reusable artifact | What it should preserve | Why it saves tokens
Explore findings | Claim, evidence pointer, confidence, open question. | Future tasks read findings instead of raw history.
Request Changes | What failed, exact evidence, correction request, desired output. | Prevents “that didn’t work” follow-up churn.
Review (findings) | Findings, severity, diff pointer, residual risk. | Review stays anchored to facts instead of re-reading everything.
Review (verification evidence) | Commands run, outputs summarized, unverified areas. | Next operator knows what is proven and what is not.
Workflow handoff | Role, input artifact, output artifact, gate, stop condition. | Agents and humans can continue without reopening the whole problem.

7. Bound subagent output

Subagents are powerful because they can absorb noisy exploration without polluting the main context. That benefit disappears if the subagent returns a transcript-sized report.

Delegate this investigation to a subagent.

Scope: auth middleware and token refresh only.
Tools: Read, Grep, Glob, Bash for git history.
Do not edit files.

Return exactly:
1. Findings, max 8 bullets
2. Evidence pointers: file:line or commit SHA
3. Confidence: high/medium/low
4. Recommended next action
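
If subagent reports are collected as structured data, the output contract above can be checked before the report enters the main context. The sketch below is an assumption-laden illustration: the `SubagentReport` shape, the function name, and the pointer pattern (a `file:line` pair or a 7 to 40 character hex SHA) are hypothetical, and the line numbers in the example are invented.

```typescript
type Confidence = "high" | "medium" | "low";

// Hypothetical report shape mirroring the four items the prompt asks for.
interface SubagentReport {
  findings: string[];
  evidencePointers: string[];
  confidence: Confidence;
  nextAction: string;
}

// Accepts file:line (for example src/auth/middleware.ts:42) or a 7-40 character hex commit SHA.
const pointerPattern = /^(?:[^\s:]+:\d+|[0-9a-f]{7,40})$/;

// Returns a list of contract violations; an empty list means the report is within bounds.
function contractViolations(report: SubagentReport, maxFindings = 8): string[] {
  const violations: string[] = [];
  if (report.findings.length > maxFindings) {
    violations.push(`too many findings: ${report.findings.length} > ${maxFindings}`);
  }
  for (const pointer of report.evidencePointers) {
    if (!pointerPattern.test(pointer)) {
      violations.push(`not a file:line or commit SHA: ${pointer}`);
    }
  }
  if (report.nextAction.trim() === "") {
    violations.push("missing recommended next action");
  }
  return violations;
}

const report: SubagentReport = {
  findings: ["Old refresh token still resolves after rotation."],
  evidencePointers: ["src/auth/session-store.ts:88", "a1b2c3d", "see the logs"], // invented pointers
  confidence: "medium",
  nextAction: "Inspect token invalidation in session-store.",
};

const violations = contractViolations(report);
// One violation: "see the logs" is not a usable evidence pointer.
```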

8. Prompt with a token budget

Return paths only. Do not read files yet.

Read 40 lines around the match, not the whole file.

Summarize logs into failures, evidence, and next command.

Cap the answer at 10 bullets unless a blocking risk requires more.

/compact focusing on decisions made, evidence pointers, files touched,
unresolved questions, and residual risk.
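
The first two budgets above (paths only, then 40 lines around the match) describe a two-step read. A minimal sketch of that idea, assuming file content is already available as a string; the function names and the sample content are hypothetical:

```typescript
interface Match {
  line: number; // 1-based line number
  text: string;
}

// Search step: locate matching lines without returning any surrounding content.
function findMatches(content: string, needle: string): Match[] {
  return content
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter((match) => match.text.includes(needle));
}

// Read step: a bounded window of at most `total` lines around the match, clamped to the file.
function windowAround(
  content: string,
  line: number,
  total = 40,
): { start: number; end: number; excerpt: string } {
  const lines = content.split("\n");
  const half = Math.floor(total / 2);
  const start = Math.max(1, line - half);
  const end = Math.min(lines.length, start + total - 1);
  return { start, end, excerpt: lines.slice(start - 1, end).join("\n") };
}

// Hypothetical 100-line file with a single interesting line.
const sampleLines: string[] = [];
for (let i = 1; i <= 100; i += 1) {
  sampleLines.push(i === 50 ? "function rotateRefreshToken() {" : `// filler line ${i}`);
}
const sample = sampleLines.join("\n");

const hits = findMatches(sample, "rotateRefreshToken"); // one hit, on line 50
const excerpt = windowAround(sample, 50); // lines 30 through 69
```

Only the 40-line excerpt would be handed to the model; the other 60 lines never enter the context.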

9. End-of-task token checklist

Before asking Claude to work | Before continuing the session
Have I named the artifact I want? | Did I save decisions into an artifact?
Have I bounded files, paths, commands, and output length? | Did I compress noisy evidence into pointers?
Have I chosen command, skill, script, or subagent? | Did I update memory, CLAUDE.md, or a skill if needed?
Have I set the right permission posture? | Should I use guided /compact before the next task?

Hands-on: make one task 50% cheaper

  1. Pick one broad prompt from your current workflow.
  2. Rewrite it with a target, search boundary, output contract, and token budget.
  3. Move any repeated command sequence into a script or skill.
  4. Decide whether noisy discovery belongs in a subagent.
  5. Write the compaction prompt you will use after the task.