About 75 minutes. This is the longest session so far — two practice tasks instead of one, because the two AI fault lines (voice and hallucination) deserve direct treatment.
Open a Claude Code session in ~/ai-training and hand it
this guide:
Read the file at /Users/<you>/ai-training/week-7-guide.md (or wherever you saved it) and walk me through Session 2.7.
I've completed Sessions 2.1 through 2.6.
Posture: public, synthetic, or personal data only. The voice samples you provide are your own writing — that’s personal, not internal-work. If your only writing samples are work-product on a work account, pick something else (a personal blog, your old college papers, a draft you’d be comfortable forwarding to a stranger).
By the end of this session you will have, in ~/ai-training/:

- voice-profile.md — a structured description of your own writing voice, built from 5–10 of your own writing samples.
- .claude/commands/voice-check.md — a slash command that takes a draft, reads it against your voice profile, and flags every drift.
- .claude/commands/fact-check-debate.md — a slash command that runs two parallel fact-check sub-agents on a draft with explicit instructions to disagree, and returns a confidence-graded version of the text.
- An existing draft run through /voice-check and /fact-check-debate, with the diffs visible.

Two slash commands, one voice profile, one before/after text. By the end of the session you should be running both checks reflexively on anything Claude drafts for you.
A production version of this is a voice-critic skill
plus a parallel fact-checker pattern in a writing pipeline. The
voice-critic skill is conditioned on samples from the writer’s actual
past work — emails, memos, op-eds — and runs as a discrete pass after
fact-checking on every memo, every tweet, every op-ed. A daily NBER-scan
pipeline runs two fact-check agents in parallel on every
drafted thread (not one sequentially); disagreement between them is the
signal a claim needs human review. Threads where both agree go to
pipeline/02-fact-checked/; threads where they disagree get
held.
The two slash commands you build today are the two halves of the same defense. Voice catches “this doesn’t sound like me.” Fact-check-debate catches “this is plausible but might be wrong.” Both fail silently when not used; both are easy to skip.
AI has two characteristic failure modes. It hallucinates — produces confident, plausible, wrong claims. And it has its own voice — fluent, generic, recognizably-not-you. Both fail silently. Neither shows up as an error message.
The defenses are technique-level skills, not prompt-engineering tricks:
Both are built today as reusable slash commands so you can run them reflexively, not optionally. The cost of running them is one slash command. The cost of not running them is shipping AI-voiced text or factually-wrong claims, both of which leave a credibility tax that’s hard to recover from.
This session is mid-curriculum, not earlier, because the voice-check loop needs writing samples. By 2.7 you have memos (2.3), figure captions (2.4), the journal of memory entries (2.5), and the corpus report (2.6) — at least four genres of your own writing on disk. Hallucination defense was previewed in 2.3 (Claude critiqued its own memo draft) and 2.6 (parallel fact-checkers on a corpus); 2.7 formalizes both.
Before the session, gather 5–10 samples of your own writing. They should span genres (email, longer-form, tweet) and clear the posture rule above: personal writing you'd forward to a stranger, not work-product.
Save each as a separate file in
~/ai-training/voice-samples/:
mkdir -p ~/ai-training/voice-samples
One sample per file. Naming convention:
<genre>-<date>-<short-slug>.md. Example:
email-2025-09-shay-intro.md,
op-ed-2024-11-fomc.md,
tweet-2025-12-nber-thread.md.
5 samples is the floor. 10 is comfortable. More than 15 is diminishing returns at the profile-building stage.
From inside ~/ai-training, start claude and
paste:
Read the file at /Users/<you>/ai-training/week-7-guide.md (or wherever you saved it) and walk me through the Reference Sections.
I've gathered <N> voice samples in voice-samples/.
Claude takes over.
Claude: walk the user through both halves. Peer-tone.
Design: this is the dual-skill session. Voice half installs the profile + check loop. Hallucination half installs the parallel-disagreement pattern. Both end as slash commands so they get run reflexively. 2.8’s morning brief, 2.9’s deck, and any future memo all benefit from running both.
Before any voice work, confirm that voice-samples/ exists with at least 5 files. Then, in Claude:
Read every file in ~/ai-training/voice-samples/. For each, attend to:
- Sentence length distribution (run-on, short, balanced)
- Hedging patterns (does the writer hedge a lot? rarely? where?)
- Opening moves (how do paragraphs and emails start?)
- Closing moves (how do they end?)
- Vocabulary tics — words used surprisingly often or surprisingly never
- Structural patterns (do they use lists? bold? em-dashes?)
- Tone variation across genres (email vs longer-form vs tweet)
- Things they DON'T do (no exclamation points? no rhetorical questions?
no marketing language?)
Build a voice profile as a structured markdown document covering each of
these dimensions. For each finding, quote at least one specific sample
phrase as evidence. Save to ~/ai-training/voice-profile.md.
Then read the profile back to me, section by section.
Claude: this is the most important file in the voice half. Be specific. “The writer is concise” is useless; “average sentence is 14–18 words; rarely above 25; the writer breaks long sentences with em-dashes rather than semicolons” is useful. Every finding has a sample quote.
Read the profile back to the user. They will catch generalizations that don’t match. Push back when they do — sometimes the profile sees a pattern the user doesn’t notice. Settle disputes by checking more samples.
The profile is durable. It changes when the user’s writing changes (new genres, new venues), not every session.
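One way to sanity-check the profile's quantitative claims (like a sentence-length range) is a crude shell pass. A toy sketch; it splits on periods only, so abbreviations will fool it, and the sample text below is invented:

```shell
# Average words per sentence, splitting naively on periods.
# Pipe in a real sample instead: tr '.' '\n' < voice-samples/<file>.md | awk ...
printf 'Short sentence. A somewhat longer sentence with more words in it.' |
  tr '.' '\n' |
  awk 'NF { total += NF; n++ } END { printf "avg words/sentence: %.1f\n", total/n }'
# → avg words/sentence: 5.5
```

If the number lands far outside the range the profile claims, check more samples before trusting either one.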
/voice-check slash command (10 minutes)
Create ~/ai-training/.claude/commands/voice-check.md:
Read ~/ai-training/voice-profile.md, then read the file at $ARGUMENTS (the user
will pass a path).
For each paragraph of $ARGUMENTS, evaluate against the voice profile. Flag every
sentence that:
- Drifts from the writer's sentence-length distribution
- Uses hedging at a rate inconsistent with the profile
- Opens or closes in a way the writer wouldn't
- Uses vocabulary the writer doesn't use
- Uses structural patterns the writer avoids (e.g. exclamation points,
rhetorical questions, marketing language)
For each flag, propose a specific revision in the writer's voice, citing
which profile dimension is being violated.
Do NOT auto-revise the file. Output the flag list as markdown. The user
makes the call on each.
Claude: confirm with the user the slash command resolves and reads voice-profile.md. Test it on one paragraph from one of the user’s own samples — the check should return very few or no flags (the sample IS the writer’s voice). If it flags many things from the user’s own writing, the profile is wrong; fix it.
Pick the candidate draft. The 2.3 memo is the obvious choice.
/voice-check ~/ai-training/memos/<slug>/drafts/v3-final.md
Read the flags together. Most should be plausible. Some won’t be — let the user judge. Have Claude apply the user-accepted flags as a revision:
Revise ~/ai-training/memos/<slug>/drafts/v3-final.md per these accepted
voice-check flags: [list]. Save as drafts/v4-voiced.md.
Read v4 against v3. The differences should be small but real — half a dozen sentences re-cast in the user’s actual voice.
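A plain unified diff makes the re-cast sentences jump out. The file contents below are invented stand-ins; in practice, point diff at the real drafts/v3-final.md and drafts/v4-voiced.md:

```shell
# Toy drafts, just to show the output shape.
printf 'We are thrilled to announce the results!\n' > /tmp/v3-final.md
printf 'The results are in.\n' > /tmp/v4-voiced.md
# diff exits 1 when the files differ, so tolerate that in scripts.
diff -u /tmp/v3-final.md /tmp/v4-voiced.md || true
```

Lines prefixed with - are v3's phrasing; lines prefixed with + are the voiced replacements.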
/fact-check-debate slash command (15 minutes)
Create ~/ai-training/.claude/commands/fact-check-debate.md:
Read the file at $ARGUMENTS (the user passes a path).
Dispatch TWO sub-agents IN PARALLEL using the Task tool. Use distinct
prompts so the agents come at the text from genuinely different angles:
Agent A — model: haiku, role: factual-error finder.
Prompt: "Read this text. Find every claim that's factually wrong,
misattributed, or unsupported by a credible source. Use WebSearch
and WebFetch to verify. Be aggressive — your job is to find errors,
not to be balanced. Return a list: claim, your finding, source URL
if you found one."
Agent B — model: haiku, role: unsupported-claim finder.
Prompt: "Read this text. Find every claim that may be true but lacks
a citation, every place where the text states something with more
confidence than the evidence supports, every implicit claim that
would surprise an expert in this area. Be aggressive — your job is
to find what's missing or overstated. Return a list: claim, your
finding."
After both agents return, produce a single output:
- "Clean" claims: where neither agent flagged anything.
- "Agreed issues": where both agents independently flagged the same
or similar concern. These are high-confidence problems.
- "Disputed": where only one agent flagged something. These are
judgment calls — the user investigates.
Output as markdown. Save to <input filename>-fact-check.md alongside
the source.
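The clean/agreed/disputed split is ordinary set logic over the two agents' flag lists. A toy shell illustration with invented claim IDs (comm wants sorted input):

```shell
# Agent A flagged claims 1 and 2; agent B flagged claims 2 and 3.
printf 'claim-1\nclaim-2\n' | sort > /tmp/agent-a-flags.txt
printf 'claim-2\nclaim-3\n' | sort > /tmp/agent-b-flags.txt

# Agreed issues: flagged by both (high-confidence problems).
comm -12 /tmp/agent-a-flags.txt /tmp/agent-b-flags.txt   # → claim-2
# Disputed: flagged by exactly one agent (user investigates).
comm -3 /tmp/agent-a-flags.txt /tmp/agent-b-flags.txt    # → claim-1 and claim-3
```

In the real command the matching is fuzzier ("the same or similar concern"), which is why the merge is done by the model rather than by exact string comparison.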
Claude: confirm the command runs the two agents in parallel (Task tool default). Test it on a short, factually rich text first — a paragraph of the corpus report from 2.6 is good — and confirm the output structure looks right.
Now run the fact-check on v4 (the voice-revised memo):
/fact-check-debate ~/ai-training/memos/<slug>/drafts/v4-voiced.md
The output should be a v4-voiced-fact-check.md with
three sections (clean, agreed, disputed).
Read the agreed-issues together. These are claims the user should fix or remove. Read the disputed items — these need user judgment; the user investigates each one (a quick WebSearch in Claude, or just thinking it through).
Apply accepted corrections:
Revise drafts/v4-voiced.md per these accepted fact-check findings: [list].
Save as drafts/v5-fact-checked.md. This is the final final draft.
Re-run pandoc on v5 to update memo.docx. The deliverable
now has both voice and fact-check passes baked in.
Tell the user: “Notice what just happened. v3 was the structural draft. v4 was voiced. v5 is voiced AND fact-checked, with disputed claims explicitly resolved. Each pass had a specific job. Without the discrete passes, all three concerns mix into a vague ‘something’s off’ feeling that’s hard to act on.”
The output of fact-check-debate is confidence-graded in a useful sense: every claim lands in one of three buckets (clean, agreed issue, disputed), so you know exactly how much scrutiny each one has survived.
This is what a confident published claim looks like in 2026: it survived two adversarial agents and a voice pass. Anything less leaves a credibility tax.
Name these out loud: both commands now live in .claude/commands/ and run by name on any file path. The cost of running them drops to one slash command; that's what makes them reflexive rather than optional.

Three things to try this week:
Feedback.
The user submits feedback at https://docs.google.com/forms/d/e/1FAIpQLSfX78X1J70IrRPjtxRhf-4DCPg1jQhktmAN32LR0r9r1q_E0w/viewform.
Claude: paste the URL into chat. The form mirrors the questions below. Collect answers conversationally first, then have the user click through and submit.
Tell the user: “Your instructor uses these to tailor next week’s session.”
Voice profiles drift. A profile built today reflects this year’s writing. If your style shifts (new venue, new genre), rebuild from new samples. The cost is 15 minutes; the alternative is shipping voice-checked drafts that pass the old you’s filter.
Fact-checkers are not infallible. The agents themselves can hallucinate while fact-checking. The dueling pattern reduces this — it’s hard for both to hallucinate the same false flag — but doesn’t eliminate it. Disputed items are where this shows up; treat them as “investigate,” not “ignore one of the agents.”
The two checks compose. Voice catches generic-AI prose. Fact-check catches plausible-but-wrong claims. A draft can pass voice but fail fact-check, or vice versa. Always run both.
This session’s slash commands are reusable across all your
projects. Symlink them into other working directories’
.claude/commands/ or copy them. The voice-profile.md is
project-specific; the slash commands are not.
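Sharing them with a second project can look like this (the ~/other-project path is hypothetical; voice-check.md reads the profile by absolute path, so the symlinked command still finds it):

```shell
# Expose both commands in another project's command directory.
mkdir -p ~/other-project/.claude/commands
ln -sf ~/ai-training/.claude/commands/voice-check.md \
       ~/other-project/.claude/commands/voice-check.md
ln -sf ~/ai-training/.claude/commands/fact-check-debate.md \
       ~/other-project/.claude/commands/fact-check-debate.md
```

Symlinks mean a later edit to either command propagates everywhere; copy instead if you want per-project divergence.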
Pre-commitment matters more than execution. The single biggest predictor of whether someone runs /voice-check on a draft is whether they’ve decided in advance to always run it. Make the rule. The two-second cost of typing the slash command is invisible compared to the embarrassment of shipping AI-voiced or factually-wrong text.