Session 2.4: Make a plot from a spreadsheet — and revise it

About 60 minutes. Open a Claude Code session in ~/ai-training and hand it this guide:

Read the file at /Users/<you>/ai-training/week-4-guide.md (or wherever you saved it) and walk me through Session 2.4.
I've completed Sessions 2.1 through 2.3.

Posture: public, synthetic, or personal data only. Today’s data is FRED (Federal Reserve Economic Data) — public, downloadable, well-documented. If you’d rather use a different public dataset (Our World in Data, BLS, World Bank, a public survey), substitute freely; the workflow is the same.


Practice task

By the end of this session you will have, inside ~/ai-training/exhibits/<topic-slug>/:

  1. data/raw.csv — a public CSV pulled from FRED, with a known SHA-256 hash recorded in data/manifest.txt.
  2. make_plot.py — a small matplotlib script Claude wrote with you that turns the CSV into a figure.
  3. figure-v1.png — the first cut.
  4. figure-v2.png — second cut, with NBER recession bars added.
  5. figure-v3.png — third cut, with your style applied (palette, font, aspect ratio).
  6. New lines in ~/ai-training/MEMORY.md capturing your figure preferences: at least aspect ratio, font, and palette.

Three figures, one script, one hashed dataset, and at least three remembered preferences. The figures are the visible output; the memory entries and the script are what make the next plot 5 minutes instead of 45.

A production version of this is a figures session that pulled survey response data from an Excel file and produced exhibits constrained by preferences from a previous session: 4:3 aspect ratio so two figures fit on a slide, Times New Roman 12pt black labels, no gridlines, single-select questions plotted as pie charts and multi-select as horizontal bar charts. The preferences had been written to MEMORY.md after a prior session, so the figures came out right on the first try — no re-explaining the house style. By the end of this session you’ll have written down enough of your own preferences that the next figure starts where this one ended.


Why iterate explicitly on figures

The trap in figure-making: you produce one chart that’s almost-right, eyeball it, ship it. Three chart sessions later you’ve spent four hours on chart aesthetics across three documents and no two figures look the same.

The unlock is the same as Session 2.3’s memo iteration, applied to images:

  1. Versioned figures. figure-v1.png, figure-v2.png, figure-v3.png. The intermediate cuts are cheap — saving them is free; recovering one when v3 went wrong is invaluable.
  2. A reusable script. make_plot.py takes the CSV and produces the figure in one command. Re-running with new data is one line.
  3. Style as durable memory. Once you decide “4:3, Times New Roman 12pt, no gridlines, this palette,” that decision goes in MEMORY.md once. Every future plot starts there.

The figure is the artifact. The script and the memory entries are what make the next figure cheap.
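
The versioning convention above (figure-v1.png, figure-v2.png, ...) can be sketched as a small helper. This is only an illustration: next_version_path is a hypothetical name and is not part of the make_plot.py built later in this session.

```python
import re
from pathlib import Path

def next_version_path(folder, stem="figure", ext=".png"):
    """Return folder/stem-vN.ext, where N is one past the highest version on disk."""
    pattern = re.compile(rf"^{re.escape(stem)}-v(\d+){re.escape(ext)}$")
    versions = [
        int(m.group(1))
        for p in Path(folder).iterdir()
        if (m := pattern.match(p.name))
    ]
    # With no versioned files yet, max(..., default=0) makes the first one v1.
    return Path(folder) / f"{stem}-v{max(versions, default=0) + 1}{ext}"
```

Saving each cut under a fresh name means an earlier version is never overwritten by accident.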


Step 1: Set up matplotlib (3 minutes)

If matplotlib and pandas aren’t installed:

pip3 install --user pandas matplotlib

Verify:

python3 -c "import pandas, matplotlib; print(pandas.__version__, matplotlib.__version__)"

If pip3 install --user puts binaries somewhere not on PATH, that’s fine for libraries — python3 -c "import pandas" is the only check that matters.


Step 2: Hand Claude this guide

From inside ~/ai-training, start claude and paste:

Read the file at /Users/<you>/ai-training/week-4-guide.md (or wherever you saved it) and walk me through the Reference Sections.
matplotlib and pandas are installed.

Claude takes over.


Reference Sections

Claude: walk the user through pulling a CSV, verifying it, plotting it, and iterating. Peer-tone. Two rules:

  1. Always verify the data before plotting. Hash the CSV, record row counts, check the date range. The “verify the data” step is non-negotiable — figures built on wrong data are worse than no figures.
  2. Write the script as a real file, not as one-shot Python in a Claude tool call. The script is part of the deliverable. Re-running has to be one command.

Design: this session installs three habits — verify-then-plot, script-as-deliverable, preferences-to-MEMORY. Sessions 2.6 (corpus scoring) and 2.9 (PowerPoint deck) reuse the figures produced here. The 4:3 default is set today specifically because 2.9 pairs two figures per slide.


Confirm the setup

Before any data work:

  1. python3 -c "import pandas, matplotlib" runs cleanly.
  2. ~/ai-training/MEMORY.md exists. If it doesn’t, create it now with one section heading ## Figure preferences and an empty body — you’ll write into it during Step E.
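
The second check can be sketched in a few lines of Python. The function name ensure_memory_file is hypothetical; the only behavior assumed is what the step above describes: create the file with the heading if it is missing, and leave an existing file alone.

```python
from pathlib import Path

HEADING = "## Figure preferences"

def ensure_memory_file(path):
    """Create the memory file with an empty Figure preferences section if it is missing."""
    path = Path(path).expanduser()
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(HEADING + "\n\n", encoding="utf-8")
    return path
```

Usage would be ensure_memory_file("~/ai-training/MEMORY.md").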

Step A — Pull the data (5 minutes)

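The default dataset: U.S. unemployment rate, last 10 years, monthly, from FRED. Series ID UNRATE. Public, no auth, downloadable as CSV directly. The download URL below returns the series’ full history (back to January 1948), so the 10-year window has to be applied when plotting, not when downloading.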

In Claude:

Create the directory exhibits/unemployment-rate/data/. Download the FRED
series UNRATE as CSV from
https://fred.stlouisfed.org/graph/fredgraph.csv?id=UNRATE
to exhibits/unemployment-rate/data/raw.csv. After downloading, record:

  - SHA-256 hash of the file
  - row count
  - the first and last date in the file
  - the column names

to exhibits/unemployment-rate/data/manifest.txt.

Claude: use a shell tool to curl the URL, then a small Python snippet to compute the hash and read the date range. Don’t trust the file blindly — read the first 5 lines so the user can see the format. Confirm with the user that the data looks right (date column, value column) before moving on.
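
A minimal sketch of the "small Python snippet" for the manifest, using only the standard library. It assumes the first column holds ISO-formatted dates (YYYY-MM-DD, which sort correctly as strings) and that the file has a header row; describe_csv and write_manifest are illustrative names, and the key: value layout of manifest.txt is an assumption, not a format this guide prescribes.

```python
import csv
import hashlib
from pathlib import Path

def describe_csv(path):
    """Return the hash, row count, date range, and column names of a CSV file."""
    path = Path(path)
    sha256 = hashlib.sha256(path.read_bytes()).hexdigest()
    with path.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        columns = next(reader)                 # header row
        rows = [row for row in reader if row]  # skip blank lines
    dates = [row[0] for row in rows]           # assumes ISO dates in column 0
    return {
        "sha256": sha256,
        "rows": len(rows),
        "first_date": min(dates),
        "last_date": max(dates),
        "columns": columns,
    }

def write_manifest(csv_path, manifest_path):
    """Write one 'key: value' line per fact and return the facts as a dict."""
    info = describe_csv(csv_path)
    lines = [f"{key}: {value}" for key, value in info.items()]
    Path(manifest_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return info
```

Printing the returned dict in chat gives the user the same facts that land in the manifest, so the "does this look right?" confirmation and the record stay in sync.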

If the user wants a different dataset (BLS unemployment by state, OWID life-expectancy, World Bank GDP, a public survey CSV), substitute. The pattern is identical.
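
For another FRED series, the URL in the prompt above suggests a simple pattern: the same endpoint with a different id. The sketch below assumes that pattern holds for other series IDs, which is worth confirming by reading the first lines of whatever comes back. fred_csv_url and download_series are hypothetical helper names.

```python
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlretrieve

FRED_CSV_ENDPOINT = "https://fred.stlouisfed.org/graph/fredgraph.csv"

def fred_csv_url(series_id):
    """Build the CSV download URL for a FRED series ID (assumed pattern)."""
    return f"{FRED_CSV_ENDPOINT}?{urlencode({'id': series_id})}"

def download_series(series_id, out_path):
    """Download the series CSV to out_path, creating parent folders as needed."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(fred_csv_url(series_id), str(out_path))
    return out_path
```

Whatever the source, the verification step stays the same: hash, row count, date range, column names.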


Step B — Write make_plot.py (10 minutes)

Build the script with the user, in exhibits/unemployment-rate/make_plot.py:

import pandas as pd
import matplotlib.pyplot as plt
from pathlib import Path

HERE = Path(__file__).parent

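def load(years=10):
    # First column is the date, second is the value; rename both so the
    # rest of the script does not depend on FRED's header names.
    df = pd.read_csv(HERE / "data" / "raw.csv", parse_dates=[0])
    df.columns = ["date", "rate"]
    # Force the value column to numbers and drop missing observations.
    df["rate"] = pd.to_numeric(df["rate"], errors="coerce")
    df = df.dropna(subset=["rate"])
    # The download holds the full history; keep only the last `years` years.
    cutoff = df["date"].max() - pd.DateOffset(years=years)
    return df[df["date"] >= cutoff]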

def plot_v1(df, out):
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(df["date"], df["rate"])
    ax.set_xlabel("Date")
    ax.set_ylabel("Unemployment rate (%)")
    ax.set_title("U.S. Unemployment Rate")
    fig.tight_layout()
    fig.savefig(out, dpi=150)

if __name__ == "__main__":
    df = load()
    plot_v1(df, HERE / "figure-v1.png")

Run:

python3 exhibits/unemployment-rate/make_plot.py
open exhibits/unemployment-rate/figure-v1.png

Read the figure together. It’s serviceable but plain. Common things to notice: no recession context, default fonts and colors, and a full box of spines around the plot. Check the 4:3 aspect ratio against where the figure will be used, then make a list of what to fix.


Step C — Iterate to v2 (recession bars) (10 minutes)

NBER maintains the official U.S. recession dating. The public list of dates is at https://www.nber.org/research/business-cycle-dating. For the last 10 years, the only recession is February 2020 (peak) → April 2020 (trough), the COVID recession.

Add a plot_v2 function:

RECESSIONS = [
    ("2020-02-01", "2020-04-01"),
]

def plot_v2(df, out):
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.plot(df["date"], df["rate"], color="#1f4e79", linewidth=1.5)
    for start, end in RECESSIONS:
        ax.axvspan(pd.Timestamp(start), pd.Timestamp(end), color="#cccccc", alpha=0.5)
    ax.set_xlabel("Date")
    ax.set_ylabel("Unemployment rate (%)")
    ax.set_title("U.S. Unemployment Rate, with NBER recession shading")
    fig.tight_layout()
    fig.savefig(out, dpi=150)

Update the __main__:

plot_v2(df, HERE / "figure-v2.png")

Re-run, open figure-v2.png. The grey band is the recession.

Tell the user: “The recession bar is one tiny annotation that turns ‘a line going up’ into ‘a line going up because of a known event.’ Most figures gain a lot from one piece of structural context.”
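
If recession dates arrive as a monthly 0/1 indicator column rather than a hand-typed list (FRED publishes NBER-based indicator series of this kind), the (start, end) pairs for RECESSIONS can be derived instead of typed. A minimal sketch with a hypothetical helper name; note that an indicator series may date the start and end months differently from the peak and trough months typed in above, so compare the result against the NBER list before relying on it.

```python
def spans_from_indicator(dates, flags):
    """Collapse a 0/1 indicator into (start, end) pairs, one per run of 1s."""
    spans = []
    start = None  # first date of the run currently open, if any
    prev = None   # most recent date seen
    for date, flag in zip(dates, flags):
        if flag and start is None:
            start = date                  # a run of 1s begins here
        elif not flag and start is not None:
            spans.append((start, prev))   # the run ended on the previous date
            start = None
        prev = date
    if start is not None:                 # data ends while a run is still open
        spans.append((start, prev))
    return spans
```

The returned list has the same shape as RECESSIONS, so the axvspan loop in plot_v2 would work on it unchanged.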


Step D — Iterate to v3 (style) (15 minutes)

Now the user picks their style. This is the part that matters for the rest of the curriculum: whatever the user decides today is what every future figure starts from.

Walk them through three choices:

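  1. Aspect ratio. The script’s figsize=(8, 6) is 4:3 (matplotlib’s built-in default of 6.4 × 4.8 inches is also 4:3). That’s the right default for slide decks (two figures fit per slide). 16:9 fills a single full slide. 3:2 is good for memos. Pick one; 4:3 is the recommended default unless the user knows they want otherwise.
  2. Font. Pick a serif (Times New Roman, Georgia) or sans-serif (Helvetica, Arial) and a size (11–14pt). Set with plt.rcParams.update({"font.family": "Times New Roman", "font.size": 12}) at the top of the file. If the chosen font isn’t installed, matplotlib prints a warning and falls back to its default font, so check the first styled figure for the font you actually asked for.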
  3. Palette. Pick a small set — 1–4 colors that work together. The default #1f4e79 (deep blue) is fine; if the user wants their own, set a list. Hex codes only; no named colors.
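
Two of these choices are easy to check mechanically. The sketch below turns an aspect ratio into a figsize and enforces the "hex codes only, 1–4 colors" rule; figsize_for and check_palette are hypothetical helper names, and the 8-inch default width is taken from the figsize used in this session.

```python
import re

HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")

def figsize_for(ratio_w, ratio_h, width_in=8.0):
    """Return (width, height) in inches for an aspect ratio at a given width."""
    return (width_in, width_in * ratio_h / ratio_w)

def check_palette(colors):
    """Return the palette as a list, or raise ValueError if it breaks the rules."""
    colors = list(colors)
    if not 1 <= len(colors) <= 4:
        raise ValueError(f"expected 1-4 colors, got {len(colors)}")
    bad = [c for c in colors if not HEX_COLOR.match(c)]
    if bad:
        raise ValueError(f"not 6-digit hex codes: {bad}")
    return colors
```

For example, figsize_for(4, 3) gives (8.0, 6.0) and figsize_for(16, 9, width_in=10) gives (10, 5.625).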

Add plot_v3:

plt.rcParams.update({
    "font.family": "Times New Roman",
    "font.size": 12,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.grid": False,
})

PALETTE = ["#1f4e79", "#c00000", "#7f7f7f"]

def plot_v3(df, out):
    fig, ax = plt.subplots(figsize=(8, 6))  # 4:3
    ax.plot(df["date"], df["rate"], color=PALETTE[0], linewidth=1.5)
    for start, end in RECESSIONS:
        ax.axvspan(pd.Timestamp(start), pd.Timestamp(end), color=PALETTE[2], alpha=0.3)
    ax.set_xlabel("Date")
    ax.set_ylabel("Unemployment rate (%)")
    ax.set_title("U.S. Unemployment Rate")
    fig.tight_layout()
    fig.savefig(out, dpi=150)

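Update __main__ to call plot_v3(df, HERE / "figure-v3.png"), re-run, and open figure-v3.png. The figure should now look like one the user would put their name on. Because the rcParams.update call applies to the whole script, any plot_v1 or plot_v2 call left in __main__ will redraw those files in the new style; keep only the plot_v3 call if the earlier cuts should stay as they were.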

Iterate as needed. If the user wants the title removed, the y-axis to start at 0, the line thicker, the recession bar a different shade — do it. Each change is one line of Python; the user is steering, Claude is typing.
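
One design note on the rcParams.update call above: it changes settings for everything the script draws afterwards. If the style should apply to one figure only, matplotlib’s rc_context scopes it to a with block. The sketch below is an alternative arrangement, not the session’s required one; save_styled is a hypothetical name, and the font family is left out so the example runs on machines without Times New Roman.

```python
import matplotlib
matplotlib.use("Agg")  # render to files without needing a display
import matplotlib.pyplot as plt

STYLE = {
    "font.size": 12,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.grid": False,
}

def save_styled(x, y, out, style=STYLE):
    """Plot y against x with the style applied only inside this function."""
    with plt.rc_context(style):
        fig, ax = plt.subplots(figsize=(8, 6))  # 4:3
        ax.plot(x, y, color="#1f4e79", linewidth=1.5)
        ax.set_ylim(bottom=0)  # example tweak: start the y-axis at zero
        fig.tight_layout()
        fig.savefig(out, dpi=150)
    plt.close(fig)  # free the figure once it is on disk
    return out
```

Outside the with block the global settings are back to what they were, so earlier plot functions keep their original look when re-run.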


Step E — Write preferences to MEMORY.md (5 minutes)

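This is the durable part. Open ~/ai-training/MEMORY.md and fill in the Figure preferences section (Claude writes, user reviews). If the heading already exists from the setup check, add the bullets under it instead of creating a second heading: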

## Figure preferences

- Aspect ratio: 4:3 (figsize=(8, 6)) by default. Two-figure slides fit at this size.
  Override to 16:9 (figsize=(10, 5.625)) for full-slide solo figures, 3:2 for memos.
- Font: Times New Roman 12pt. Black labels.
- Palette: ["#1f4e79", "#c00000", "#7f7f7f"] (deep blue primary, red accent, grey for shading).
- Spines: top + right removed. No gridlines.
- Recession shading: NBER dates, grey at alpha 0.3.
- DPI: 150 for saved PNGs.
- Title: short, sentence case, ≤8 words.

Tell the user: “Edit any of those to your taste. From now on, when we make a figure, I’ll read this section first. You’re not picking colors twice.”
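
"Read this section first" can also be done by a script. A minimal sketch, assuming the section keeps the "- Label: value" bullet shape shown above, with indented continuation lines; read_figure_preferences is a hypothetical name, and turning the returned strings into actual rcParams values is left out.

```python
import re

def read_figure_preferences(markdown_text, heading="## Figure preferences"):
    """Return {label: value} for '- Label: value' bullets under the heading."""
    prefs = {}
    in_section = False
    last = None  # label of the most recent bullet, for continuation lines
    for line in markdown_text.splitlines():
        if line.startswith("## "):
            in_section = line.strip() == heading  # any other heading ends the section
            last = None
            continue
        if not in_section:
            continue
        bullet = re.match(r"^- ([^:]+):\s*(.*)$", line)
        if bullet:
            last = bullet.group(1).strip()
            prefs[last] = bullet.group(2).strip()
        elif line.startswith("  ") and last and line.strip():
            prefs[last] += " " + line.strip()  # indented continuation line
    return prefs
```

Splitting on the first colon keeps values such as "4:3" intact.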


Micro-skills introduced

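Name these out loud:

  1. Verify-then-plot. Hash the CSV, record the row count and date range, and read the first lines before drawing anything.
  2. Script-as-deliverable. make_plot.py is a real file; re-running is one command.
  3. Preferences-to-MEMORY. Style decisions go into MEMORY.md once, and every future figure starts there.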


Wrapping up Session 2.4

Three things to try this week:

  1. Make one more figure. Pick a different FRED series (CPI, S&P 500, your favorite) or any public CSV (BLS, OWID, World Bank). Reuse make_plot.py as a starting point. Notice how much faster v1 lands now that style preferences exist.
  2. Refine the MEMORY.md preferences. Edit them after seeing the second figure. The first version is rarely the final one.
  3. Build a figure for a real (public) topic in your domain. If your running example is regulatory filings, find a chart someone else has published, pull the underlying public data, and rebuild it in your style. Notice what they got right and what you’d change.

Feedback.

The user submits feedback at https://docs.google.com/forms/d/e/1FAIpQLSfBc9jULaJzQWTtAM8dKVCS51JIupf7SVybxCVitWp9aCreyg/viewform.

Claude: paste the URL into chat. The form mirrors the questions below. Collect answers conversationally first, then have the user click through and submit.

  1. On a 1–5 scale, how useful did this session feel?
  2. The verify-before-plot step — natural, or does it feel like ceremony for a CSV you just downloaded?
  3. The script-as-deliverable pattern — does it change how you think about plotting, or did it feel like extra typing?
  4. Did writing preferences to MEMORY.md feel like a real handoff to future-you, or did it feel like documentation theater?
  5. Which of the two iteration steps (recession bars, style) added the most value to the figure?
  6. What confused you most this session?
  7. Anything you want covered in Session 2.5 that you didn’t see here?

Tell the user: “Your instructor uses these to tailor next week’s session.”


Good to know

Most figures don’t need fancy charting libraries. Matplotlib covers most of what people reach for plotly, seaborn, and ggplot to do (seaborn is itself built on matplotlib). Don’t reach for a heavier library until you’ve hit a wall with matplotlib’s defaults plus rcParams.

tight_layout() solves most layout issues. Labels cut off, titles overlapping, weird whitespace — fig.tight_layout() before savefig fixes most of it. Reach for bbox_inches="tight" in savefig if tight_layout isn’t enough.

The data should never be regenerated by hand. Once data/raw.csv is downloaded and hashed, treat it as immutable. If the data changes, write data/raw-2026-05-15.csv and update the manifest. Editing CSVs by hand is the most common source of figure errors that pass review and embarrass you later.
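
"Treat it as immutable" can be enforced with a check before plotting. A minimal sketch: recompute the file’s SHA-256 and compare it with the value recorded in the manifest. The function names are hypothetical, and how the expected hash is read out of manifest.txt is left to the reader because this guide does not fix the manifest’s layout.

```python
import hashlib
from pathlib import Path

def file_sha256(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def assert_unchanged(csv_path, expected_sha256):
    """Raise ValueError if the file no longer matches the recorded hash."""
    actual = file_sha256(csv_path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(
            f"{csv_path} has changed: expected {expected_sha256}, got {actual}"
        )
    return actual
```

Calling assert_unchanged at the top of load() would stop the script before it draws a figure from an edited file.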

Pandas is overkill for tiny data and exactly right for everything else. A 100-row CSV could be parsed with csv.reader. Don’t bother — pd.read_csv is the same line of code and stays right when the data grows.

Excel files work too. pd.read_excel(path, sheet_name=...) reads .xlsx directly, provided the openpyxl package is installed (pandas uses it as its .xlsx engine). For .xlsx files with structure that matters (multi-sheet, merged cells, Exhibit Two on Sheet 3), openpyxl lets you treat the workbook as data rather than copy values manually. Treat the spreadsheet as a source, not a manual-copy target.