# Beyond Instructions: How Beads Lets AI Agents Build Like Engineers

Tom Brewer

These notes are based on the YouTube video by AI Tinkerers.


## Key Takeaways

  • Beads adds memory & task management to coding agents – agents can keep state across sessions, pick up where they left off, and tackle larger, multi‑week projects.
  • “Land the plane” is a community‑adopted cleanup protocol that forces the agent to close out the current session (commit, sync, delete stale branches, update the issue tracker, suggest the next prompt).
  • Tracer‑bullet technique – start with the thinnest possible end‑to‑end slice of functionality to get the agent moving again when it gets stuck.
  • Multimodal prompting (screenshots / video) dramatically improves UI‑centric debugging; Playwright is used to capture screenshots that become the agent’s source‑of‑truth. Learn how to make your codebase ready for this approach in Making Codebases Agent Ready – Eno Reyes, Factory AI.
  • Junior developers can thrive by learning the right prompting questions, treating agents as collaborators, and mastering the pre‑flight checklist (tests pass, code committed, etc.). The five core techniques they should master are outlined in The 5 Techniques Separating Top Agentic Engineers Right Now.
  • Beads stores work in Git‑backed issue objects that have unique hash‑based IDs, parent/child links, priorities, and dependency graphs – essentially a “Google Maps” for your current work.
  • Works with any LLM‑based coding agent that can invoke a CLI (e.g., Codex, Claude Code, AMP, Gemini CLI) and can be installed via npm (if published) or directly from the GitHub repo.
  • Cost efficiency – a single feature fix can cost under $15 of compute in a typical example, far cheaper than a week of manual engineering.
  • Rapid community adoption – the project has attracted a growing community with several thousand users and dozens of contributors, and integrations are already demonstrated for Graphite, Oracle (multi‑model review), and custom MCP mail protocols. Read about the impact on shipping speed in Ship Production Software in Minutes, Not Months — Eno Reyes, Factory.

## Important Concepts

### 1. Beads Framework

  • Purpose: Give autonomous coding agents a persistent memory and a task‑management layer so they can run for many hours and handle complex, inter‑dependent work.
  • Core components:
    • Beads DB – a lightweight local SQLite cache that speeds up queries; the source of truth lives in JSONL files committed to Git.
    • Issue objects – each piece of work becomes a Beads issue with fields such as id (hash‑based, e.g., bd‑a1b2), title, status, priority, parent, children, dependencies, epic.
    • Session protocol – agents read the current issue list at start, write updates during the session, and (optionally) execute the land‑the‑plane script at the end.
  • Memory across sessions: When a new session starts, the agent loads the latest issue state from the JSONL records, giving it context without having to embed the whole history in the prompt.
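
For illustration, here is a minimal TypeScript sketch of that session bootstrap. It assumes issues live in a `.beads/issues.jsonl` file with the fields described above; the path and exact field names are assumptions, not Beads' documented schema.

```ts
// Sketch: rebuild the agent's working context from the Git-backed JSONL log.
// The file path and field names below are assumptions for illustration.
import fs from "node:fs";

interface BeadsIssue {
  id: string;               // hash-based, e.g. "bd-a1b2"
  title: string;
  status: "open" | "in_progress" | "closed";
  priority: string;         // e.g. "P1"
  parent?: string;
  dependencies?: string[];
}

const issues: BeadsIssue[] = fs
  .readFileSync(".beads/issues.jsonl", "utf8")
  .split("\n")
  .filter((line) => line.trim().length > 0)
  .map((line) => JSON.parse(line) as BeadsIssue);

// Only open work needs to enter the prompt, not the whole history.
const openIssues = issues.filter((issue) => issue.status !== "closed");
console.log(`Resuming session with ${openIssues.length} open issue(s).`);
```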

### 2. “Land the Plane” Cleanup Protocol

A scripted sequence that many teams run automatically when they tell the agent “let’s land the plane.” It is not a built‑in Beads command, but a widely‑used convention that performs:

  1. Update Beads issue tracker (status, comments, next‑step suggestion).
  2. Sync Git – push commits, pull latest, resolve any merge conflicts.
  3. Prune workspace – delete stale branches, drop local stashes, remove temporary debugging files.
  4. Generate next‑session prompt – the agent suggests the most logical next issue to work on.

Because agents love checklists, this deterministic routine gives them a reliable “completion” signal and helps prevent context drift.
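
A rough TypeScript sketch of how a team might script the routine is below. The `bd close` and `bd ready` invocations are illustrative assumptions, not documented Beads commands; check `bd --help` for the real interface.

```ts
// land-the-plane.ts: sketch of the community cleanup convention, not a
// built-in Beads command. Assumes git and a `bd` CLI on the PATH.
import { execSync } from "node:child_process";

const run = (cmd: string): string => {
  console.log(`$ ${cmd}`);
  return execSync(cmd, { stdio: "pipe" }).toString().trim();
};

// 1. Update the issue tracker with the session's outcome.
const issueId = process.argv[2]; // e.g. "bd-a1b2"
if (issueId) run(`bd close ${issueId}`); // hypothetical invocation

// 2. Sync git: pull the latest, then push local commits.
run("git pull --rebase");
run("git push");

// 3. Prune the workspace: delete branches already merged into main.
for (const line of run("git branch --merged main").split("\n")) {
  const name = line.replace("*", "").trim();
  if (name && name !== "main") run(`git branch -d ${name}`);
}

// 4. Ask Beads what is ready to work on next session (hypothetical).
console.log(run("bd ready"));
```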

### 3. Tracer‑Bullet Technique

  • Goal: Get an agent unstuck by narrowing the problem to the smallest viable end‑to‑end slice.
  • Steps:
    1. Identify the core functionality needed to prove the concept (e.g., a telnet protocol stub instead of full graphics).
    2. Prompt the agent to implement just that slice.
    3. Once the slice works, iteratively expand outward, re‑using the working artifact as a new context anchor.

This mirrors the “thin client” strategy used in early web development and is especially useful when the agent repeatedly loops without progress.
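
As a concrete (hypothetical) illustration of step 1, the thinnest telnet slice might be a single round trip to the game server: no parsing, no rendering. The host, port, and `LOOK` command are made up for the example.

```ts
// Tracer bullet: prove we can reach the server and read bytes back
// before touching any UI or graphics code.
import net from "node:net";

const socket = net.createConnection({ host: "localhost", port: 2323 }, () => {
  socket.write("LOOK\r\n"); // hypothetical protocol command
});

socket.on("data", (chunk) => {
  console.log("server says:", chunk.toString("utf8"));
  socket.end(); // one round trip is enough to prove the path works
});

socket.on("error", (err) => {
  console.error("slice failed; fix connectivity before expanding:", err.message);
});
```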

### 4. Multimodal Prompting

  • Screenshots: Playwright captures a UI screenshot after each build; the image is fed back to the LLM as part of the prompt, allowing the agent to compare against a reference image.
  • Video (future): Steve suggested a 10‑15 s video capture to validate dynamic UI behavior (e.g., spell animation, map flicker). The agent would ingest the video frames as a multimodal context.
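
A minimal Playwright sketch of that screenshot loop follows; the dev-server URL and output path are assumptions.

```ts
// Capture the built client so the image can be fed back to the LLM and
// compared against the reference screenshot checked into the repo.
import { chromium } from "playwright";

const browser = await chromium.launch();
const page = await browser.newPage({ viewport: { width: 1280, height: 800 } });
await page.goto("http://localhost:3000"); // assumed dev-server URL
await page.screenshot({ path: "screenshots/screenshot_current.png", fullPage: true });
await browser.close();
```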

### 5. Junior Engineer Playbook

| Checklist Item | Why It Matters |
| --- | --- |
| Are tests passing? | Guarantees a stable baseline before the agent starts mutating code. |
| All changes committed? | Prevents loss of work when the agent restarts. |
| Issue ready in Beads? | Gives the agent a concrete, addressable goal. |
| Ask “Are you done?” → expect “no” | Forces the agent to justify completion; reduces premature “finished” signals. |
| Run code‑review prompt | Agents are better at critiquing than generating; a review often surfaces hidden bugs. |
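
The first two checklist items are easy to automate. Here is a sketch, assuming `npm test` is the project's test command:

```ts
// Pre-flight check: verify the baseline is stable before handing the
// agent a Beads issue. The test command is an assumption; use your own.
import { execSync } from "node:child_process";

function check(label: string, cmd: string, ok: (out: string) => boolean): void {
  try {
    const out = execSync(cmd, { stdio: "pipe" }).toString();
    console.log(`${ok(out) ? "PASS" : "FAIL"}: ${label}`);
  } catch {
    console.log(`FAIL: ${label}`); // non-zero exit counts as a failure
  }
}

check("tests passing", "npm test", () => true);
check("all changes committed", "git status --porcelain", (out) => out.trim() === "");
```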

## Detailed Explanations

### Installing & Bootstrapping Beads

```bash
# From within your project directory (e.g., a Codex container)

# If the package is published to npm:
npm install -g beads        # installs the `bd` CLI globally (package name assumed)

# Otherwise, install from the GitHub repo:
git clone https://github.com/beads-dev/beads.git
cd beads
npm install -g .            # installs the `bd` command globally
```

After installation, the typical workflow is:

```bash
# 1. Create a new issue (or let the agent do it)
bd new "Port React client to web" --priority P1

# 2. Start an agent session, pointing it at the issue ID
amp run --issue 42          # example command for the AMP agent
```

The agent will read the issue, generate code, and update the issue with progress notes and a screenshot link.

### Example: Porting a Legacy Game Client

  • Context: A 30‑year‑old Java/Kotlin game with Android, iOS, Steam clients. Goal: build a single React web client.
  • Workflow:
    1. Reference screenshot of the legacy UI is added to the repo.
    2. Agent receives prompt: “Make the React client look like the reference screenshot.”
    3. Playwright runs the built client, captures a screenshot, and feeds it back to the LLM.
    4. Agent iterates: layout → theming → fonts → spacing, each iteration validated against the reference image.
    5. When a flickering bug appears, the tracer‑bullet approach is used: first create a minimal “static view” of the inventory, then re‑introduce dynamic effects.

Result: a functional React client built in about a week of wall‑clock time, with a compute bill of roughly $14.
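
One way to make the per‑iteration validation in step 4 concrete is to score each capture against the reference image. The sketch below assumes the `pngjs` and `pixelmatch` npm packages and that both images share the same dimensions.

```ts
// Score the current screenshot against the reference so the agent gets a
// concrete "how close am I?" signal instead of eyeballing the image.
import fs from "node:fs";
import { PNG } from "pngjs";
import pixelmatch from "pixelmatch";

const ref = PNG.sync.read(fs.readFileSync("reference.png"));
const cur = PNG.sync.read(fs.readFileSync("screenshots/screenshot_current.png"));
const { width, height } = ref; // assumes both captures are the same size
const diff = new PNG({ width, height });

const changed = pixelmatch(ref.data, cur.data, diff.data, width, height, {
  threshold: 0.1, // per-pixel color tolerance
});

fs.writeFileSync("diff.png", PNG.sync.write(diff)); // visual diff for the agent
console.log(`${((changed / (width * height)) * 100).toFixed(2)}% of pixels differ`);
```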

### Multimodal Prompt Example (Pseudo‑prompt)

```text
<image: screenshot_current.png>
<video: gameplay_clip.mp4>
You are reviewing the UI for the spell‑casting screen.
- Verify that spell icons are rendered in full color.
- Ensure no flickering occurs during animation.
- If any discrepancy, output the line number and suggested CSS fix.
```

The LLM can reason over both visual modalities and output concrete code changes.
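
For example, here is a sketch of feeding the Playwright capture to a multimodal model through the OpenAI Node SDK; the model name and prompt wording are placeholders.

```ts
// Send a UI screenshot plus review instructions to a vision-capable model.
// Assumes OPENAI_API_KEY is set in the environment.
import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI();
const b64 = fs.readFileSync("screenshots/screenshot_current.png").toString("base64");

const res = await client.chat.completions.create({
  model: "gpt-4o", // placeholder; any vision-capable model works
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Review the spell-casting screen: are all spell icons rendered in full color? Suggest a CSS fix for any discrepancy.",
        },
        { type: "image_url", image_url: { url: `data:image/png;base64,${b64}` } },
      ],
    },
  ],
});

console.log(res.choices[0].message.content);
```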

### Beads Issue Graph

```text
Epic: "Unified Game Client"
├─ Issue #101 (P1) – Build React scaffolding
├─ Issue #102 (P1) – Implement inventory view
├─ Issue #103 (P2) – Add spell icon rendering
└─ Issue #104 (P3) – Fix map flicker (depends on #103)
```
  • Parent/child links let the agent understand ordering.
  • Priority boost: recently closed issues are auto‑promoted for a short window, preventing “orphaned” bugs.
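
To see how dependency links imply ordering, here is a small TypeScript sketch that walks the example graph above, scheduling each issue only after its dependencies. It is a plain depth‑first pass, not Beads' actual algorithm, and it skips cycle handling.

```ts
// Order issues so dependencies land first, roughly what a "ready work"
// view needs. The graph mirrors the example epic above.
const deps: Record<string, string[]> = {
  "101": [],
  "102": [],
  "103": [],
  "104": ["103"], // map flicker fix depends on spell icon rendering
};

function readyOrder(graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const done = new Set<string>();
  const visit = (id: string): void => {
    if (done.has(id)) return;
    (graph[id] ?? []).forEach(visit); // dependencies first
    done.add(id);
    order.push(id);
  };
  Object.keys(graph).forEach(visit);
  return order;
}

console.log(readyOrder(deps)); // ["101", "102", "103", "104"]
```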


### Graphite & Oracle Integration

  • Graphite: an AI‑assisted code‑review service that the agent can invoke automatically after each commit.
  • Oracle: a multi‑model “second opinion” service (e.g., a newer GPT model) that the agent calls for design or architectural questions, reducing hallucination risk.

Both are triggered via Beads hooks, e.g.:

```bash
bd hook add --event post-commit --action "graphite review"
bd hook add --event pre-merge --action "oracle check"
```

These hooks are optional; teams can wire any external tool that can be called from the command line.


## Summary

Beads is an open‑source framework that transforms LLM‑powered coding agents from single‑shot script runners into persistent collaborators. By giving agents a Git‑backed issue tracker, a lightweight SQLite cache for fast queries, and a deterministic “land‑the‑plane” cleanup routine (adopted by the community), developers can orchestrate long‑running, multi‑agent projects—such as porting a legacy game client to React—in a matter of days and for a fraction of the traditional cost.

Key practices include:

  • Using tracer bullets to break dead‑ends into minimal working slices.
  • Leveraging multimodal prompts (screenshots, future video capture) for UI‑heavy debugging.
  • Treating junior engineers as prompt engineers: mastering pre‑flight checks, iterative code review, and clear “are you done?” questioning.

The growing community, flexible hook system, and low operational cost suggest that Beads (and the broader “agentic coding” paradigm) will become a core productivity layer for software teams within the next year.


Thanks for reading my notes! Feel free to check out my other notes or contact me via the social links in the footer.

## Frequently Asked Questions

### What is the Beads framework and how does it give coding agents persistent memory?

Beads is an open‑source framework that adds a lightweight SQLite‑backed database and Git‑stored JSONL issue objects to any LLM‑based coding agent. By loading the latest issue list at the start of a session, the agent regains context without having to embed the entire history in the prompt, enabling it to continue work across days or weeks.

### How do I run the “land the plane” cleanup protocol after an agent finishes a task?

The protocol is a community‑adopted script, not a built‑in Beads command. After telling the agent to “land the plane,” the script updates the Beads issue tracker, pushes and pulls the latest Git changes, prunes stale branches and temporary files, and finally generates a next‑session prompt that suggests the most logical next issue.

### What is the tracer‑bullet technique and when should I use it?

The tracer‑bullet technique means starting with the thinnest possible end‑to‑end slice of functionality—just enough code to prove the workflow works. Use it whenever the agent gets stuck on a large feature; the minimal slice gets the agent moving again, surfaces hidden dependencies, and lets you iteratively expand the implementation.

### How can I prepare my codebase for multimodal prompting with screenshots or video?

Integrate a tool like Playwright to capture UI screenshots that the agent can treat as source‑of‑truth, and store those assets in a predictable folder structure referenced in the issue description. Ensure the repository includes a script that can replay the UI steps, so the agent can compare the live UI against the captured images and generate precise debugging prompts.

### Are there cost or security concerns when using Beads with various LLM coding agents?

Beads itself is inexpensive – most feature fixes run under $15 of compute because the framework reuses context instead of re‑prompting the whole history. Security‑wise, all state is stored locally in SQLite and committed as JSONL files, so you control what data leaves the machine; just avoid committing secrets to the issue objects and consider encryption tooling such as git‑crypt for sensitive projects.
