…it really feels like they’re attempting to reinvent a project tracker and starting off from scratch in thinking about it.
It feels like they’re a few versions behind what I’m doing, which is… odd.
Self-hosting a plane.io instance. Added a plane MCP tool to my codex. Added workflow instructions into Agents.md which cover standards, documentation, related work, labels, branch names, adding comments before the plan, after the plan, and at various steps of implementation, plus a summary before moving a ticket to done. Creating new tickets and relating them to the current one or to others, etc…
It ain’t that hard. Just do inception (high- to mid-level details), create epics and tasks. Add personas, details, notes, acceptance criteria, and more. You can add comments yourself to update. Whatever.
Slice tickets thin and then go wild. Add tickets as you’re working through things. Make modifications.
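Roughly the shape of that Agents.md workflow section, if a sketch helps (wording and branch format are illustrative, not my exact file):

```markdown
## Ticket workflow (sketch)
- Before coding: read the ticket, its labels, linked standards docs, and related tickets.
- Post a comment with the plan before touching code; post another once the plan is confirmed.
- Branch name: <ticket-key>-<short-slug>.
- Comment at each major implementation step and link the commits.
- Before moving a ticket to Done: post a summary comment (what changed, why, follow-ups).
- If scope grows, create a new ticket and relate it to this one instead of expanding it.
```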
This is actually very interesting, I think, as Anthropic pushes against The Bitter Lesson a bit! The model is a great reasoner, but we still need a concrete way to manage tasks - like we needed for tool calling. Claude Code has an opinionated loop, something like ReAct/CoT with prompting tricks for tasks/skills/etc., but here they add a Hierarchical Controller/Worker thing leveraging the Claude SDK. Mixing agency with actual control using program logic - not just alignment using prompts screaming in all caps and emoji.
We are going to break out of the coding agent’s loop in this way - it’s sorta curving back around to Workflows, after leaving them behind for agency, but right now we need to orchestrate this with deterministic code written mostly by humans - like the git repo anthropic shared. This won’t last long.
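A minimal sketch of what I mean by "deterministic code owns the loop", assuming nothing beyond the plain `anthropic` Python package - the task shapes and the single worker helper are made up for illustration, not Anthropic's actual harness:

```python
# Controller/worker sketch: sequencing, state, and the "done" decision are plain
# Python; the model only ever sees one thinly sliced task at a time.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical plan produced during "inception"; in practice this would come
# from the tracker, not a literal list.
plan = [
    {"id": "T-1", "title": "Add /health endpoint", "notes": ""},
    {"id": "T-2", "title": "Wire /health into the readiness probe", "notes": ""},
]

def run_worker(task: dict) -> str:
    """One worker call per task: fresh context, bounded scope."""
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # swap in whatever model id you actually use
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": (
                f"Task {task['id']}: {task['title']}\n"
                f"Prior notes: {task['notes'] or 'none'}\n"
                "Return a short plan, the change, and a summary comment for the ticket."
            ),
        }],
    )
    return msg.content[0].text

# The controller is ordinary program logic: ordering, retries, and promotion to
# done live here, not in a prompt.
for task in plan:
    task["notes"] = run_worker(task)   # persisted notes survive any context reset
    task["status"] = "done"
```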
Used an LLM to help write up the following, as I’m still pretty scattered about the idea and on mobile.
——
Something I’ve been going over in my head:
I used to work in a pretty strict Pivotal XP shop. PM ran the team like a conductor. We had analysts, QA, leads, seniors. Inceptions for new features were long, sometimes heated sessions with PM + Analyst + QA + Lead + a couple of seniors. Out of that you’d get:
- Thinly sliced epics and tasks
- Clear ownership
- Everyone aligned on data flows and boundaries
- Specs, requirements, and acceptance criteria nailed at both high- and mid-level
At the end, everyone knew what was talking to what, what “done” meant, and where the edges were.
What I’m thinking about now is basically that process, but agentized and wired into the tooling:
- Any ticket is an entry point into a graph, not just a blob of text.
- Epics ↔ tasks ↔ subtasks
- Linked specs / decisions / notes
- Files and PRs that touched the same areas
- Standards live as versioned docs, not just a random Agents.md:
- Markdown (with diagrams) that declares where it applies: tags, ticket types, modules.
- Tickets can pin those docs via labels/tags/links.
- From the agent’s perspective, the UI is just a viewer/editor.
- The real surface is an API: “given this ticket, type, module, and tags, give me all applicable standards, related work, and code history” (roughly sketched after this list).
- The agent then plays something like the analyst + senior engineer role:
- Pulls in the right standards automatically
- Proposes acceptance criteria and subtasks
- Explains why a file looks the way it does by walking past tickets / PRs / decisions
So it’s less “LLM stapled to an issue tracker” and more “that old XP inception + thin-slice discipline, encoded as a graph the agent can actually reason over.”
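A rough sketch of that query surface, with every name invented just to show the shape (nothing here is any particular tracker's API):

```python
# Ticket-as-graph-entry-point sketch: standards declare where they apply,
# tickets carry type/module/tags, and one query hands the agent everything relevant.
from dataclasses import dataclass, field

@dataclass
class StandardDoc:
    slug: str               # e.g. "logging-standards"
    applies_to: set[str]    # tags, ticket types, or modules it declares itself for
    body_md: str            # versioned markdown, diagrams and all

@dataclass
class Ticket:
    key: str
    type: str               # "epic" | "task" | "subtask"
    module: str
    tags: set[str]
    links: list[str] = field(default_factory=list)  # specs, decisions, PRs, sibling tickets

def context_for(ticket: Ticket, standards: list[StandardDoc]) -> dict:
    """Given this ticket, type, module, and tags, return everything that applies."""
    selectors = {ticket.type, ticket.module} | ticket.tags
    return {
        "standards": [s.slug for s in standards if s.applies_to & selectors],
        "related_work": ticket.links,
        # code history would come from the PR index or `git log` over the module's paths
    }
```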
Has any project tried forcing a planning layer as //TODO comments all throughout the code before making any changes? Small loops, like one //TODO at a time? What about limiting changes to one function at a time to stay focused? Or is everyone a slave to however the model was designed, and currently they’re designed for giant one-shot generations only?
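The loop itself would be tiny - a toy sketch of the "one //TODO at a time" idea, with the actual model edit stubbed out:

```python
# Toy "planning layer as //TODO" loop: plant TODO[plan] markers first, then make
# exactly one focused change per iteration.
import pathlib
import re

TODO = re.compile(r"//\s*TODO\[plan\]:(.+)")

def next_todo(root: str) -> tuple[pathlib.Path, str] | None:
    """Find the first outstanding planning TODO anywhere in the tree."""
    for path in sorted(pathlib.Path(root).rglob("*.ts")):
        match = TODO.search(path.read_text())
        if match:
            return path, match.group(1).strip()
    return None

def edit_one_todo(path: pathlib.Path, instruction: str) -> None:
    # Placeholder: the real version would hand just this file (or function) plus
    # the instruction to the model and apply its patch. Here we only strip the
    # marker so the loop terminates.
    path.write_text(TODO.sub("", path.read_text(), count=1))
    print(f"{path}: {instruction}")

# One //TODO per iteration: the model never sees more than one slice of scope.
while (item := next_todo("src")) is not None:
    edit_one_todo(*item)
```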
Is it possible that all local models need to get better is more context, used to make simpler, smaller changes at a time? I haven't seen enough specific comparisons of how local models fail vs. the expensive cloud models.
I did find beads helpful for some of these multi-context-window tasks. It sounds a little like there is some convergence between what they are suggesting and how it gives you lightweight subtasks that survive a /clear.
> It sounds a little like there is some convergence between what they are suggesting and how it gives you lightweight subtasks that survive a /clear.
I do see the convergence there. Beads gives you that "state that survives `/clear`," and Anthropic’s harness tries to do something similar at a higher level.
I've been thinking about this with a pretty simple, old-school analogy:
You're at a shop with solid engineering and ticketing practices. You just hired a great junior developer. They know the stack, maybe even the domain basics, but they don't yet know:
- Your business processes
- The quirks of your microservices
- Local naming conventions, standards, etc.
- Team norms around testing, logging, and observability
You trust them with important tasks, but expect their context will frequently get blown away by interruptions, meetings, task-switching, and long weekends. To handle this, you need to make sure each ticket or note contains enough structured info that when they inevitably lose context, they can pick right back up.
For each ticket, you'd likely include:
- Personas and user goals
- Acceptance criteria, Given/When/Then scenarios
- Links to specs, documentation, related tickets, or prior art
- A short summary of their current understanding
- Rough plan (steps, what's done/not done)
- Decisions made and their rationale ("I chose X because Y")
- Open questions or known gotchas
End of day Friday, that junior would ideally leave notes that answer:
"If I have total amnesia next Tuesday, what's the minimum needed to quickly reload my context?"
To me, agent harnesses like Anthropic's or Beads are just formalizing exactly this pattern:
- `/clear` or `/new` is like a "long weekend brain wipe."
- Persistent subtasks or controllers become structured scaffolding.
- The crucial piece isn't remembering everything, just clearly capturing intent, decisions, rationale, and immediate next steps.
My confusion about Anthropic’s approach is why they're doing this over plain text files or JSON, instead of leveraging decades of existing tracker and project-management tooling, which already encodes exactly this workflow and these best practices.
Why so difficult?