
A common answer: apps won't survive. They're legacy UI. Agents will generate everything dynamically. Software collapses into APIs, data, and models.

That framing isn't wrong, but it's incomplete.

My working hypothesis: agents don't replace apps. Apps are save points for agents.

This may not hold. But it's been productive for reasoning about what's changing, what isn't, and where the real constraints live.

Start with a world where cost isn't the constraint

Before agents, we lacked more than confidence in custom software—we lacked resources. Building, maintaining, and coordinating bespoke systems required scarce humans, time, and money. Standardized apps emerged to economize on those constraints.

Agents change that. They move us toward a world where labor is cheap, creation is fast, and exploration is free.

Assume that world. You can spin up as many capable workers as you want. Code, workflows, interfaces—all cheap to generate. Now we move to the harder question: what constraints still bind?

If apps persist even there, they aren't artifacts of cost. They're solving a different problem.

From "apps vs agents" to "traversal vs commitment"

Reframe the relationship by separating two functions that get conflated:

  1. Traversal: exploring possibilities, synthesizing information, adapting to context.

  2. Commitment: stabilizing meaning, assigning responsibility, making something official.

Agents excel at traversal. Apps excel at commitment.

Hence the save point analogy. In a game, you could replay everything from the beginning every time. You don't. You rely on save points: places where progress has been validated and from which you can safely resume.

Agents move through ambiguity. Apps mark validated states. This is about where uncertainty collapses, not UI.

The framework that made this click: a 2×2 crossing ambiguity with liability. Low ambiguity and low liability tasks are ripe for full automation. High ambiguity and low liability work is where agents roam freely. Low ambiguity and high liability tasks become app territory (structured workflows with audit trails). High ambiguity and high liability situations demand human judgment with heavy documentation.

[Figure: a 2×2 matrix crossing ambiguity with liability, mapping where agents and apps each excel]

The key insight here is that liability tolerance gates automation more than intelligence does.
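To make the quadrants concrete, here's a minimal sketch of the framework as a routing function. The enum names and the 0.5 threshold are my own illustrative assumptions, not part of any real triage system.

```python
from enum import Enum, auto

class Handling(Enum):
    FULL_AUTOMATION = auto()  # low ambiguity, low liability
    FREE_AGENT = auto()       # high ambiguity, low liability: agents roam freely
    APP_WORKFLOW = auto()     # low ambiguity, high liability: structured, audited
    HUMAN_JUDGMENT = auto()   # high ambiguity, high liability: documented judgment

def route(ambiguity: float, liability: float, threshold: float = 0.5) -> Handling:
    """Map a task onto the 2x2. Scores in [0, 1] are assumed to come from
    upstream triage; the threshold is an arbitrary stand-in."""
    if liability >= threshold:  # liability gates first, regardless of capability
        return Handling.HUMAN_JUDGMENT if ambiguity >= threshold else Handling.APP_WORKFLOW
    return Handling.FREE_AGENT if ambiguity >= threshold else Handling.FULL_AUTOMATION
```

Note the branch order: liability is checked before ambiguity. That's the insight in code form.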

Why we still use building blocks we could recreate from scratch

The clearest illustration isn't about apps. It's more fundamental: why do we rely on shared building blocks such as libraries, templates, standards, and reusable components when we could rebuild them every time?

You're working on a document, a dataset, or a piece of code. You could derive every formula from first principles. You could rewrite every function. You could regenerate every component on demand. Agents make this technically possible.

But that's not where the real cost lives.

When you use a well-established building block like a software library, a contract template, or a standard operating procedure, you inherit a validated abstraction: design decisions pressure-tested across thousands of use cases, edge cases discovered through real-world deployment, conventions that compose cleanly with the ecosystem, stability over time, and social trust.

Let's push the thought experiment further. Agents write all the building blocks. Humans are no longer the primary authors. Does that make those building blocks obsolete?

No. What changes is not the need for stable abstractions, but where trust anchors. The save point moves from "a human wrote this" to "this has been validated in a way we trust."

Even in an agent-generated world, semantic stability matters (silent drift is deadly), shared conventions matter (coordination beats novelty), traceability matters (which version produced this result?), and legibility matters (can I explain this to someone else?).

The moment something becomes reusable, shared, or relied upon, it wants to stabilize. It wants to become a save point.

Here's the asymmetry: agents compress the cost of creation faster than they compress the cost of validation. Building blocks exist to manage that gap. So do apps.
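One way to picture where trust anchors when the author is an agent: a building block carries a content hash and a record of validation evidence, and counts as reusable only once that evidence clears some bar. This is a hypothetical sketch; the field names and the promotion rule stand in for whatever a real registry would enforce.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class BuildingBlock:
    """Hypothetical record that anchors trust in validation, not authorship."""
    name: str
    version: str  # semantic stability: consumers pin exactly what they rely on
    content: bytes
    evidence: list[str] = field(default_factory=list)  # test runs, audits, deployments

    @property
    def digest(self) -> str:
        # Content hash gives traceability: which exact artifact produced a result?
        return hashlib.sha256(self.content).hexdigest()

    def is_reusable(self, min_evidence: int = 3) -> bool:
        # Reuse is gated on accumulated validation, however the block was authored.
        return len(self.evidence) >= min_evidence
```

Nothing in the record cares whether a human or an agent wrote the content. The save point is the validation, not the authorship.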

The counter-argument from first principles

Any framework that feels too neat deserves challenge. So let me argue against my own idea.

Abstractions exist to reduce cognitive load, enable coordination, and improve efficiency under constraints. Historically, apps and building blocks did all three, but in a world where humans were the bottleneck. Humans wrote code, verified results, coordinated meaning, and paid the cost of mistakes.

If agents change those constraints fundamentally, frameworks built around them may not hold. The question is not "what role did apps play?" but "are the constraints that justified apps still binding in an agent-dominant world?"

The save-point idea assumes stability has intrinsic value, that freezing meaning, behavior, or workflows is desirable. But if agents are cheap, computation is cheap, verification is cheap, and recomputation is cheap, why prefer stability over continuous regeneration?

From this view, apps may be technical debt, not capital.

The strongest critique may be that save points matter only when being wrong is expensive. If agents make being wrong cheap, why freeze anything?

The liability framing can be turned against itself. Liability is socially constructed and therefore malleable. Aviation moved from pilot judgment to automation. Finance moved from human traders to algorithmic execution. Manufacturing moved from artisan inspection to statistical process control. In each case, responsibility shifted from humans to systems.

If agents become more reliable than humans, more consistent, and more auditable through logs, traces, and proofs, organizations may accept machine legitimacy where we currently reserve it for humans. Apps survive not because liability is immutable but because we haven't trusted machines enough. That may be temporary.

The save-point framing may smuggle human preferences in as universal constraints. This is the most uncomfortable critique. Humans like stability, legibility, checkpoints, named artifacts, and responsibility boundaries. Agents don't. If work becomes non-human-first, software optimized for human verification may lose centrality—even if humans remain politically or legally "in the loop."

The Claude Code moment

The strongest test case comes from watching how agent systems like Claude Code work, especially when executing a long-running, multi-agent task.

What emerges here is interesting. The problem with multi-agent systems like this isn't intelligence. It's coordination. Let agents talk freely and you get communication bottlenecks, duplicate work, and drift. Systems that work at scale converge on a different architecture. They do careful planning upfront, with workers deliberately oblivious to the bigger picture.

The pattern:

  1. Planners explore and decompose problems into structured task breakdowns.

  2. Subplanners handle specific categories.

  3. Workers execute assigned subtasks without coordinating with each other (they don't need to, because the plan has done the coordination work).

  4. A judge evaluates progress at each cycle's end and determines whether work continues.

This enables parallelization and recursive loops. It enables long-horizon work—systems running for hours or days, producing hundreds of thousands of lines of code across thousands of files.
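As a sketch of that architecture, here's the loop with hypothetical plan_fn, work_fn, and judge_fn callables standing in for whatever a real system does at each step:

```python
from concurrent.futures import ThreadPoolExecutor

def run_cycles(goal, plan_fn, work_fn, judge_fn, max_cycles=10):
    """Hypothetical orchestration loop: plan up front, fan workers out
    with no cross-talk, and let a judge gate each cycle."""
    artifacts = []  # persistent state: each cycle's outputs become save points
    for _ in range(max_cycles):
        # 1-2. Planner (and subplanners) decompose the goal into independent subtasks.
        subtasks = plan_fn(goal, artifacts)
        # 3. Workers run in parallel and never talk to each other;
        #    the plan has already done the coordination work.
        with ThreadPoolExecutor() as pool:
            artifacts.extend(pool.map(work_fn, subtasks))
        # 4. The judge evaluates progress and decides whether work continues.
        if judge_fn(goal, artifacts):
            break
    return artifacts
```

The only shared state is the artifacts that previous cycles have stabilized. That is the point.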

Here's what matters for the save-point question: even in systems designed for maximum agent autonomy, persistent state and checkpoints do essential work. The system does not rely on continuous shared context. Too little structure and agents conflict, duplicate work, and drift. Too much structure creates fragility. The solution isn't eliminating structure but rather finding the right save points.

What are those save points? The plan. The task decomposition. The codebase. The judge's verdict. Each cycle leaves concrete outputs that future work can safely build on.

A save point is not an app. A save point is a stabilized representation of progress that enables parallel work without drift.
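In code terms, a save point could be as small as a frozen record of what one cycle stabilized. A hypothetical sketch:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # immutable by construction: a save point must not drift
class SavePoint:
    """Hypothetical stabilized representation of one cycle's progress."""
    cycle: int
    plan: str                   # the decomposition that coordinated the workers
    artifacts: tuple[str, ...]  # e.g. commit hashes or file paths produced
    verdict: str                # the judge's reason this state is safe to build on
```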

Planning externalizes structure so agents don't need to coordinate internally. That is exactly the function apps and building blocks serve for humans. Humans don't coordinate continuously; apps encode workflows and constraints; humans operate locally inside those constraints; shared artifacts prevent drift. The same pattern holds for agents. The plans encode task structure and boundaries; workers operate locally inside those constraints; shared artifacts prevent drift.

Save points reduce coordination cost, limit context explosion, and stabilize meaning across time. Planning acts as a save point—procedural rather than semantic, but structurally identical.

Where this framework might break

If this framing is wrong, we should observe it breaking.

A clear falsification would be widespread adoption of agent-generated, ephemeral building blocks recreated on demand, differing across runs, trusted for reuse without stable versioning, shared conventions, or human-legible guarantees. If organizations rely on components whose semantics are not fixed, whose provenance is opaque, and whose behavior is validated only transiently by other agents—and this scales beyond trivial or low-stakes use—then the save-point idea collapses.

In that world, validation has become cheap, and stability is no longer a binding constraint.

Short of that, continued reliance on versioning, reproducibility norms, and ecosystem-level governance suggests agents are changing who creates abstractions, not why those abstractions need to stabilize.

Where I'm landing

After working through all of this, I reach a narrower conclusion.

The disagreement is about where trust lives, not about apps versus agents. If trust remains socially anchored, apps persist. If trust becomes computationally anchored, apps dissolve. If trust hybridizes, we get something new.

The save-point framework may describe a transition state, not an end state. That's fine. Frameworks need not be permanent to be useful. They need to show what's binding now and what might stop binding later.

Apps aren't dead. Their role is narrowing and clarifying. They're becoming less about interaction and more about validation, less about doing work and more about making work count.

Agents feel like labor. Apps feel like capital. Productivity gains come not from eliminating one or the other, but from improving how efficiently we move between them.

Agents don't eliminate save points. They force save points to become more explicit, more minimal, and more purpose-built for coordination rather than interaction.

Or more cleanly: save points are not a legacy of human software. They are a requirement of any system that wants to scale parallel work without drift, human or agent.

The question going forward isn't "will save points exist?"

It's: what do save points look like when agents—not humans—are the primary workers?