How OOP Helped Me Understand AI Agents

I first encountered the term “agent” more than 20 years ago, when I was working on an agent-based modeling system for simulating infrastructure interdependencies. Imagine an agent representing a power plant that has gone offline, and another representing a telecommunications end office that has switched to battery backup and is tracking how many simulation cycles its batteries will last.

We called them “intelligent agents,” but they were really just POCO C# objects, so all of their “intelligence” was built into deterministic C# code. It was an ambitious research project; we weren’t sure how it would turn out, but it ended up working fairly well.

Object-oriented programming (OOP) wasn’t natural to me. I had studied functional and procedural programming in college. OOP took a while for me to wrap my head around. Once I did, I had the classic “a-ha” moment and never looked back.

OOP has a long history, dating back to at least the 1960s with Simula. As it matured, concepts like messaging were introduced in languages like Smalltalk. C++ added OO to C. Java came along as a cross-platform runtime that hid the surgical ugliness of C++. More languages (Ruby, Python, etc.) kept emerging with OOP capabilities. Regardless of language, OOP is now at least familiar, even if it may not always be preferred in the current market.

The concept of “agents” has re-emerged in a different context. Over the past year or so, there has been a noticeable increase in how casually the term AI agent gets used. Depending on the context, agents are described as coworkers, copilots, junior analysts, or autonomous decision‑makers. In some framings they are positioned as the next workforce; in others, as a clean break from how software has been built up to now.

In practice, I have found that AI agents map surprisingly well to OOP concepts, brought forward into a mostly probabilistic runtime environment. This framing has helped me capitalize on my previous OOP experience as I’ve been orienting my thinking to designing for AI agents.

Armusaofficial, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

Familiar Patterns

There is a temptation to tell a tidy story about object‑oriented programming. Systems got big. Procedural code got messy. OOP arrived to save us.

That story has never been very satisfying. What seems more durable, looking back, is not the origin myth but the set of instincts OOP reinforced over time. Bundling state and behavior. Drawing boundaries around responsibility. Hiding implementation details behind stable interfaces. Composing behavior from smaller parts rather than trying to predict every future subtype.

None of this eliminated complexity. What it did was make complexity easier to manage by packaging it into smaller, more comprehensible pieces, with clearer seams and fewer hidden assumptions.

Agent systems are encountering a similar pressure today. Large language models are undeniably powerful, but power on its own tends to amplify existing weaknesses in the surrounding system.

Consider a retrieval-augmented generation system where data ownership and authority were never clearly defined. A language model, given broad access, begins to synthesize across sources that were intentionally developed independently.

Because language models place heavy emphasis on similarity, overlap in language or structure can be treated as signal. Separate sources that were never meant to corroborate one another start to reinforce each other implicitly in the output. The result sounds coherent, even helpful, but it reflects an inferred agreement no single source was intended to assert.

Once synthesis is allowed to operate this way, other weaknesses surface quickly. For example, an agent may quietly blend policy guidance with outdated procedural documentation and begin routing decisions or recommendations down paths that don’t look obviously wrong. The failure is not in the model’s judgment, but in the absence of structure. The system never required questions of authority, scope, or corroboration to be answered before synthesis was allowed to happen.
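
To make that concrete, one option is to carry authority and scope metadata alongside every retrieved passage and refuse to let sources corroborate one another unless the metadata says they can. A minimal sketch, with the Passage fields and the "policy" authority label chosen purely for illustration, not drawn from any particular RAG framework:

```python
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    source: str       # which system of record this came from
    authority: str    # e.g. "policy", "procedure", "draft"
    domain: str       # the scope this source is allowed to speak to

def corroboration_allowed(passages: list[Passage]) -> bool:
    """Only let passages reinforce each other if they share a domain
    and at least one of them carries authoritative status."""
    domains = {p.domain for p in passages}
    if len(domains) > 1:
        return False  # cross-domain overlap is similarity, not agreement
    return any(p.authority == "policy" for p in passages)

def build_context(passages: list[Passage]) -> str:
    """Assemble prompt context, keeping each source clearly labeled
    so the model cannot silently blend them."""
    if not corroboration_allowed(passages):
        # fall back to the single most authoritative passage
        passages = sorted(passages, key=lambda p: p.authority != "policy")[:1]
    return "\n\n".join(f"[{p.source} | {p.authority}] {p.text}" for p in passages)
```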

What an Agent Is (and Isn’t)

In object‑oriented programming, there is a useful distinction that often fades into the background once you have been writing software for a while. Classes define behavior. Objects carry state. But neither of them does anything on its own. Execution only happens once they are placed inside a runtime.

That runtime environment (virtual machine, scheduler, call stack) is easy to forget precisely because it is always there. It is the substrate that makes method dispatch, state mutation, and control flow possible, without ever deciding what should happen.

AI agents follow a similar pattern. The system prompt, tool schemas, behavioral constraints, and policies describe shape and capability. They outline what could happen. On their own, they are descriptive, not operational.

Nothing actually happens until that definition is instantiated inside a running system. Context is attached, state begins to accumulate, and an execution loop takes over. At that point, the underlying model plays a role closer to a runtime than to an object: it enables reasoning and dispatch, but it does not own identity, authority, or responsibility.

This is where discussions of agents often blur lines. Prompts are treated as if they are the agent. Models are treated as if they are decision‑makers. In practice, both are closer to infrastructure. They make action possible, but they do not define what actions are legitimate.

Agency, such as it is, emerges only when definition, state, model, and loop are combined inside a system that has been explicit about which questions are allowed to be answered, and which ones are not.
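
A rough sketch of that separation, using made-up names rather than any particular framework’s API: the definition describes what could happen, and nothing happens until a runtime pairs it with a model, state, and a loop.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentDefinition:
    """Descriptive, not operational: prompt, tools, and constraints."""
    system_prompt: str
    tool_schemas: dict[str, dict]
    constraints: list[str]

@dataclass
class AgentInstance:
    """Operational: a definition placed inside a running system, with state."""
    definition: AgentDefinition
    model: Callable[[str], str]          # the model plays runtime, not owner
    state: list[str] = field(default_factory=list)

    def step(self, observation: str) -> str:
        self.state.append(observation)
        prompt = self.definition.system_prompt + "\n" + "\n".join(self.state)
        action = self.model(prompt)      # reasoning and dispatch, nothing more
        self.state.append(action)
        return action
```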

State, Identity, and Lifecycle

Once instantiated, the object-oriented parallels become more concrete. An object has identity. It exists as a distinct thing in a system, with a defined role and a scope of responsibility. An agent is similar. It has an identity that is not just conceptual, but operational: permissions, context, and an implicit contract about what it is meant to do.

KellyC.5366, CC BY 4.0 https://creativecommons.org/licenses/by/4.0, via Wikimedia Commons

Objects also carry state. That state is what makes them useful over time. Agents do as well, even if the details are still being worked out in practice. Conversation history, working context, references into external stores, cached results from prior actions. All of this functions as state, regardless of where it happens to live.

Objects have lifecycles. They are created, used, and eventually discarded. Agents follow the same pattern. They are instantiated, invoked, paused, resumed, and terminated. They persist across steps, not just across function calls, and they exist within systems that are responsible for managing that persistence.
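
As a sketch, assuming nothing about any specific agent runtime, those identity, state, and lifecycle concerns might look like this:

```python
from enum import Enum, auto

class LifecycleState(Enum):
    CREATED = auto()
    RUNNING = auto()
    PAUSED = auto()
    TERMINATED = auto()

class ManagedAgent:
    """Illustrative only: identity, accumulated state, and an explicit lifecycle."""
    def __init__(self, agent_id: str, permissions: set[str]):
        self.agent_id = agent_id          # operational identity
        self.permissions = permissions    # part of the implicit contract
        self.history: list[str] = []      # conversation and working context
        self.lifecycle = LifecycleState.CREATED

    def invoke(self, message: str) -> None:
        assert self.lifecycle in (LifecycleState.CREATED, LifecycleState.RUNNING)
        self.lifecycle = LifecycleState.RUNNING
        self.history.append(message)      # state persists across steps

    def pause(self) -> None:
        self.lifecycle = LifecycleState.PAUSED

    def resume(self) -> None:
        self.lifecycle = LifecycleState.RUNNING

    def terminate(self) -> None:
        self.lifecycle = LifecycleState.TERMINATED
```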

The most obvious difference is determinism. Objects behave predictably. Agents do not. But this is less alien than it first appears. Distributed systems, external services, and human input have always introduced uncertainty. We learned long ago to design software that tolerates variability rather than pretending it does not exist.

Seen in that light, probabilistic behavior is not necessarily a break from prior practice, but it does change the tempo at which uncertainty enters the system. When variability is introduced at computational speed rather than at human pace, it becomes easier for small ambiguities to compound, and harder to observe, trap, and correct them after the fact. Observability may take precedence over error-handling in this kind of environment.

Tools As Interfaces

Tool calling is often presented as a step change in capability. In practice, it looks much closer to something software engineers have been doing for a long time.

A method encapsulates behavior behind an interface. It exposes what can be done without exposing how it is done. Tools serve the same purpose. An agent does not need to know how a database query is executed, how a file is written, or how an API call is routed. It only needs to know that a capability exists, how to invoke it, and what kind of result to expect.

What has shifted is not the abstraction, but the locus of control. Instead of explicit branching logic deciding when a method is called, that decision is deferred to inference and planning. The system still supplies the interface. The agent supplies the choice.
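	
A minimal sketch of that division of labor, with hypothetical tool names: the system registers the interface, and the model’s output is reduced to a structured choice that the system dispatches.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Tool:
    """The interface: what can be done, not how it is done."""
    name: str
    description: str
    parameters: dict[str, str]          # parameter name -> expected type
    run: Callable[..., Any]             # implementation stays hidden behind this

TOOLS = {
    "query_orders": Tool(
        name="query_orders",
        description="Look up orders for a customer id.",
        parameters={"customer_id": "str"},
        run=lambda customer_id: [{"id": "A-100", "status": "shipped"}],  # stub
    ),
}

def dispatch(choice: dict) -> Any:
    """The system supplies the interface; the model supplies `choice`,
    e.g. {"tool": "query_orders", "args": {"customer_id": "42"}}."""
    tool = TOOLS[choice["tool"]]
    return tool.run(**choice["args"])
```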

This is where familiar design concerns reassert themselves. Interfaces that are under-specified, overly permissive, or leaky behave no better in agent systems than they do in traditional software. If anything, they are exercised more aggressively.

Poorly designed tool interfaces do not usually fail loudly. They fail by encouraging ambiguous outcomes. When a tool returns imprecise or weakly constrained results, a similarity‑seeking model may treat that ambiguity as signal. A slightly off response can steer the next reasoning step toward a different downstream branch or tool selection, not because it is correct, but because it appears compatible.

In a more deterministic system, that kind of mismatch might trigger an error or halt execution. In an agent workflow, it can quietly redirect the process. The system continues to run, producing coherent output, even as it moves further away from the path the designer intended. Observability becomes a crucial way to notice that the workflow has gone off course.
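
One modest response is to validate and log every tool result before the next reasoning step sees it. A sketch, assuming a hypothetical expected shape for the result rather than any real tool’s contract:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent.tools")

EXPECTED_KEYS = {"id", "status"}   # what a well-constrained result must contain

def check_tool_result(tool_name: str, result: list[dict]) -> list[dict]:
    """Flag weakly constrained results instead of passing them on silently."""
    for row in result:
        missing = EXPECTED_KEYS - set(row)
        if missing:
            # In a deterministic system this would be an error; here we at least
            # make the ambiguity visible before the next reasoning step sees it.
            log.warning("tool=%s returned row missing %s: %r", tool_name, missing, row)
    log.info("tool=%s returned %d rows", tool_name, len(result))
    return result
```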

Boundaries Still Matter

One reliable way to create unstable agent systems is to give an agent too much surface area. This usually happens incrementally. Another tool is added for convenience. A bit more memory is exposed to improve recall. Authority expands slightly to cover an edge case.

Over time, the agent stops being a clearly bounded component and starts to look like a general-purpose problem solver. At that point, familiar failure modes reappear.

Object-oriented developers have seen this before. The god object rarely fails immediately. It appears capable. It absorbs responsibility. And when it does fail, it fails opaquely, because too much logic and state have been allowed to accumulate behind a single interface.

Well-designed agents tend to be constrained specialists. They retrieve data, or reason over context, or synthesize output, or validate results. They may collaborate, but they do not attempt to do everything themselves.

Encapsulation is not a holdover from older software paradigms. In systems built on probabilistic components, it is one of the primary tools we have for keeping behavior legible over time.

Composing Behavior

Modern object-oriented design learned, often through experience rather than theory, that deep inheritance hierarchies are fragile. They tend to encode assumptions that are difficult to unwind once a system starts to evolve.

A starling murmuration at Prestwick by Walter Baxter, CC BY-SA 2.0 https://creativecommons.org/licenses/by-sa/2.0, via Wikimedia Commons

Composition offered a different path. Instead of building tall hierarchies of types, systems could be assembled from smaller components with well-defined responsibilities.

State and behavior stayed closer together, but relationships between components became more explicit. Objects collaborated through interfaces rather than inheriting behavior implicitly. Over time, this made it easier to change one part of a system without having to reason about every part that depended on it.

A simple HR system is a good illustration. The inheritance-heavy version ends up with a type ladder like Employee → FullTimeEmployee → Manager → SeniorManager, plus parallel branches for Contractor, Intern, and Temp, each overriding slightly different rules for time off, benefits, approvals, and payroll.

A composition-first version keeps a smaller core—an EmployeeRecord—and attaches capabilities as components: PayrollPolicy, BenefitsPlan, TimeOffPolicy, and ApprovalChain. You change behavior by swapping policies, not by inventing a new subtype. Refactors become possible. Boundaries become visible. The system can evolve without requiring everything above it to be rewritten.
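
A compressed sketch of the composition-first version, with each policy pared down to a field or two for illustration:

```python
from dataclasses import dataclass

@dataclass
class PayrollPolicy:
    pay_cycle: str            # e.g. "biweekly", "monthly"

@dataclass
class BenefitsPlan:
    tier: str                 # e.g. "standard", "none"

@dataclass
class TimeOffPolicy:
    days_per_year: int

@dataclass
class ApprovalChain:
    approvers: list[str]

@dataclass
class EmployeeRecord:
    """A small core; behavior comes from the attached policies."""
    name: str
    payroll: PayrollPolicy
    benefits: BenefitsPlan
    time_off: TimeOffPolicy
    approvals: ApprovalChain

# Changing behavior means swapping a policy, not minting a new subtype.
contractor = EmployeeRecord(
    name="Ada",
    payroll=PayrollPolicy(pay_cycle="monthly"),
    benefits=BenefitsPlan(tier="none"),
    time_off=TimeOffPolicy(days_per_year=0),
    approvals=ApprovalChain(approvers=["vendor_manager"]),
)
```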

Agent architectures are arriving at a similar place. Multi-agent systems tend to work best when agents are small, focused, and composed into explicit workflows. One agent gathers context. Another reasons over it. Another formats output. Another evaluates the result. 

Coordination happens through messages and shared state rather than through implicit hierarchy. The result is not necessarily simpler, but it is easier to reason about. When something goes wrong, there are clearer seams to inspect.
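
A toy sketch of that shape, where each “agent” is reduced to a focused step over shared state and the workflow list is the only coordination mechanism; the names are illustrative, not taken from any framework:

```python
from typing import Callable

def gather_context(state: dict) -> dict:
    state["context"] = f"retrieved notes for: {state['request']}"
    return state

def reason(state: dict) -> dict:
    state["draft"] = f"analysis of ({state['context']})"
    return state

def format_output(state: dict) -> dict:
    state["output"] = state["draft"].upper()
    return state

def evaluate(state: dict) -> dict:
    state["approved"] = len(state["output"]) > 0
    return state

WORKFLOW: list[Callable[[dict], dict]] = [gather_context, reason, format_output, evaluate]

def run(request: str) -> dict:
    state = {"request": request}
    for step in WORKFLOW:          # coordination is explicit, not inherited
        state = step(state)
    return state
```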

The Execution Loop

In most object-oriented systems, nothing happens unless something calls into an object. Execution is driven from the outside, whether by a user action, a scheduler, or another part of the system.

Agent systems follow a similar pattern, but the loop is more explicit. An agent observes the current state of the world, decides what to do next, takes an action, and incorporates the result before repeating the cycle.

A simple example is a support triage agent. It polls an inbox or queue, classifies a new ticket, looks up related history, decides whether to respond automatically or escalate, takes that action, and then waits for the next event. The model contributes judgment at each step, but the loop itself—poll, decide, act, repeat—is what gives the system continuity.

Or consider a monitoring agent watching infrastructure metrics. On each pass through the loop it evaluates recent signals, decides whether they cross a threshold worth acting on, triggers an alert or remediation step, and records the outcome. Nothing about that behavior is intrinsic to the model. It emerges because the system keeps re‑entering the same cycle with updated state.
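
Stripped of the model call, the loop itself is unremarkable. A sketch of the monitoring example, with the observe and decide steps stubbed out:

```python
import time

def observe() -> dict:
    """Poll the current state of the world (stubbed here)."""
    return {"cpu_percent": 91, "threshold": 85}

def decide(signal: dict) -> str:
    """In a real system, this is where the model contributes judgment."""
    return "alert" if signal["cpu_percent"] > signal["threshold"] else "wait"

def act(decision: str) -> str:
    return f"action taken: {decision}"

def loop(max_cycles: int = 3, interval_s: float = 0.0) -> list[str]:
    """Poll, decide, act, record, repeat: the loop supplies continuity."""
    outcomes = []
    for _ in range(max_cycles):
        signal = observe()
        decision = decide(signal)
        outcomes.append(act(decision))
        time.sleep(interval_s)
    return outcomes
```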

That loop may be implemented in different ways. It may be tightly or loosely coupled, event-driven or time-based. But without it, there is no agency to speak of, only definitions waiting to be exercised.

This is an important distinction, because it is easy to attribute agency to the model itself. In practice, the model supplies reasoning at each step, but the surrounding system supplies continuity. It decides when the agent runs, what context it sees, what actions are available, and when the loop ends.

When people talk about autonomous agents, they are usually describing this execution loop and the degree of freedom it has been given, not the intelligence of the model in isolation.

Limits of the Analogy

No analogy holds indefinitely, and this one is no exception. But the place where it stops mapping cleanly is also the place where the most interesting design questions begin.

The core divergence is determinism. Objects run in environments that are conditioned to behave predictably. Agents do not. What we have introduced, deliberately or not, is a probabilistic runtime into a world that spent decades assuming determinism as a baseline. The model does not just sit alongside the system. It changes the guarantees the system can make about repeatability, confidence, and control. This change shows up in two distinct ways, and it is worth separating them.

Diego Delso, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

The first is variance. Given the same input, a probabilistic system may produce slightly different but roughly equivalent outputs on each run. This is manageable with familiar techniques: retries, consensus, majority voting, or simply accepting that minor variation in phrasing or structure is not a defect. Most teams working with language models learn to tolerate variance quickly, because it is visible and bounded.
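
A minimal sketch of one such technique, majority voting over repeated runs, with the model call stubbed out as a random choice:

```python
from collections import Counter
import random

def classify(text: str) -> str:
    """Stand-in for a probabilistic model call: same input, varying output."""
    return random.choice(["negative", "negative", "neutral"])

def vote(text: str, runs: int = 5) -> str:
    """Run the same input several times and keep the majority label."""
    votes = Counter(classify(text) for _ in range(runs))
    return votes.most_common(1)[0][0]
```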

The second is drift, and it is harder to see. Drift is not random variation between runs. It is a systematic shift in behavior that accumulates across cycles, contexts, or time. Consider an agent that summarizes customer feedback on a weekly cadence. Early on, it categorizes borderline comments as “neutral.” Over several weeks, as context windows shift and prior summaries feed back into the process, its threshold moves. “Neutral” starts to absorb what a human reviewer would call “mildly negative.” No single report looks wrong. The trend only becomes visible in retrospect, when someone compares the arc of the summaries against the arc of the underlying data.

Variance can be tested at a point in time. Drift cannot. It requires observation over time, across runs, with enough structure to distinguish a stable system from one that is slowly walking away from its intended behavior.
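
One way to give that observation some structure is to compare label distributions across runs and flag systematic shifts. A sketch, reusing the feedback-summary example above with an arbitrary tolerance:

```python
from collections import Counter

def label_distribution(labels: list[str]) -> dict[str, float]:
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: counts[label] / total for label in counts}

def drift(baseline: dict[str, float], current: dict[str, float],
          tolerance: float = 0.10) -> list[str]:
    """Flag labels whose share has moved more than `tolerance` from baseline.
    A point-in-time test cannot catch this; it only shows up across runs."""
    labels = set(baseline) | set(current)
    return [
        label for label in labels
        if abs(current.get(label, 0.0) - baseline.get(label, 0.0)) > tolerance
    ]

# Week 1 vs. week 8 of the hypothetical feedback summaries
week_1 = label_distribution(["neutral"] * 40 + ["negative"] * 20 + ["positive"] * 40)
week_8 = label_distribution(["neutral"] * 55 + ["negative"] * 5 + ["positive"] * 40)
print(drift(week_1, week_8))   # flags "neutral" and "negative": the threshold has moved
```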

This is where the analogy does more than break. It reveals a gap. In deterministic systems, correctness is a property you can assert. A method either returns the expected value or it does not. Tests pass or fail. In probabilistic systems, correctness is closer to a distribution. A single run may be acceptable. The question is whether the system’s behavior, taken as a whole, stays within bounds that the designer intended.

That changes what “working” means. It is no longer sufficient to verify that an agent produces a good result. You have to verify that it continues to produce good results, and that the definition of “good” has not quietly shifted underneath you. Evaluation becomes a continuous practice, not a gate you pass through once before deployment.

In a deterministic system, unexpected behavior tends to surface as an error. In a probabilistic one, it surfaces as something merely plausible. The system keeps running. Output stays coherent. The failure mode is not a crash but a slow divergence from intent, which is a more difficult thing to catch precisely because it never announces itself.

The past few years of AI development can be read as a collective re-orientation around this reality. Retrieval-augmented generation, tool use, evaluation frameworks, and agent orchestration are all attempts to reintroduce structure and recover confidence in the presence of probabilistic execution.

Object-oriented thinking still provides a strong foundation here. Identity, boundaries, state management, encapsulation, and composition remain essential. But they are not sufficient on their own. What probabilistic systems require on top of that foundation is evaluation as a first-class architectural concern, observability that tracks behavior over time rather than at a single point, and calibration practices that can detect when an agent’s outputs have drifted from the intent encoded in its design.

The instincts that OOP built—draw boundaries, hide complexity, compose from small parts—still apply. The new instinct is that you also have to watch what happens after you ship, with more patience and more structure than deterministic systems ever required.

Why This Matters

The way we think about agents shapes the systems we end up building. If agents are framed primarily as coworkers or substitutes for human judgment, discussion tends to gravitate toward trust, autonomy, and replacement. Those questions are not unimportant, but they are downstream of more basic design decisions.

Framing agents as components, or objects with identity, state, and constrained responsibilities, keeps focus on the system design. It encourages explicit interfaces, clear ownership of decisions, and architectures that can be inspected when behavior diverges from intent.

Seen in the context of probabilistic runtimes, this framing also helps explain why so much recent work has focused on structure. Tool use, retrieval, evaluation, and orchestration are not bolt-ons. They are attempts to recover legibility and confidence in systems whose execution is no longer strictly deterministic.

None of this diminishes what agents can do. It simply grounds that capability in architectural choices we still have to make, and still have to live with.

AI agents are not the end of software architecture. They look more like a continuation of it, consisting of objects with opinions, operating inside runtimes that behave differently than the ones we grew up with. Learning to design for that difference is the opportunity ahead.

Header image – NPS Photo, Public domain, via Wikimedia Commons