Redux for Enterprise Context
Everyone is talking about context layers for data agents. a16z published a piece last week arguing that agents can’t answer “what was revenue growth last quarter?” without one, and Andy Chen at Abnormal Security built one with 20 parallel agents and a GitHub repo. The enterprise AI discourse has collectively discovered that agents need context to function. They’re right; I just think the framing is too narrow.
The context problem isn’t specific to data agents. Any agent that takes actions inside an organisation needs to understand how the company actually works: which systems are deprecated, which policies are current, which decisions got reversed, which team owns what. A coding agent that doesn’t know the team decided to move off a system won’t see any issue with trying to integrate it. A support agent that doesn’t know certain questions should be escalated will confidently answer them wrong. A procurement agent that doesn’t know a vendor relationship is under legal review may confidently renew the contract. The failure mode for a data agent without context is a bad bar chart; the failure mode for an acting agent without context is a bad decision, and bad decisions are much harder to undo than bad bar charts.
The standard solution for retrieving context is RAG: generate embeddings for your documents, retrieve the relevant ones, and let the LLM figure it out. The trouble is that enterprise document stores are graveyards of outdated information, where wiki pages from 2019 sit next to Slack threads from yesterday, and the Q3 architecture decision contradicts the Q1 architecture decision, but both are retrieved by a semantic search query. A policy that was rewritten six months ago may still have the old version indexed right alongside the new one, and RAG has no mechanism to know which version is current. It retrieves what’s semantically similar and hopes the generation model can sort out the temporal mess, which it can’t, because the generation model often isn’t given temporal context either.
Fundamentally, the bottleneck in enterprise AI isn’t retrieval or generation; it’s corpus integrity.
Prior art
The a16z piece nails the diagnosis and lays out the market landscape well. What it leaves open is the architectural question: how should a context layer actually work under the hood? They describe the problem space and identify three categories of vendors who might solve it, which is useful framing, though it doesn’t get into the engineering.
Andy Chen went further. He pointed 20 parallel agents at every internal data source at Abnormal Security (Slack, Jira, Gong, the codebase itself) and let them build an “Enterprise Context Layer” in a GitHub repo. Two days later he had 6,000 commits and 1,020 files across 11 domains. The agents produced things no human team could have assembled (at least not at any reasonable price): battle cards where every competitive claim was backed by a specific Gong recording cross-referenced against actual product capabilities and tied to the Salesforce case showing how the deal ended, and a complete customer journey map with failure modes at every handoff between teams. Genuinely impressive work.
There’s a catch, though. Andy’s agents sweep and reconcile: they pull from git, pick up tasks, update files, push, repeat. The system is eventually consistent, emphasis on “eventually.” When Agent 7 updates the auth documentation while Agent 12 is rewriting the same section based on different sources, the result depends on who pushes last. Andy himself admits it probably doesn’t work at Microsoft’s scale, and I think the reason is architectural rather than just a matter of throwing more agents at it.
Does the right pattern already exist?
There’s an architectural pattern from a completely different domain that I think maps onto this problem remarkably well. It comes from two places, actually: event sourcing in backend systems, and Redux in frontend development. The two share the same core idea, and the combined mental model translates to enterprise context almost embarrassingly directly.
Event sourcing is a pattern where you store every change to a system as an immutable event in an ordered log, and derive the current state by replaying those events through a reducer function. It’s been used in financial systems, distributed databases, and anywhere you need an authoritative history of how state evolved over time. Redux brought the same idea to frontend development around 2015: a single store, an ordered stream of actions, and a pure reducer that takes the current state plus an action and produces the next state. If you’ve built a React application, you know the pain it solved: seventeen components each holding their own version of the truth, two of them disagreeing about whether the user is logged in, and a bug report that says “sometimes the cart is empty.” Redux created one source of truth, with conflicts resolved at write time, and every consumer reading from the same clean projection.
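The shared contract can be sketched in a few lines. A minimal Python analogue of the Redux signature, with a toy state; the `Event` type and the fields in the state dict are illustrative, not part of any real implementation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """An immutable entry in the append-only log."""
    timestamp: float  # last-modified time of the source document
    payload: str      # the document text itself

def reduce(state: dict, event: Event) -> dict:
    # A pure reducer: (state, event) -> next state. In Redux this is
    # (state, action) -> state; event sourcing has the same shape.
    return {
        **state,  # never mutate past state; build a new one
        "docs": state.get("docs", 0) + 1,
        "last_seen": event.timestamp,
    }

def replay(log: list[Event]) -> dict:
    # Current state is always derivable by replaying the log in order.
    state: dict = {}
    for event in sorted(log, key=lambda e: e.timestamp):
        state = reduce(state, event)
    return state
```

The state here is deliberately trivial; in the enterprise-context version the state is the projected knowledge base and the reducer is an LLM call rather than a pure function.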
Let me walk you through how the pattern maps to enterprise context.
Documents are events. Every document that enters the system (a wiki edit, a Slack thread, a policy update, an architecture decision record) is appended to an ordered log, timestamped by when it was last modified. Last-modified is the right timestamp because it reflects when the organisation last considered this information current, regardless of when the document was originally created. The log is append-only: you never delete or modify past events, you only add new ones that supersede them, exactly as in traditional event sourcing.
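A minimal sketch of that log, with a hypothetical `DocEvent` type; the only operations are append and ordered read:

```python
import datetime
from dataclasses import dataclass

@dataclass(frozen=True)
class DocEvent:
    doc_id: str                       # e.g. a wiki page path or Slack thread id
    last_modified: datetime.datetime  # when the org last touched the content
    body: str

class EventLog:
    """Append-only: past events are never edited or deleted, only superseded
    by newer events that the reducer folds in on top of them."""

    def __init__(self) -> None:
        self._events: list[DocEvent] = []

    def append(self, event: DocEvent) -> None:
        self._events.append(event)

    def in_order(self) -> list[DocEvent]:
        # Ordered by last-modified, not created-at: last-modified reflects
        # when the organisation last considered the content current.
        return sorted(self._events, key=lambda e: e.last_modified)
```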
The LLM is the reducer. When a new event arrives, hybrid search finds the relevant parts of the most recent projection (a cloud migration decision might touch projections on infrastructure, cost, compliance, and deployment simultaneously). For each affected part of the projection, the system loads it alongside the new event, and the LLM compares claims, resolves conflicts, updates what’s changed, preserves what hasn’t, and produces an updated projection. No schema required, because the LLM reasons over unstructured text natively, which is the entire point of using one here rather than a traditional reducer function.
The projected knowledge base is the state: for example, a set of markdown files in a git repo representing the organisation’s best current understanding of every domain, temporally resolved and internally consistent. It can be as structured as a folder hierarchy or as simple as a bag of documents; the only requirement is that hybrid search can find the relevant parts of the projection when a new event needs to be reduced. This is what keeps each reduction tractable: the reducer never needs the full knowledge base in context, just the slices that the incoming event might affect.
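The reduction step might look like the sketch below, where `search` (hybrid retrieval over the projection) and `llm` (a completion call) are assumed, injected dependencies rather than any particular library:

```python
REDUCER_PROMPT = """You are merging a new document into an existing knowledge projection.
Compare claims, resolve conflicts in favour of the newer source, preserve
unchanged claims, and cite the source events for every claim you keep or add.

Existing projection slice:
{slice}

New event ({timestamp}):
{event}

Return the updated slice."""

def reduce_event(event, projection, search, llm, top_k=4):
    # Hybrid search finds the parts of the projection the event might touch;
    # the reducer never needs the whole knowledge base in context.
    affected = search(projection, query=event.body, top_k=top_k)
    for slice_id, slice_text in affected:
        projection[slice_id] = llm(REDUCER_PROMPT.format(
            slice=slice_text,
            timestamp=event.last_modified,
            event=event.body,
        ))
    return projection
```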
The critical property is the same one that made Redux work: conflicts are resolved at write time, not read time. Standard RAG surfaces three documents that disagree and hopes the generation model sorts it out at query time. Event-sourced projection resolves the contradiction when the newer document enters the system, so by the time any agent queries the corpus, the answer is already clean.
The knowledge graph is just citations
Every claim in the projected corpus carries inline citations back to the source events that produced it, and the system depends on these citations to function.
Those citations are the knowledge graph. A claim in the auth service documentation might cite the original architecture RFC, a later migration document that updated the approach, and a Slack thread where the tech lead clarified an edge case. There’s no separate graph database, no ontology language, no taxonomy to maintain, just hyperlinks. The graph emerges organically from the citations the reducer writes during each merge, which means it grows richer with every event the system processes.
The citations also give you things that are surprisingly hard to get any other way. Confidence becomes observable: a claim backed by five independent sources is stronger than one sourced from a single Slack message, and you can compute this directly from the citation metadata. Staleness becomes detectable: if every citation on a claim is six months old and no new events have touched that projection, something is probably worth reviewing.
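Both signals fall out of the citation metadata with almost no machinery. A sketch, assuming each citation carries a `source_id` and a `last_modified` timestamp:

```python
import datetime

def confidence(citations: list[dict]) -> int:
    # Crude proxy: count the independent sources backing the claim.
    return len({c["source_id"] for c in citations})

def is_stale(citations: list[dict], now: datetime.datetime,
             max_age: datetime.timedelta = datetime.timedelta(days=180)) -> bool:
    # A claim is stale if even its freshest citation is older than the threshold.
    if not citations:
        return True
    newest = max(c["last_modified"] for c in citations)
    return now - newest > max_age
```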
Observability on organisational knowledge
Here’s where this diverges from both the a16z vision and Andy’s implementation, and where it starts to get interesting beyond just improving RAG quality. The reducer doesn’t have to resolve every conflict. When two sources genuinely disagree and the right answer isn’t clear, the reducer can write the tension directly into the projection: “Source A claims X (cited), Source B claims Y (cited), unresolved.”
This means the system doesn’t just produce clean knowledge; it also surfaces where your organisation’s knowledge is broken. Every unresolved tension in the projection is effectively a signal that some internal process has a gap, whether that’s two teams operating on different assumptions, a policy that was updated without telling everyone, or a decision that was reversed but never documented. In the same way that application observability tools like Datadog show you where your systems are unhealthy, the projected knowledge base shows you where your organisation’s understanding of itself is inconsistent, contested, or stale.
Raw RAG hides these conflicts entirely and lets the generation model pick a winner based on retrieval ranking, which is essentially random with respect to correctness. Andy’s agents try to resolve everything, documenting conflicts only as a fallback. Event-sourced projection treats unresolved tension as a first-class state, because sometimes the organisation genuinely doesn’t know the answer yet, and the system should reflect that honestly rather than fabricate certainty. The tensions become a dashboard of organisational misalignment, which is valuable information in its own right, regardless of whether you’re building agents on top of it.
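One possible shape for a flagged tension, as the reducer might write it into a markdown projection; the format is illustrative, not prescribed:

```python
def render_tension(claim_a: str, cite_a: str, claim_b: str, cite_b: str) -> str:
    # Written into the projection as first-class content, not resolved silently.
    return (
        "> **Unresolved tension**\n"
        f"> Source A claims: {claim_a} [{cite_a}]\n"
        f"> Source B claims: {claim_b} [{cite_b}]\n"
        "> Status: unresolved; route to the owning team.\n"
    )
```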
Downstream agents handle flagged tensions gracefully: an agent that reads “this is unresolved, route to the security team” is vastly more useful than one that confidently acts on whichever document happened to rank highest. When a human does resolve the tension, they submit a correction event that flows through the same reducer, no special mechanism required. Corrections are just events.
Self-correction
The obvious objection to this architecture is error propagation: what if the reducer makes a bad call early on, and that bad call compounds through subsequent reductions?
The input stream has a naturally converging property: organisations keep producing documents that refine, clarify, and update their own knowledge. If the reducer misinterprets something in March, a clearer document in April that addresses the same topic triggers a new reduction that corrects it. The projection should converge towards correctness over time, not because the system is clever, but because the data it ingests is itself converging.
The actual failure mode worth worrying about is error persistence in quiet corners: a wrong reduction in a rarely-updated domain could sit there for months because nothing new arrives to correct it. This is where citation staleness detection earns its keep. If a projection hasn’t seen a new event in a while, flag it for human review; the human can then either confirm the projection is still accurate or submit a correction event, which flows through the exact same mechanism as everything else.
Getting started
You can bootstrap this by replaying the full event log (all existing documents in chronological order) through the reducer, which is compute-intensive but a one-time cost, and directly analogous to rebuilding a materialised view from an event store. Each reduction is bounded to one event plus the relevant parts of the existing projection, which fits comfortably within context window limits regardless of total corpus size.
After bootstrapping, updates are incremental: a new document comes in, search finds the affected projections, the reducer updates them, done. High-velocity low-signal sources like Slack chatter can be batched into periodic synthetic events, while high-signal sources like policy documents and architecture decisions process individually. The pattern works either way.
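A simplified sketch of the bootstrap replay, with daily batching for a hypothetical low-signal source; for brevity the synthetic batch events are replayed after the individual ones, and `reduce_event` is injected:

```python
import datetime
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Ev:
    source: str
    last_modified: datetime.datetime
    body: str

def bootstrap(events, reduce_event, batch_sources=frozenset({"slack"})):
    """Replay the full log in last-modified order. High-velocity, low-signal
    sources are rolled up into one synthetic event per source per day;
    everything else is reduced individually."""
    projection: dict = {}
    batches = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.last_modified):
        if ev.source in batch_sources:
            batches[(ev.source, ev.last_modified.date())].append(ev)
        else:
            projection = reduce_event(ev, projection)
    # Fold each day's chatter in as a single synthetic event.
    for (source, day), evs in sorted(batches.items(), key=lambda kv: kv[0][1]):
        synthetic = Ev(source,
                       datetime.datetime.combine(day, datetime.time.max),
                       "\n---\n".join(e.body for e in evs))
        projection = reduce_event(synthetic, projection)
    return projection
```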
The output is just files. Markdown in folders, with citations, in a git repo. Index it however you want. The point is that whatever retrieval method your agents use, they’re hitting a corpus that’s already been through temporal reconciliation, where every claim is cited and every conflict is either resolved or explicitly flagged. The hard problem is solved before the query ever happens.
That’s event sourcing and Redux applied to enterprise context: an append-only log of documents, a reducer that materialises clean state, and a single projected knowledge base that every agent in the organisation reads from. I’m building this, and if it’s a problem space you’re thinking about too, I’d love to compare notes.

