When Event Sourcing is Ignored: The Case of Chile's Open Finance Directory


There’s a common piece of wisdom in software architecture circles that goes something like this: “Event sourcing is too complicated for 95% of applications.” Derek Comartin from CodeOpinion (one of my faves!) puts it well in one of his videos: it depends on the domain and the concrete use cases, and it has nothing to do with scale. Greg Young himself (a living legend), the person who coined the term, has said similar things. The general advice is: most of the time, CRUD is fine. Don’t reach for event sourcing unless you really need it.

I agree. Mostly.

But here’s the thing nobody talks about: the pattern doesn’t just run the risk of being overused; it also gets under-used. There are domains where event sourcing is such a natural fit that not using it actually makes the system harder to build. Domains where the data is inherently a sequence of facts. Where consumers need to stay in sync with a source of truth. Where auditability isn’t a nice-to-have but a regulatory requirement.

I recently came across one of those domains while reading through the technical specification for Chile’s new open finance system (yeah, because I’m a nerd on pat-leave who’s closely following the interesting developments in the fintech space in my home country), and it’s a textbook example of a missed opportunity.

A Trust Store That Doesn’t Know It’s an Event Log

Chile is implementing its Sistema de Finanzas Abiertas (SFA), the Chilean equivalent of PSD2 and Open Banking UK. The regulation (NCG 514) mandates that banks and fintechs exchange financial data through standardized APIs, secured with FAPI 2.0 and mutual TLS. It’s a serious piece of financial infrastructure.

At the center of this system sits the Participant Directory: a registry maintained by the CMF (Chile’s financial regulator) that stores every participant’s identity, role, digital certificates, public keys, API endpoints, and operational status. Every institution in the ecosystem needs to know who else is in it and whether they can be trusted.

Now, think about what this Directory is from a domain perspective. It’s a thing that states facts: “Participant X joined on this date.” “Participant Y rotated their certificate.” “Participant Z was suspended by the regulator.” “Participant W activated its alternative mechanism due to an incident.” These aren’t CRUD operations on a row. They’re events. Each one is a meaningful thing that happened, with a timestamp, a cause, and consequences for every other participant in the system.

The spec even defines them as such. There’s a taxonomy of event types with names like cl.sfa.participant.change.cert, cl.sfa.participant.left, cl.sfa.participant.cs.inactive. They use a CloudEvents-like envelope format with specversion, type, source, and time fields. They thought in events. They named them. They structured them.

And then they built CRUD with notifications on top.

The “Download The World” Sync Model

The Anexo Técnico N°3 of NCG 514, published for public consultation just this January (2026), defines how participants keep their local copy of the Directory in sync. It works like this:

The push: When something changes, the Directory sends a webhook to every participant’s /notifyupdate endpoint. The payload looks like this:

{
  "specversion": "1.0",
  "type": "cl.sfa.participant.change.cert",
  "source": "directorio",
  "subject": "Certificate change",
  "id": "xkjskk3984jcka",
  "time": "2024-08-06T17:31:00Z",
  "datacontenttype": "application/json",
  "data": {
    "participantId": "ID"
  }
}

The pull: Every participant must also poll a /last-update endpoint at least every 8 hours. If the timestamp doesn’t match their local copy, they download the full participant list via GET /participants.

Notice what happens after the webhook arrives. The event tells you the type of change and who was affected. But it doesn’t include what changed. To get the actual data, the participant must download the entire Directory (every participant, every certificate, every endpoint) and diff it against their local copy to figure out what’s different.

The spec’s own sequence diagrams (Figures 1 through 3, pages 26-28) make this explicit. Every single change triggers the same flow: receive notification → download everything → figure out the diff → update local copy.

It’s like getting a text that says “something in your house changed” and having to walk through every room to figure out that someone moved a mug.
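To make the cost concrete, here is a minimal Python sketch of the flow the spec mandates. Only the idea of the endpoints (/notifyupdate, /participants) comes from the Anexo; the function and helper names are hypothetical, and a real implementation would also verify signatures and persist state:

```python
# Sketch of the "download the world" flow: the webhook says *who* changed
# and the *type* of change, but not *what* changed, so the consumer must
# re-fetch the full Directory and diff it against its local copy.

def handle_notify_update(event: dict, local_copy: dict, fetch_all) -> dict:
    """React to a /notifyupdate webhook by downloading everything.

    `fetch_all` stands in for GET /participants (the complete Directory).
    """
    fresh = fetch_all()  # every participant, every cert, every endpoint
    changed = {
        pid: fresh[pid]
        for pid in fresh
        if local_copy.get(pid) != fresh[pid]  # diff all records to find one change
    }
    removed = [pid for pid in local_copy if pid not in fresh]
    # ...apply `changed` and `removed` to local storage...
    return {"changed": changed, "removed": removed}
```

Note that the `event` argument is barely used: everything the webhook tells you, you must rediscover by diffing anyway.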

Martin Fowler Called This One

If you’ve read Fowler’s piece on event-driven architecture (or watched the video, if only to enjoy his posh English accent), you’ll spot this immediately. He distinguishes four patterns: Event Notification, Event-Carried State Transfer, Event Sourcing, and CQRS. The Directory is textbook Event Notification, the most basic, least powerful form of an event-driven system: “Something happened, go figure out the details yourself.”

Event Notification is the pattern you end up with when you’re kind of thinking in events but your underlying mental model is still a mutable table of current state. You have a database of participants. When a row changes, you ping everyone. They re-read the table. Rinse, repeat. It’s a database trigger with an HTTP facelift.

The problem isn’t that it doesn’t work. It works fine (at first). The problem is what it costs everyone downstream.

The Hidden Tax on Every Participant

Let’s count the things that every single institution in Chile’s open finance ecosystem now has to build:

A webhook receiver. Each participant must expose a publicly reachable HTTPS endpoint that accepts POST requests, validates signatures, handles retries, ensures idempotency, and stays highly available. If you’ve ever built a webhook consumer in production, you know this is easily a week of engineering for a robust implementation. And if your receiver is down when the webhook fires? Your only fallback is the 8-hour polling cycle. Eight hours of potentially stale trust data in a system where a revoked certificate means “this entity should not be trusted.”

A full-download client. After every notification, fetch the complete Directory. Parse it. Store it.

A diffing algorithm. Compare the fresh download against your local copy to identify what actually changed. This sounds trivial until you actually build it. Trust me, I speak from experience. One of the worst projects I ever worked on was a system that had to sync medical records across different institutions: it was a polling and diffing nightmare. The I/O overhead we generated rendered our databases useless before we managed to onboard the fourth institution. Long story short: you end up comparing two snapshots of a complex data structure, handling edge cases around ordering, nested objects, and fields that can be null vs. absent, and wasting all that I/O and effort just to detect a few changes. That’s the kind of thing that takes a day to write and a month to debug. Now multiply that effort by every bank, every fintech, every insurance company in the ecosystem.

Reconciliation logic. What happens when your diff produces unexpected results? When you detect a change that doesn’t match the webhook type you received? When you missed a webhook and the polling cycle reveals multiple changes at once?

Each participant builds all of this independently. Each one makes slightly different assumptions. Each one has slightly different bugs. It’s the kind of distributed inconsistency that doesn’t bite you during testing but surfaces at 2 AM on a Friday when Bank X’s local copy says Fintech Y is active and Bank Z’s says it’s suspended.
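To illustrate just one corner of the diffing problem, here is a hedged sketch of a record-comparison helper. The record shapes are hypothetical, but the edge cases (null vs. absent fields, list ordering) are exactly the kind every participant has to make a call on:

```python
# One judgment call among many: is an explicit null the same as a missing
# key? Are endpoint lists ordered? Each consumer decides independently,
# and each decision is a place for their local copies to drift apart.

def records_equal(a, b) -> bool:
    """Deep-compare two Directory records, treating explicit null and
    absent key as equivalent, and lists as order-insensitive."""
    if isinstance(a, dict) and isinstance(b, dict):
        return all(records_equal(a.get(k), b.get(k)) for k in set(a) | set(b))
    if isinstance(a, list) and isinstance(b, list):
        return sorted(map(repr, a)) == sorted(map(repr, b))
    return a == b
```

Whether these are the *right* choices depends on details the spec doesn’t pin down, which is precisely the problem.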

The Design That Was Begging to Exist

Here’s what frustrates me about this. The spec almost got there. They defined the event types. They defined the CloudEvents structure. They mandated local copies. They just stopped one step short.

What if, instead of “something changed, go download everything,” the Directory exposed an append-only log of events?

GET /events?since=1042&limit=50
{
  "events": [
    {
      "sequence": 1043,
      "specversion": "1.0",
      "type": "cl.sfa.participant.change.cert",
      "source": "directorio",
      "time": "2026-03-15T14:22:00Z",
      "data": {
        "participantId": "abc123",
        "cert_ca": "DigiCert Global Root G2",
        "cert_val": "2027-03-15T23:59:59.999Z",
        "x5t": "new_thumbprint"
      }
    },
    {
      "sequence": 1044,
      "specversion": "1.0",
      "type": "cl.sfa.participant.cs.inactive",
      "source": "directorio",
      "time": "2026-03-15T15:01:00Z",
      "data": {
        "participantId": "def456",
        "previousState": "ACTIVO",
        "newState": "INACTIVO"
      }
    }
  ],
  "lastSequence": 1044,
  "hasMore": false
}

Each participant stores one number: the last sequence it processed. To sync: GET /events?since=1042. Apply the deltas. Done.

No full downloads. No diffing algorithm. No reconciliation. The event is the change, and it carries everything the consumer needs to update its local state. This is what Fowler calls Event-Carried State Transfer: the event carries the data, so the consumer never needs to call back to the source. It’s so simple that you wonder where the catch is.
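A consumer for this hypothetical /events endpoint fits in a few lines. This is a sketch under my own assumptions (the endpoint and event shapes are the ones proposed above, not anything in the spec; `fetch_events` stands in for the HTTP call, and a real consumer would persist `last_seq` durably):

```python
# Sketch of an Event-Carried State Transfer consumer: read events in
# order, apply each delta to local state, remember the last sequence.

def sync(state: dict, last_seq: int, fetch_events):
    """Catch up from `last_seq` by paging through GET /events?since=N."""
    while True:
        page = fetch_events(since=last_seq)  # {"events": [...], "hasMore": bool}
        for ev in page["events"]:
            apply_event(state, ev)
            last_seq = ev["sequence"]        # persist alongside state in real life
        if not page["hasMore"]:
            return state, last_seq

def apply_event(state: dict, ev: dict):
    """The event carries the new data, so applying it is a local update."""
    pid = ev["data"]["participantId"]
    if ev["type"] == "cl.sfa.participant.left":
        state.pop(pid, None)
    else:
        state.setdefault(pid, {}).update(
            {k: v for k, v in ev["data"].items() if k != "participantId"})
```

That’s the whole client: no full download, no diff, no reconciliation path.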

What Changes Concretely

The webhook becomes a hint, not a lifeline. It can still exist as a latency optimization. “Hey, there are new events, go check.” But if it gets lost (and webhooks will get lost), the next poll picks up from the last sequence number. You could poll every 30 seconds instead of every 8 hours, because GET /events?since=1042 when there’s nothing new is essentially free: the server checks one integer.

Downtime recovery is just… syncing. When a participant comes back after an outage, it resumes from its last sequence number. It gets every event that was persisted, in order, including the one about participant X being suspended during the outage. No stale copies. No manual checks on a web portal. The regular sync mechanism is the recovery mechanism.

Audit trails come for free. When did that certificate rotate? When was that participant suspended? Just query the events. In a regulated financial system where the CMF can suspend participants for compliance violations, having an immutable, queryable record of every trust-relevant change isn’t a feature: it’s a necessity that will eventually be demanded.

New participants bootstrap the same way they sync. A new institution joining the ecosystem replays the event stream (or loads a snapshot plus recent events) and arrives at current state. Same code path as daily synchronization. Not a separate “initial load” mechanism.

Every participant writes less code. No webhook endpoint to maintain. No diff logic. A sequential event consumer is about the simplest distributed systems primitive there is. Read events in order, apply them to local state, save your position. It’s so simple and elegant that it ends up being boring.

The Trade-offs (Because There Are Always Trade-offs)

I’m not going to pretend this is all free. Event-sourced systems have real costs:

Storage. An append-only log grows forever. You need a snapshotting strategy: periodically persist the full state and allow old events to be archived. This is well understood (Kafka log compaction, EventStoreDB scavenging), but it’s more operational work than a simple table.

Schema evolution. Once you publish an event schema, changing it is harder than changing a REST response. You need versioning. Consumers need to handle old and new formats during transitions. The specversion field is already in the CloudEvents envelope, but using it well requires discipline.

Idempotent consumers. If a participant crashes between processing an event and saving its position, it will replay events on recovery. Every consumer must handle duplicates gracefully. This is standard practice, but it’s a requirement that doesn’t exist in the “just download everything” model.

Ordering. Sequence numbers must be strictly monotonic and gap-free. The event log is a serialization point. For a single-writer system administered by the CMF, this is trivial. But it’s worth calling out.
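For what it’s worth, the idempotency requirement is cheap to satisfy when sequence numbers are strictly increasing (as a single-writer Directory could guarantee). A minimal sketch, with hypothetical names:

```python
# Idempotent apply: drop any event at or below the last position we
# durably recorded, so replays after a crash are harmless no-ops.

def apply_once(state: dict, last_seq: int, ev: dict, apply) -> int:
    """Apply `ev` unless it was already processed; return the new position."""
    if ev["sequence"] <= last_seq:
        return last_seq              # duplicate from a replay: ignore
    apply(state, ev)
    return ev["sequence"]            # caller persists state + position together
```

The remaining subtlety is persisting state and position atomically, which any local database transaction handles.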

None of these are novel problems. They’re the same trade-offs every event-sourced system navigates. And they’re considerably simpler than building and maintaining N independent diffing algorithms across an entire financial ecosystem.

The Point

This isn’t theoretical. At Lendable, where I work, we’ve built our core banking products on event sourcing: credit cards, loans, the works. Exposing event feeds for external consumers to sync against is something we run in production daily. The pattern works. It scales. And it’s dramatically simpler for consumers than the alternative.

The conventional wisdom says event sourcing is too complex for most applications. And that’s true (for most applications). But the flip side of that advice is that sometimes, by reflexively reaching for CRUD and bolting notifications on top, we end up building something that’s more complex than the event-sourced alternative would have been.

A participant directory for a national open finance system is a trust store. Its entire job is to record facts (who joined, who left, who rotated their certificates, who got suspended) and make sure every participant in the ecosystem has an up-to-date, consistent view of those facts. That’s not a table that gets updated. That’s a log of events that gets appended to. The domain is the event stream.

Sometimes the reason event sourcing “feels too complicated” is that we’re trying to retrofit it onto a domain that doesn’t need it. But sometimes, the reason CRUD “feels simple” is that we’ve pushed all the complexity downstream: into webhook receivers, diffing algorithms, and reconciliation logic that every consumer has to build independently.

Apparent simplicity in the spec, but actual complexity in the implementation: that’s not a good trade.

I’m kinda sad now that I missed the consultation phase for this. But the lesson stands: look at your domain, understand it fully, and don’t be afraid to reach for event sourcing.