Domain Orchestrator - The smart consumer

This post is part of the Domain Orchestrator series:

The existing architecture

The architecture we had been working with for the last few years was fundamentally quite simple. As a Risk Segment, we had a bunch of microservices, each with a specific purpose (checking credit scores for a company, checking for compliance etc.). When an order was placed, our primary consumer coordinated several risk checks directly until it was happy to either accept or reject the order attempt.

Already, you could argue there are some issues. Our consumer had accumulated domain-specific integration logic: it had to integrate with n services, and each addition to n added complexity.
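The smart-consumer arrangement can be sketched roughly as below. This is a hypothetical illustration, not our real contracts: the service names, the `RiskCheck` shape, and the all-must-approve rule are stand-ins for whatever the real integrations looked like.

```typescript
// Hypothetical sketch: the "smart consumer" coordinates every risk check itself.
type CheckResult = { service: string; approved: boolean };

type RiskCheck = (orderId: string) => CheckResult;

// Stand-ins for the real HTTP clients to each risk service.
const fraudCheck: RiskCheck = (orderId) => ({ service: "fraud", approved: true });
const limitCheck: RiskCheck = (orderId) => ({ service: "limit", approved: true });
const creditCheck: RiskCheck = (orderId) => ({ service: "credit", approved: true });
const complianceCheck: RiskCheck = (orderId) => ({ service: "compliance", approved: false });

// The consumer owns the orchestration: it knows every service, every
// contract, and the rule for combining their answers into a decision.
function assessOrder(orderId: string): boolean {
  const checks = [fraudCheck, limitCheck, creditCheck, complianceCheck];
  return checks.every((check) => check(orderId).approved);
}
```

The point of the sketch is the shape, not the logic: every new check or contract change lands inside `assessOrder`, which lives in the consumer's codebase, not ours.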

[Diagram: the Consumer Service (Consumer Domain) integrates directly with the Fraud, Limit, Credit, and Compliance Services (Risk Domain).]

Changing with the business

As the business grew, we onboarded new partners, attracted new customers, and over time, our risk assessment requirements expanded. We had to evolve with the market; however, we found ourselves with an issue: change was tough.

“Simple changes”

Let’s suppose a new partner sent us a new data point with each attempted order - let’s say customer_shoe_size. This is an extremely important piece of information for us, and we need it to help evaluate risk in 4 of our risk services. What are the implications of this seemingly simple change?

  • multiple contract updates for our services
  • multiple integration updates for our primary consumer
  • multiple places along the critical path where a change can cause failure
  • a dependency on another team with their own priorities
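The fan-out is easy to see in code. A hypothetical sketch, with made-up request types, of what one new partner field means for the consumer:

```typescript
// Illustrative only: one new data point (customer_shoe_size) must be
// threaded through every affected contract. These types are hypothetical.
type FraudRequest = { orderId: string; customerShoeSize: number };
type CreditRequest = { orderId: string; customerShoeSize: number };
type LimitRequest = { orderId: string; customerShoeSize: number };
type ComplianceRequest = { orderId: string; customerShoeSize: number };

// The consumer maps the partner field into each request it builds:
// four contract changes and four integration changes for one data point.
function buildRequests(orderId: string, customerShoeSize: number) {
  const base = { orderId, customerShoeSize };
  return {
    fraud: base as FraudRequest,
    credit: base as CreditRequest,
    limit: base as LimitRequest,
    compliance: base as ComplianceRequest,
  };
}
```

Four nearly identical edits, spread across two teams' codebases, for a single field.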

If these kinds of changes are required once or twice, you can swallow it. However, the more often it happens, the more frustrated everyone gets.

More complicated changes

On the other side of the coin, let’s say for a given customer, we wanted bespoke functionality. Perhaps we want to enable an experiment, or we want to support richer or dynamic interactions between checks, or we want to adjust the sequencing of checks. Suddenly, this becomes a major initiative. But the burden of change doesn’t fall on the Risk team; it falls entirely on the consumer. For them, this is a large, complex change that brings them no direct value and doesn’t contribute to their team goals.

All in all, regardless of the size of the change, it was becoming too complex to manage in the current setup: the smart consumer setup.

The proposed model

The idea we had been discussing for a number of years was to introduce an “Orchestrator”, which would become the single, unified entry point for our primary consumer to interact with. Behind the orchestrator, we (in the Risk Domain) would have full authority over exactly what happens, and we could be largely in control of changes going forward, so long as we received all of the information we needed up front.

[Diagram: the Consumer Service (Consumer Domain) integrates only with the Risk Orchestrator, which fronts the Fraud, Limit, Credit, and Compliance Services (Risk Domain).]
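To make the boundary concrete, here is a minimal sketch of the single entry point, under the assumption that a decision is a simple accept/reject. The `RiskOrchestrator` name and `Check` shape are illustrative, not our real interfaces.

```typescript
// Hypothetical sketch: the consumer calls one method; which checks run,
// and in what order, is owned entirely by the Risk Domain.
type Decision = "accept" | "reject";

type Check = { name: string; run: (orderId: string) => boolean };

class RiskOrchestrator {
  // The check list lives behind the boundary: adding, removing, or
  // re-ordering checks never touches the consumer's integration.
  constructor(private checks: Check[]) {}

  assess(orderId: string): Decision {
    return this.checks.every((check) => check.run(orderId)) ? "accept" : "reject";
  }
}

// The consumer's entire integration surface is now this one call.
const orchestrator = new RiskOrchestrator([
  { name: "fraud", run: () => true },
  { name: "compliance", run: () => true },
]);
```

The consumer's contract is `assess` alone; everything inside the constructor argument is ours to change.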

On paper, this looks great - but so do a lot of things in isolation. When we consider the full picture, the trade-offs begin to reveal themselves, and it becomes a question of “is it worth the investment?”.

❌ Cons

Let’s start with the cons, downsides, or dangers of this approach:

  • More services, more problems: throwing more services at a problem inherently comes with complexities and should always be well considered: infrastructure, boundary definition, data persistence etc.
  • Introducing a Single Point of Failure: the old structure was (in a weird way) more resilient. In the previous model, individual services operated independently from one another. Conversely, in the new model, the orchestrator becomes a central dependency that must meet strict reliability expectations.
  • Increased latency: there’s no escaping it; there’s now an extra network hop in every single check. Going through this “proxy” of the orchestrator will always be slower.
  • God service fear: this was, and still is, a huge concern. We are introducing a single service that risks fronting the entire “risk segment” - it could grow in all sorts of ways.
    • it can evolve into a monolithic god service
    • it can become a dumping ground for any risk capability that we’re not quite sure where to put
    • it can grow too knowledgeable about all of its downstream services, and then become the new barrier to change
  • Migration cost: we’re migrating established behaviour on a sensitive, high-value path. Along with development costs, there are a host of fail-safe measures and entire documents dedicated to the release strategy alone, all of which comes at a very high cost.

✅ Pros

Honestly, whilst writing that list of cons, I began questioning if we made the right decision by taking this route! Let’s take a dive into the pros of this approach:

  • Increased Team Autonomy: this was a primary driver. By owning the orchestration, our team can now add, remove, or re-order risk checks independently. A change that used to require a multi-week, cross-team project can now be developed, tested, and deployed by our team as fast as we can work.
  • Simplified Consumer: the smart consumer isn’t so smart anymore - it doesn’t need to know anything about internal risk logic, and it can replace the n integrations to our services with a single integration to the orchestrator. Their implementation is simpler, and our API contract is a single, clean, stable boundary between the two systems.
  • Clear Ownership: we established a clear “front door” to the Risk Domain. Internally, our team has full ownership of the logic. Externally, other teams know exactly who to talk to. This single point of ownership is a massive boost for clarity and accountability.
  • Centralised Observability: before, our monitoring was distributed across services, requiring stitching together multiple data sources to trace the journey of a single order from start to finish. The orchestrator gives us a single source of truth. We can now build powerful dashboards that show the entire assessment funnel, the result of every check, and the performance of each dependency, all in one place.
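The observability point can be sketched too. Assuming the orchestrator runs checks sequentially and records each outcome (the `assessWithTrace` name and record shape below are hypothetical), one call yields the whole funnel:

```typescript
// Illustrative sketch of the "single source of truth": because every check
// runs through the orchestrator, one record captures the full assessment.
type CheckOutcome = { check: string; approved: boolean; durationMs: number };

type Assessment = {
  orderId: string;
  outcomes: CheckOutcome[];
  decision: "accept" | "reject";
};

function assessWithTrace(
  orderId: string,
  checks: { name: string; run: (orderId: string) => boolean }[],
): Assessment {
  const outcomes: CheckOutcome[] = [];
  for (const { name, run } of checks) {
    const start = Date.now();
    const approved = run(orderId);
    // Every check's result and timing lands in one place, in order.
    outcomes.push({ check: name, approved, durationMs: Date.now() - start });
  }
  const decision = outcomes.every((o) => o.approved) ? "accept" : "reject";
  return { orderId, outcomes, decision };
}

const trace = assessWithTrace("order-1", [
  { name: "fraud", run: () => true },
  { name: "credit", run: () => false },
]);
```

A dashboard built on records like `trace` can show the funnel, per-check results, and dependency latency without stitching together logs from n services.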

Conclusion

The “cons” were significant and real, but the “pros” represented a necessary shift. They weren’t just about code; they were about giving our team the autonomy to build, experiment, and move as fast as we can. Despite the risks, in our minds, the investment was worth it.

In the next post, we’ll dive into the blueprint for the orchestrator itself: how we approached it using Domain-Driven Design and Ports & Adapters, and the key technical decisions we made to mitigate some of the risks noted above.