Enterprise Architecture Reviews: What We Find in the First Two Weeks
After conducting architecture reviews across dozens of enterprises, certain patterns emerge almost universally. Understanding these common failure modes is the first step to building systems that scale.
In two weeks of architecture review work, we can typically identify 80% of the systemic risks in an enterprise's technical estate. Not because enterprise architecture is simple — it is not — but because the failure modes are remarkably consistent across industries, technology choices, and organisational structures.
The most common finding is what we call 'distributed monolith syndrome.' Organisations invest heavily in microservices or service-oriented architecture to gain deployment independence and scalability, but retain synchronous, tightly-coupled dependencies between services that eliminate both benefits. You get all the operational complexity of distributed systems with none of the flexibility. The tell-tale sign is a deployment that requires coordinating releases across six services simultaneously.
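The contrast can be sketched in a few lines. The example below is illustrative, not a prescription: a toy in-process event bus stands in for a real broker (Kafka, RabbitMQ, or similar), and the service and topic names are invented. The point is the shape of the dependency: the synchronous version couples deploys and outages together, while the publishing version lets each side ship and fail independently.

```python
from collections import defaultdict
from typing import Callable

# Minimal in-process event bus, standing in for a real message broker.
class EventBus:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(event)

# Tightly coupled: the order service calls billing directly, so both must
# be released and available together — the distributed monolith pattern.
def place_order_sync(order: dict, billing_charge: Callable[[dict], None]) -> None:
    billing_charge(order)

# Decoupled: the order service only publishes a fact about what happened;
# billing subscribes and evolves on its own release cadence.
def place_order_async(order: dict, bus: EventBus) -> None:
    bus.publish("order.placed", order)

bus = EventBus()
charged: list[str] = []
bus.subscribe("order.placed", lambda event: charged.append(event["id"]))
place_order_async({"id": "o-1", "total": 42}, bus)
```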
Database coupling is the second most common issue. Services that are supposed to be independently deployable share a database, meaning a schema change in one service requires coordinating changes across multiple teams. This is not an architecture problem per se — it is a governance problem that was never addressed at the architecture level. The solution is almost never to split the database immediately; it is to establish bounded context ownership and plan an incremental migration.
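One lightweight way to make bounded context ownership enforceable, rather than aspirational, is an ownership manifest checked against each schema migration. The sketch below is a hypothetical example — the table and service names are invented — but it shows the governance rule as code: a migration confined to one owning context can ship independently, while one spanning contexts is exactly the coupling the review flags.

```python
# Hypothetical ownership manifest: each table belongs to exactly one
# bounded context, so every schema change has a single accountable team.
OWNERSHIP = {
    "orders": "order-service",
    "order_lines": "order-service",
    "invoices": "billing-service",
    "customers": "customer-service",
}

def owners_touched(tables_changed: list[str]) -> set[str]:
    """Return the set of owning services a schema migration would affect.

    A migration touching more than one owner is a cross-context change
    and should be split, or coordinated explicitly between teams.
    """
    return {OWNERSHIP[table] for table in tables_changed}

# Safe to ship independently: one context owns everything touched.
assert owners_touched(["orders", "order_lines"]) == {"order-service"}

# Cross-context change: the shared-database coupling the review would flag.
assert owners_touched(["orders", "invoices"]) == {"order-service", "billing-service"}
```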
Security architecture is where we most often find critical gaps. Not in perimeter security — most organisations have invested heavily there — but in east-west traffic within the application tier. Common findings with serious risk profiles include services communicating internally with no authentication, secrets stored as plain environment variables rather than in a secrets manager, and no network segmentation between production tiers.
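To make the east-west authentication point concrete, here is a deliberately minimal sketch of the principle: internal calls should carry a verifiable credential, not arrive anonymously. It uses a shared-secret HMAC from the Python standard library; the key value shown is a placeholder. A real deployment would more likely use mTLS or short-lived tokens issued from a secrets manager or identity provider rather than a hard-coded shared key.

```python
import hashlib
import hmac

# Placeholder value: in production this would come from a secrets manager,
# never from source code or a plain environment variable.
INTERNAL_KEY = b"example-key-from-secrets-manager"

def sign_request(body: bytes) -> str:
    """Attach a verifiable credential to an internal service-to-service call."""
    return hmac.new(INTERNAL_KEY, body, hashlib.sha256).hexdigest()

def verify_request(body: bytes, signature: str) -> bool:
    """Reject internal traffic that does not carry a valid signature.

    compare_digest avoids leaking information through timing differences.
    """
    expected = sign_request(body)
    return hmac.compare_digest(expected, signature)

body = b'{"action": "reconcile"}'
signature = sign_request(body)
assert verify_request(body, signature)          # legitimate internal call
assert not verify_request(b"tampered", signature)  # altered payload rejected
```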
Observability gaps are universal. Most organisations can tell you that something is wrong in production. Far fewer can tell you precisely where, why, and what the user impact is within the first five minutes of an incident. The difference between these two states is instrumentation: distributed tracing, structured logging with correlation IDs, and service-level objectives with automated alerting.
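The correlation-ID part of that instrumentation is small enough to sketch. The example below, with invented logger and field names, shows the idea using only the Python standard library: an ID is set once at the edge of a request and automatically stamped onto every structured log line, so the lines from one request can be joined up across services during an incident.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Correlation ID carried in a context variable so every log line emitted
# while handling a request can be tied back to that request afterwards.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")

class JsonFormatter(logging.Formatter):
    """Emit structured (JSON) log lines that always include the correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })

logger = logging.getLogger("checkout")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def handle_request() -> None:
    # Set once at the service edge (or read from an incoming header);
    # every subsequent log line in this context carries the same ID.
    correlation_id.set(str(uuid.uuid4()))
    logger.info("payment authorised")

handle_request()
```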
Our architecture review process produces a risk-ranked findings register with remediation effort estimates, a current-state architecture diagram (often the first one that accurately reflects production), and a target-state roadmap. The goal is not to redesign your architecture — it is to give your teams a clear picture of where to focus their next six months of investment.