Domain-Driven Design - Tackling Complexity in the Heart of Software
A collection of notes I took while reading Domain-Driven Design - Tackling Complexity in the Heart of Software by Eric Evans.
The use of language
Our brains are somewhat specialized for dealing with complexity in spoken language. A great example: when people of different language backgrounds come together for commerce, if they don't have a common language they invent one, called a pidgin. The pidgin provides a much more basic means of communication than the speakers' original languages, but it's well suited to the task at hand.
When people are talking, they naturally discover differences in interpretation and the meaning of their words, and they naturally resolve those differences. They find rough spots in the language and smooth them out.
Domain-Driven Design helps us develop a language that is easy to understand for all stakeholders.
One model for everyone
Technical people often feel the need to "shield" the business experts from the domain model. If sophisticated domain experts don't understand the model, there is something wrong with the model. 💥
Layered software architecture
There are all sorts of ways a software system might be divided, but through experience and convention, the industry has converged on LAYERED ARCHITECTURES, and specifically a few fairly standard layers.
- User Interface (or Presentation Layer)
- Application Layer
- Domain Layer (or Model Layer)
- Infrastructure Layer
Develop a design within each layer that is cohesive and that depends only on the layers below.
Concentrate all the code related to the domain model in one layer and isolate it from the user interface, application, and infrastructure code.
An object defined primarily by its identity is called an ENTITY. Identity is not intrinsic to a thing in the world; it is a meaning superimposed because it is useful. An identifying attribute must be guaranteed to be unique within the system however that system is defined.
Why does object identity matter? An object must be distinguishable from other objects even when they have the same attributes. Mistaken identity can lead to data corruption.
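To make the identity point concrete for myself, here is a minimal Python sketch. The `Customer` class, its fields, and the use of a UUID as the identifying attribute are my own assumptions for illustration, not something taken from the book.

```python
from dataclasses import dataclass, field
import uuid


@dataclass(eq=False)  # eq=False: we define identity-based equality ourselves
class Customer:
    """Hypothetical ENTITY: two customers are the same only if their ids match."""
    name: str
    email: str
    customer_id: uuid.UUID = field(default_factory=uuid.uuid4)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Customer):
            return NotImplemented
        return self.customer_id == other.customer_id

    def __hash__(self) -> int:
        return hash(self.customer_id)


# Same attributes, different identity: two distinct customers.
a = Customer("Ann Lee", "ann@example.com")
b = Customer("Ann Lee", "ann@example.com")

# Different attributes, same identity: the same customer after a name change.
same_as_a = Customer("Ann Lee-Smith", "ann@example.com", customer_id=a.customer_id)
```

Comparing `a` and `b` by attributes would merge two different people, which is the kind of mistaken identity that corrupts data.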
Many objects have no conceptual identity. These objects describe some characteristic of a thing. An object that represents a descriptive aspect of the domain with no conceptual identity is called a VALUE OBJECT. VALUE OBJECTS are instantiated to represent elements of the design that we care about only for what they are, not who or which they are.
VALUE OBJECTS are used as attributes of ENTITIES.
Basic distinction between an Entity and a Value Object
An ENTITY represents something with continuity and identity, something that is tracked through different states or even across different implementations. A VALUE OBJECT is an attribute that describes the state of something else.
Immutability is a great simplifier in an implementation, making sharing and reference passing safe. If the value of an attribute changes, you use a different VALUE OBJECT, rather than modifying the existing one.
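A small sketch of an immutable VALUE OBJECT in Python, assuming a frozen dataclass is an acceptable way to model it. The `Address` type and its fields are hypothetical.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Address:
    """Hypothetical VALUE OBJECT: compared by attributes, never modified in place."""
    street: str
    city: str
    postal_code: str


home = Address("1 Main St", "Springfield", "12345")

# "Changing" the value means building a different VALUE OBJECT.
moved = replace(home, street="9 Elm St")

try:
    home.street = "somewhere else"  # type: ignore[misc]
    mutation_blocked = False
except FrozenInstanceError:
    # The frozen dataclass rejects in-place modification.
    mutation_blocked = True
```

Because `home` can never change, it is safe to share the same instance between several ENTITIES.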
There are important domain operations that can't find a natural home in an ENTITY or VALUE OBJECT. Some of these are intrinsically activities or actions, not things, but since our modeling paradigm is objects, we try to fit them into objects anyway. When we force an operation into an object that doesn't fit the object's definition, the object loses its conceptual clarity and becomes hard to understand or refactor.
A SERVICE is an operation offered as an interface that stands alone in the model, without encapsulating state, as ENTITIES and VALUE OBJECTS do.
Three characteristics of a good SERVICE:
- The operation relates to a domain concept that is not a natural part of an ENTITY or VALUE OBJECT.
- The interface is defined in terms of other elements of the domain model.
- The operation is stateless.
Domain and application Services collaborate with infrastructure Services
Example: A bank might have an application that sends an e-mail to a customer when an account balance falls below a specific threshold. The rules that describe when that happens live in a domain SERVICE. The interface that encapsulates the e-mail system, and perhaps alternate means of notification, is a SERVICE in the infrastructure layer.
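The bank example could be sketched roughly like this in Python. All names here (`Account`, `is_below_threshold`, `Notifier`, `InMemoryNotifier`, `notify_if_low`) are my own assumptions, and the in-memory notifier only stands in for a real e-mail system.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Protocol, Tuple


@dataclass(frozen=True)
class Account:
    owner_email: str
    balance: Decimal


def is_below_threshold(account: Account, threshold: Decimal) -> bool:
    """Domain SERVICE: a stateless rule expressed in terms of the model."""
    return account.balance < threshold


class Notifier(Protocol):
    """Interface of the infrastructure SERVICE that hides the delivery channel."""
    def send(self, recipient: str, message: str) -> None: ...


class InMemoryNotifier:
    """Stand-in for an e-mail gateway; it only records what would be sent."""
    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))


def notify_if_low(account: Account, threshold: Decimal, notifier: Notifier) -> bool:
    """Application SERVICE: coordinates the domain rule and the infrastructure."""
    if is_below_threshold(account, threshold):
        notifier.send(account.owner_email,
                      f"Balance {account.balance} is below {threshold}")
        return True
    return False
```

The rule about when to notify stays in the domain layer, while swapping e-mail for another channel only touches the `Notifier` implementation.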
Modules (a.k.a. Packages)
It is a truism that there should be low coupling between MODULES and high cohesion within them. It isn't just code being divided into MODULES, but concepts. There is a limit to how many things a person can think about at once (hence low coupling). Incoherent fragments of ideas are as hard to understand as an undifferentiated soup of ideas (hence high cohesion).
Why the Object Paradigm Predominates
Eric Evans is of the opinion that Domain-Driven Design fits best with OOP.
Many of the reasons teams choose the object paradigm are not technical, or even intrinsic to objects. But right out of the gate, object modeling does strike a nice balance of simplicity and sophistication.
Other interesting modeling paradigms just don't have this maturity. Some are too hard to master and will never be used outside small specialties. Others have potential, but the technical infrastructure is still patchy or shaky. These may come of age, but they are not ready for most projects.
A number of objects can be encapsulated within an AGGREGATE, which helps to limit the explosion of relationships and deep paths through object references. Each AGGREGATE has a root and a boundary. The boundary defines what is inside the AGGREGATE. The root is a single, specific ENTITY contained in the AGGREGATE.
Invariants, which are consistency rules that must be maintained whenever data changes, will involve relationships between members of the AGGREGATE. The root ENTITY has global identity and is ultimately responsible for checking invariants.
Root ENTITIES have global identity. ENTITIES inside the boundary have local identity, unique only within the AGGREGATE. Nothing outside the AGGREGATE boundary can hold a reference to anything inside, except to the root ENTITY.
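Here is how I would sketch these rules in Python, assuming a purchase order whose total may not exceed an approved limit. The class names and the limit invariant are illustrative assumptions, and the sketch targets Python 3.9 or newer.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class OrderLine:
    """Inner ENTITY: line_no is a local identity, unique only within one order."""
    line_no: int
    product: str
    amount: Decimal


class PurchaseOrder:
    """AGGREGATE root: the only entry point, and the place invariants are checked."""

    def __init__(self, order_id: str, approved_limit: Decimal) -> None:
        self.order_id = order_id  # global identity
        self._approved_limit = approved_limit
        self._lines: dict[int, OrderLine] = {}

    def total(self) -> Decimal:
        return sum((line.amount for line in self._lines.values()), Decimal("0"))

    def add_line(self, product: str, amount: Decimal) -> int:
        # Invariant: the order total never exceeds the approved limit.
        if self.total() + amount > self._approved_limit:
            raise ValueError("approved limit exceeded")
        line_no = len(self._lines) + 1
        self._lines[line_no] = OrderLine(line_no, product, amount)
        return line_no

    def lines(self) -> tuple[OrderLine, ...]:
        # Hand out immutable snapshots instead of references to internal state.
        return tuple(self._lines.values())
```

Clients hold a reference to the `PurchaseOrder` only, so every change to a line has to pass through the root and its invariant check.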
The two basic requirements for any good FACTORY are:
- Each creation method is atomic and enforces all invariants of the created object or AGGREGATE. A FACTORY should only be able to produce an object in a consistent state. For an ENTITY, this means the creation of the entire AGGREGATE, with all invariants satisfied, but probably with optional elements still to be added. For an immutable VALUE OBJECT, this means that all attributes are initialized to their correct final state.
- The FACTORY should be abstracted to the type desired, rather than the concrete class(es) created.
More information on the factory pattern.
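A minimal sketch of both requirements in Python. The `Account` hierarchy, the fee values, and the `open_account` function are hypothetical names I made up for the example.

```python
from abc import ABC, abstractmethod
from decimal import Decimal


class Account(ABC):
    """The abstract type that clients of the FACTORY depend on."""

    def __init__(self, balance: Decimal) -> None:
        self.balance = balance

    @abstractmethod
    def monthly_fee(self) -> Decimal: ...


class _CheckingAccount(Account):
    def monthly_fee(self) -> Decimal:
        return Decimal("5")


class _SavingsAccount(Account):
    def monthly_fee(self) -> Decimal:
        return Decimal("0")


def open_account(kind: str, opening_balance: Decimal) -> Account:
    """FACTORY: atomic creation that refuses to return an inconsistent object."""
    if opening_balance < 0:
        raise ValueError("opening balance must not be negative")
    if kind == "checking":
        return _CheckingAccount(opening_balance)
    if kind == "savings":
        return _SavingsAccount(opening_balance)
    raise ValueError(f"unknown account kind: {kind}")
```

The caller asks for an `Account` and never names a concrete class, so new account kinds can be added without touching client code.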
Without a REPOSITORY, domain logic moves into queries and client code, and the ENTITIES and VALUE OBJECTS become mere data containers. The sheer technical complexity of database access infrastructure quickly swamps the client code.
A REPOSITORY lifts a huge burden from the client by providing a simple, intention-revealing interface, so that the client can "ask" for what it needs in terms of the model. The interface is simple and conceptually connected to the domain model.
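An intention-revealing REPOSITORY interface might look like the following Python sketch. `Invoice`, `InvoiceRepository`, and the in-memory implementation are assumptions for illustration; a real implementation would hide database access behind the same interface.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Protocol


@dataclass
class Invoice:
    invoice_id: str
    due: date
    paid: bool = False


class InvoiceRepository(Protocol):
    """The client asks in terms of the model, not in terms of tables or SQL."""
    def add(self, invoice: Invoice) -> None: ...
    def by_id(self, invoice_id: str) -> Optional[Invoice]: ...
    def overdue(self, today: date) -> List[Invoice]: ...


class InMemoryInvoiceRepository:
    """Illustrative implementation that keeps invoices in a dictionary."""

    def __init__(self) -> None:
        self._items: Dict[str, Invoice] = {}

    def add(self, invoice: Invoice) -> None:
        self._items[invoice.invoice_id] = invoice

    def by_id(self, invoice_id: str) -> Optional[Invoice]:
        return self._items.get(invoice_id)

    def overdue(self, today: date) -> List[Invoice]:
        return [i for i in self._items.values() if not i.paid and i.due < today]
```

Client code calls `repo.overdue(today)` and stays free of query details, which keeps the domain logic in the model.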
The truth about refactoring
The returns from refactoring are not linear. Usually there is a marginal return for a small effort, and the small improvements add up. They fight entropy, and they are the frontline protection against a fossilized legacy. But some of the most important insights come abruptly and send a shock through the project.
A SPECIFICATION states a constraint on the state of another object, which may or may not be present. It has multiple uses, but one that conveys the most basic concept is that a SPECIFICATION can test any object to see if it satisfies the specified criteria.
We might need to specify the state of an object for one or more of these three purposes.
- To validate an object to see if it fulfills some need or is ready for some purpose
- To select an object from a collection (as in the case of querying for overdue invoices)
- To specify the creation of a new object to fit some need
The specification pattern allows the business rules to be chained and recombined using boolean logic.
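To see the boolean chaining in action, here is a minimal Python sketch. It wraps a predicate in a `Specification` class and overloads `&`, `|`, and `~`; that operator-based design and the `Invoice` fields are my own choices, not the book's implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class Specification(Generic[T]):
    """Wraps a predicate so that rules can be combined with &, | and ~."""

    def __init__(self, predicate: Callable[[T], bool]) -> None:
        self._predicate = predicate

    def is_satisfied_by(self, candidate: T) -> bool:
        return self._predicate(candidate)

    def __and__(self, other: "Specification[T]") -> "Specification[T]":
        return Specification(
            lambda c: self.is_satisfied_by(c) and other.is_satisfied_by(c))

    def __or__(self, other: "Specification[T]") -> "Specification[T]":
        return Specification(
            lambda c: self.is_satisfied_by(c) or other.is_satisfied_by(c))

    def __invert__(self) -> "Specification[T]":
        return Specification(lambda c: not self.is_satisfied_by(c))


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    due: date
    amount: int
    paid: bool


def overdue(today: date) -> Specification[Invoice]:
    return Specification(lambda inv: not inv.paid and inv.due < today)


large: Specification[Invoice] = Specification(lambda inv: inv.amount >= 1000)


def select(invoices, spec):
    """Selection use case: keep the invoices that satisfy the specification."""
    return [inv for inv in invoices if spec.is_satisfied_by(inv)]
```

The same `overdue(today) & large` object can validate a single invoice or select from a collection, which covers the first two purposes in the list above.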
Operations can be broadly divided into two categories, commands and queries. Queries obtain information from the system, possibly by simply accessing data in a variable, possibly performing a calculation based on that data. Commands (also known as modifiers) are operations that effect some change to the system (for a simple example, by setting a variable). In standard English, the term side effect implies an unintended consequence, but in computer science, it means any effect on the state of the system.
Place as much of the logic of the program as possible into functions, operations that return results with no observable side effects. Strictly segregate commands (methods that result in modifications to observable state) into very simple operations that do not return domain information. Further control side effects by moving complex logic into VALUE OBJECTS when a concept fitting the responsibility presents itself.
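A rough Python sketch of this separation, loosely inspired by the book's paint-mixing example. The `Paint` and `MixingBucket` names, the fields, and the weighted-average blending formula are my own simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paint:
    """VALUE OBJECT that holds the complex logic as a side-effect-free function."""
    volume: float
    red: int
    green: int
    blue: int

    def mixed_with(self, other: "Paint") -> "Paint":
        # Function: returns a new Paint and leaves both inputs untouched.
        total = self.volume + other.volume

        def blend(mine: int, theirs: int) -> int:
            # Volume-weighted average of one color channel.
            return round((mine * self.volume + theirs * other.volume) / total)

        return Paint(total,
                     blend(self.red, other.red),
                     blend(self.green, other.green),
                     blend(self.blue, other.blue))


class MixingBucket:
    """Holds mutable state; its command is trivial and returns nothing."""

    def __init__(self, contents: Paint) -> None:
        self._contents = contents

    @property
    def contents(self) -> Paint:
        # Query: reads state without changing it.
        return self._contents

    def pour_in(self, paint: Paint) -> None:
        # Command: the only place where observable state changes.
        self._contents = self._contents.mixed_with(paint)
```

All the arithmetic can be tested through `mixed_with` without setting up any state, and `pour_in` is simple enough that there is little left to get wrong.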
Generating a running program from a declaration of model properties is a kind of Holy Grail of Domain-Driven Design, but it does have its pitfalls in practice:
- A declaration language is often not expressive enough to do everything needed, while the framework makes it very difficult to extend the software beyond the automated portion
- Code-generation techniques cripple the iterative cycle by merging generated code into handwritten code in a way that makes regeneration very destructive
Strategy (a.k.a. Policy)
Domain models contain processes that are not technically motivated but actually meaningful in the problem domain. When alternative processes must be provided, the complexity of choosing the appropriate process combines with the complexity of the multiple processes themselves, and things get out of hand.
Factor the varying part of a process into a separate "strategy" object in the model. Factor apart a rule and the behavior it governs. Implement the rule or substitutable process following the STRATEGY design pattern. Multiple versions of the strategy object represent different ways the process can be done.
More information on the strategy pattern.
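A compact Python sketch of the idea, assuming a routing scenario where the "best" leg of a journey depends on a substitutable policy. The names `Leg`, `LegMagnitudePolicy`, `FastestPolicy`, `CheapestPolicy`, and `choose_leg` are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Leg:
    name: str
    hours: float
    cost: float


class LegMagnitudePolicy(Protocol):
    """The varying part of the process: how the 'size' of a leg is measured."""
    def magnitude(self, leg: Leg) -> float: ...


class FastestPolicy:
    def magnitude(self, leg: Leg) -> float:
        return leg.hours


class CheapestPolicy:
    def magnitude(self, leg: Leg) -> float:
        return leg.cost


def choose_leg(legs: Sequence[Leg], policy: LegMagnitudePolicy) -> Leg:
    """The stable part of the process: pick the leg with the smallest magnitude."""
    return min(legs, key=policy.magnitude)


legs = [Leg("air", hours=5, cost=900), Leg("sea", hours=200, cost=150)]
fastest = choose_leg(legs, FastestPolicy())
cheapest = choose_leg(legs, CheapestPolicy())
```

The selection logic is written once, and each policy object represents one meaningful business choice rather than a technical switch.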
A Design for Developers
Software isn't just for users. It's also for developers. Developers have to integrate code with other parts of the system. In an iterative process, developers change the code again and again.
If you wait until you can make a complete justification for a change, you've waited too long. Your project is already incurring heavy costs, and the postponed changes will be harder to make because the target code will have been more elaborated and more embedded in other code.
Maintaining Model Integrity
Although we seldom think about it explicitly, the most fundamental requirement of a model is that it be internally consistent; that its terms always have the same meaning, and that it contain no contradictory rules.
Group responsibilities related to a specific business requirement in a BOUNDED CONTEXT.
Explicitly define the context within which a model applies. Explicitly set boundaries in terms of team organization, usage within specific parts of the application, and physical manifestations such as code bases and database schemas. Keep the model strictly consistent within these bounds, but don't be distracted or confused by issues outside.
Relationships between Bounded Contexts
- Separate Ways: Integration is always expensive. Sometimes the benefit is small. In that case, declare a BOUNDED CONTEXT to have no connection to the others at all.
- Shared Kernel: Designate some subset of the domain model that the two teams agree to share. Of course this includes, along with this subset of the model, the subset of code or of the database design associated with that part of the model. This explicitly shared stuff has special status, and shouldn't be changed without consultation with the other team.
- Conformist: Eliminate the complexity of translation by adopting the upstream team's model as-is. It is not fun to be a Conformist. 😥
- Anticorruption Layer: Create an isolating layer to provide clients with functionality in terms of their own domain model. The layer talks to the other system through its existing interface, requiring little or no modification to the other system. Internally, the layer translates in both directions as necessary between the two models.
- Open Host Service: Define a protocol that gives access to your subsystem as a set of SERVICES. Open the protocol so that all who need to integrate with you can use it. Enhance and expand the protocol to handle new integration requirements, except when a single team has idiosyncratic needs. Then, use a one-off translator to augment the protocol for that special case so that the shared protocol can stay simple and coherent.
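Of the relationships above, the Anticorruption Layer is the easiest one to show in code. Below is a minimal Python sketch that translates in one direction only (from the other system into our model). `LegacyCrm`, its record format, and `CustomerGateway` are invented stand-ins, not a real system or API.

```python
from dataclasses import dataclass
from typing import Dict, Union


class LegacyCrm:
    """Stand-in for the other system, exposing its existing interface unchanged."""

    def fetch(self, cust_no: int) -> Dict[str, Union[int, str]]:
        return {"CUST_NO": cust_no, "NM": "LEE, ANN", "STAT": "A"}


@dataclass(frozen=True)
class Customer:
    """Our own model, free of the legacy system's names and encodings."""
    customer_id: str
    full_name: str
    active: bool


class CustomerGateway:
    """ANTICORRUPTION LAYER: clients only ever see our model."""

    def __init__(self, legacy: LegacyCrm) -> None:
        self._legacy = legacy

    def find(self, customer_id: str) -> Customer:
        record = self._legacy.fetch(int(customer_id))
        # Translate "LAST, FIRST" into "First Last".
        last, first = [part.strip() for part in str(record["NM"]).split(",", 1)]
        return Customer(
            customer_id=str(record["CUST_NO"]),
            full_name=f"{first} {last}".title(),
            active=record["STAT"] == "A",  # "A" is assumed to mean active
        )
```

If the legacy record format changes, only `CustomerGateway` has to change, and the rest of the model stays untouched.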
Identify cohesive subdomains that are not the motivation for your project. Factor out generic models of these subdomains and place them in separate MODULES. Leave no trace of your specialties in them. Once they have been separated, give their continuing development lower priority than the CORE DOMAIN, and avoid assigning your core developers to the tasks (because they will gain little domain knowledge from them). Also consider off-the-shelf solutions or published models for these GENERIC SUBDOMAINS.
This was an interesting read. I can't escape the feeling that the book could be made shorter and still be very valuable. The writing style is rather elaborate. The author likes to come up with terms like supple design or ubiquitous language, which I don't see in common use outside of the book. Some of the examples of modeling real world applications were great.