The Great Domain Model Debate – Solved!

In almost every version 1.0 system I design, I end up endlessly rehearsing the pros and cons of different implementations of the domain model (or lack of it). It’s getting so tedious that I recently decided to answer the question to my own satisfaction. I produced a spreadsheet listing as many design factors as I could think of, along with the different models that I have worked with or considered over the years. I attached a weight to each design factor, then assigned a score to each model for each factor based on how well I thought it performed. I then summed the weighted scores for each model to produce an overall score for that model. I was glad to see that the Anemic Domain Model won, and was not surprised to see that performance, intelligibility and strong typing won out over cross-platform support and publishability.

I included a few models that I wouldn’t even dream of using, such as typed and untyped associative containers and raw XmlDocuments, for the sake of not giving in (too much) to my own bias. As a matter of fact, typed associative containers (i.e. typed wrappers around Hashtables) scored better than plain old DataSets or raw XmlDocuments. These last two I have seen actively promoted by teams of Microsoft architects who ought to know better. Also unsurprisingly, the worst score came from the untyped associative container (i.e. Hashtable or TreeSet or whatever). Nevertheless, this model is employed by a disproportionate number of lazy designers who can’t be bothered to design a decent object model. It is a particularly popular solution for configuration systems. It has even been enshrined as the solution of choice in the .NET framework! I guess I should take this chance to reiterate my call for designers and developers to avoid this model at all costs – it has absolutely no redeeming features!
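To make the typed versus untyped distinction concrete, here is a minimal sketch in Python rather than .NET. The class `TypedSettings` and the keys `timeout_seconds` and `retries` are hypothetical names chosen for illustration; they do not come from any framework. In a statically typed language the misspelled member would be rejected at compile time, whereas Python only fails at run time.

```python
from typing import Any, Dict

# Untyped associative container: any key, any value, nothing is checked.
# The keys and values here are made up for illustration.
raw_settings: Dict[str, Any] = {"timeout_seconds": "30", "retries": 3}


class TypedSettings:
    """Hypothetical typed wrapper around an untyped dictionary."""

    def __init__(self, raw: Dict[str, Any]) -> None:
        self._raw = dict(raw)  # private copy; callers cannot add stray keys

    @property
    def timeout_seconds(self) -> int:
        # The conversion to int happens once, here, instead of at every call site.
        return int(self._raw["timeout_seconds"])

    @property
    def retries(self) -> int:
        return int(self._raw["retries"])


def misspelled_lookup_fails_loudly(settings: TypedSettings) -> bool:
    """Return True if a misspelled member raises instead of yielding a default."""
    try:
        settings.timeout_second  # deliberate typo: no such property
    except AttributeError:
        return True
    return False
```

With the raw dictionary, `raw_settings.get("timeout_second")` quietly returns `None` and the value that does exist is still a string; the wrapper turns both problems into something visible at the point of use.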

I also included what I have called Transactional DTOs, a design that was mentioned to me as a serious proposal by Mitch Denny. I have never used this approach, so I may not have done it justice. Even so it scored highly, coming in only behind typed data sets, and only because I couldn’t find a way to base a system entirely on them. As a result they score poorly on searchability under current and future APIs. If they were paired with an Anemic Domain Model then the system might be very elegant. I hope that readers can write in to report on good examples of this design in use. How about it, Mitch? Have you written about this approach in any depth? I have never tried Naked Objects either, and my understanding is based on articles that I read a few years ago. Things may have moved on quite a bit. Check out this article for more.

Each cell in Figure 1 represents my assessment (from 0 to 1) of the strength of the given implementation technique for each design factor. As I said before, some of these are off-the-cuff judgements about models that I have had less exposure to than the tried-and-true idioms. They are to a certain degree subjective, but I think I’ve been fair. I expect the weights to vary from project to project, but experience tells me that you flout them at your peril! The scores in Figure 2 represent how well each idiom scored for each design factor, and the totals at the bottom represent the overall quality of each model.

Figure 1. Weights on each design factor, and strength of each model type for that factor.

Figure 2. Scores of each idiom, based on the sum of the weighted scores.
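The arithmetic behind the two figures is a plain weighted sum, and can be sketched as follows. The factor names, weights and strengths below are placeholder values invented for illustration; they are not the numbers from the spreadsheet, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Placeholder weights per design factor (NOT the values from Figure 1).
WEIGHTS: Dict[str, float] = {
    "performance": 0.9,
    "strong_typing": 0.8,
    "cross_platform": 0.3,
}

# Placeholder strengths (0 to 1) of each model for each factor.
STRENGTHS: Dict[str, Dict[str, float]] = {
    "anemic_domain_model": {"performance": 0.9, "strong_typing": 1.0, "cross_platform": 0.4},
    "untyped_container": {"performance": 0.5, "strong_typing": 0.0, "cross_platform": 0.6},
}


def weighted_score(weights: Dict[str, float], strengths: Dict[str, float]) -> float:
    """Sum of weight * strength over every design factor."""
    return sum(weights[factor] * strengths[factor] for factor in weights)


def rank_models(
    weights: Dict[str, float], models: Dict[str, Dict[str, float]]
) -> List[Tuple[str, float]]:
    """Return (model, total score) pairs, best first."""
    totals = {name: weighted_score(weights, s) for name, s in models.items()}
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
```

Calling `rank_models(WEIGHTS, STRENGTHS)` ranks the models under one set of priorities; passing a different weights dictionary, for example one that values cross-platform support far above everything else, can reverse the order, which is why the weights deserve as much scrutiny as the strengths.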

As you can see, the Anemic Domain Model scored best because of its performance, strong typing, good support for relationships, encapsulation, and simplicity. That’s despite the fact that it underperforms on cross-platform support, publishability and transactions. If you have specific non-functional requirements, then you might need to adjust the weights in Figure 1 and reassess the model you use. The chart doesn’t take into account other design considerations that might boost the score of certain models. Those include IDE support, non-standard API support, developer familiarity and published reference applications. It also doesn’t try to assess the cost of changing a model in legacy code, which is always likely to be prohibitive. The fact that it’s prohibitive means that you have to get it right first time. The result also seems to imply that someone at Microsoft did the same exercise and finally realized that some of their sales pitches were not made in the best interests of their clients! That would explain why they have started making the transition to ORM technologies and Language Integrated Query (LINQ). The benefits are quite clear when you tabulate them.

Lastly, I wonder whether there are quantitative assessments that can be applied to other semi-religious debates in software engineering. Can you think of any? I’m not going to consider linguistic debates – I’m thinking more of design issues such as deciding between different display/navigation patterns, or deciding whether to run stored procedures, dynamic queries, CLR stored procedures or what have you. What do you agonize over most? Perhaps there is a systematic way for you to choose in future? Why don’t you choose some idiom that you have had a lot of experience with, and give it the same treatment? I’ll create a link from this page to yours. Perhaps if enough people produced models like this, then we could create a ready-reckoner for choosing designs.

16 comments

  1. Hi Lars,

    It is regarded as an anti-pattern by people of that religious persuasion. I prefer to disagree. Moreover I wanted to make my own mind up in a systematic way rather than just accept the authority of people like Martin Fowler (for whom I have great respect) without question.

    Forgive me for the overworked metaphor that follows, but just bear with me. Anaemic domain objects are the carriers of meaning in a software system in the same sense that words carry meaning in a sentence. Like words, these objects don’t carry their own context, rather they are embedded in it. In this metaphor the sentence is roughly equivalent to the transaction script (Session Bean, Business Logic etc).

    As such the value of the Anaemic Domain Model (ADM) is that it represents the _vocabulary_ of the business. It is a vital part of the intellectual property of a business, complemented by Business Logic of course, and actuated by the reams of pointless boilerplate code needed to make it all work. I think that, as Fowler points out, the ADM is experiencing a resurgence because of the following things:

    – For enterprise wide reuse, the ADM is more useful if it is not encumbered by business rules from some specific application in the enterprise
    – ADM objects work well with ORM systems
    – Rich Domain Models can’t be regenerated by code generators (well, they can in C# 2.0 using partial classes, but they couldn’t before) without resorting to complex inheritance models
    – Rich Domain Models can’t be distributed to third parties (unless they are rendered at least a little anaemic)
    – Business Rules can’t be serialised by SOAP :-)

    I am therefore inclined to regard Rich Domain Models as the anti-pattern for these reasons.
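    The split described above, with data-only objects as the vocabulary and a transaction script as the sentence, can be sketched roughly like this. `Account` and `transfer` are hypothetical names used only for illustration, and balances are held as whole cents to keep the arithmetic exact.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Anemic domain object: carries data, holds no business rules."""
    account_id: str
    balance_cents: int


def transfer(source: Account, target: Account, amount_cents: int) -> None:
    """Transaction script: the business rule lives here, outside the objects."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    if source.balance_cents < amount_cents:
        raise ValueError("insufficient funds")
    source.balance_cents -= amount_cents
    target.balance_cents += amount_cents
```

    Because `Account` knows nothing about transfers, the same class can be handed to an ORM, a serializer or another application without dragging this particular rule along with it.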

    Andrew

  2. Can you elaborate on the definitions of the different container types above?

    For example, what does transactional DTO mean? In theory, I know what this means, but everyone’s definition of this stuff differs.

    Also, I don’t see “XSD-based messages” above. These are cross-platform, support IntelliSense, are transactional, etc. The big drawbacks are query support and changeset tracking.

    Also, along those same lines, I do not see WCF DataContracts.

  3. Good article. I use an ADM all the time, especially on top of a basic framework that I use for medium-scale web-based applications. I find that the ease of use, readability and scalability that I get from this model far exceed what I get from bloated datasets or even a rich domain model.

  4. Interesting concept: I particularly like the concept of using numeric weights and rankings, rather than just claims of preference. But we need to realize that we can use numbers to express our level of subjective feelings, and that doing so doesn’t make the results objective.

    I notice that, according to your numbers, the top two recommended approaches are: (#1) Anemic Domain Model, and (#2) Rich Domain Model, with all others trailing behind these. The difference between the top two is about 10% of the difference between the top and bottom (best and worst) approaches. So I’d say that even your numbers aren’t a strong argument against using RDM; they only show a moderate advantage of ADM, while showing that RDM is still better than anything else (other than ADM).

    About 46% of the difference between ADM and RDM is that you feel that ADM is twice as “encapsulatable.” I think it would be helpful to have a better idea of what you mean by the values, like “encapsulatable.” When I consider how well a class encapsulates its fields, I use the information hiding principles, concluding that an ADM object that exposes all fields with getters and setters is not well encapsulated, while an RDM object that hides its fields and exposes only operations is very well encapsulated.

    For about 23% of the difference between ADM and RDM, you feel that ADM is nearly twice as maintainable. My experience has generally been the opposite: For example, when interpretation and enforcement of codes and states is distributed throughout all the procedural modules that use a record, rather than centralized and encapsulated in the class that represents that record, then ensuring that all actions conform to the rules and changing the rules can be difficult, costly and risky.
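    As an illustration of the centralized alternative described here, the following minimal sketch keeps the state codes and the rules for changing them inside one class. The `Order` class, its states and its transitions are hypothetical and chosen only for the example.

```python
class Order:
    """Hypothetical rich domain object: status is hidden, transitions are enforced here."""

    # Which states may follow which; the only place this rule is written down.
    _ALLOWED = {
        "new": {"paid", "cancelled"},
        "paid": {"shipped"},
        "shipped": set(),
        "cancelled": set(),
    }

    def __init__(self) -> None:
        self._status = "new"

    @property
    def status(self) -> str:
        # Read-only view: there is no setter, so callers cannot assign a status.
        return self._status

    def _move_to(self, new_status: str) -> None:
        if new_status not in self._ALLOWED[self._status]:
            raise ValueError(f"cannot go from {self._status} to {new_status}")
        self._status = new_status

    def pay(self) -> None:
        self._move_to("paid")

    def ship(self) -> None:
        self._move_to("shipped")

    def cancel(self) -> None:
        self._move_to("cancelled")
```

    Changing a rule, such as allowing a paid order to be cancelled, then means editing one table rather than hunting through every procedural module that touches the record.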

    (For completeness, the remaining differences are: technological intrusiveness (15%), complexity of programming model (11%), and ease of extensibility (5%).)

    One last thought about the numbers: I find it interesting that “naked objects” gets a much different and much worse score than either ADM or RDM; only three others scored worse. I find this interesting because the Naked Objects Framework, which you link to in your article (see http://nakedobjects.org/ for more info) is based heavily on the assumption that you will build a Rich Domain Model: With Naked Objects, there are no additional layers of logic between the GUI and the domain model, so unless all operations the user is interested in are in the domain model, the user will not be able to invoke them. And unless the operations are directly on the relevant business domain classes, the system will probably be difficult and confusing to use.

    _ _ _

    Overall, I found this article and in particular your approach interesting and informative. I like the ideas; I just don’t agree with the conclusions.

  5. P.S. Both tables get cut off on the right (in IE7 on a large screen). This is unfortunate, as the tables are quite well done. The coloring of the numbers is nice.

    (As a nitpick suggestion: I might add border lines between the horizontal and vertical titles.)

    Thank you for the thought-provoking posting!
    – jeff

  6. Hi Jeff,

    Your last comment (#7) seems to be some kind of trackback, but I can’t find your blog to see what you wrote…
    …could you reply with your URL?

    Cheers

    Andrew

  7. Great article. I have similar thoughts, and I was trying to establish a best practice with LINQ and VS2008.

    Rich domain models are a somewhat unbalanced part of OO programming; in some ways they are a radical approach to programming. I believe we should stay in the gray zone instead of pursuing something really pure.

  8. Nice study Andrew. Hey sounds like my intuition has served me well too ;)

    I’ve always disagreed (partially) with Martin Fowler’s liking for RDM. Personally I find it curious how negative and biased people are towards the procedural/functional way of organising functionality. I mean, I get the whole object/encapsulation gig, especially where we really are modelling complex objects. But conversely, why don’t people ‘get’ that there are also very real practical (and logical) benefits to separating business logic from the related entities? That is, entities can, and should, be seen to be operated upon by different ‘agents’, just like oxygen has its place in lungs as well as in your WRX. Sure, we can argue that O2 has functionality that is bound within it, but I’m not talking about that; in most systems we build, we are concerned with how entities are acted upon and manipulated by various agents. IMHO ADM maps to reality better, since it tries not to make assumptions about exactly which functionality is bound to which entities within this or that context. Another interesting example of a particular pattern matching reality better is the way concurrent programming goes so far in solving problems in a variety of domains in which typical languages would stumble.

  9. I’m not sure about the analogy of the Anaemic Domain Model being the vocabulary of the business.

    Let me translate my last sentence as it would appear if defined by an Anaemic domain model.

    I the analogy the Anaemic Domain Model vocabulary business.

    You see, all the nouns are there, and their relationships to each other, but a vocabulary needs verbs and adjectives, which is what the Rich Domain Model provides :)

  10. Hi Lindsay,

    I note that your sentence does not use the same words for both noun and verb – they are separate words, which is handy because it means that you can use your verbs to refer to different nouns, and use your nouns in conjunction with different verbs. The same applies to domain models. If you bind your functionality to the domain model, then the domain model is then unusable in other contexts within the business. The separation of domain model and business rule happened to allow just this sort of reuse.

    Of course the re-application of a business rule with different domain entities is tricky – but if you look into generic algorithms, you’ll see that they do just that job. The Standard Template Library is one of the most respected collection and algorithm libraries in any language, and it deliberately and consciously separated algorithms (verbs) from collections (nouns) for exactly the reasons stated above. QED ;-)
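    The same separation of verbs from nouns can be sketched outside C++. In this minimal Python example the rule `has_debt` and the algorithm `select_where` are written once and applied to two unrelated entity types; `Invoice`, `Subscription` and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def select_where(items: Iterable[T], rule: Callable[[T], bool]) -> List[T]:
    """Generic algorithm (verb): knows nothing about the entities it filters."""
    return [item for item in items if rule(item)]


@dataclass
class Invoice:
    number: str
    amount_due: int


@dataclass
class Subscription:
    plan: str
    amount_due: int


def has_debt(entity) -> bool:
    """Business rule that works on any entity exposing an amount_due field."""
    return entity.amount_due > 0
```

    `select_where(invoices, has_debt)` and `select_where(subscriptions, has_debt)` reuse both the algorithm and the rule, and neither entity class had to be written with that use in mind.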
