Is it really impossible to choose between LINQ and Stored Procedures?

For the mathematician there is no Ignorabimus, and, in my opinion, not at all for natural science either. … The true reason why [no one] has succeeded in finding an unsolvable problem is, in my opinion, that there is no unsolvable problem. In contrast to the foolish Ignorabimus, our credo avers:
We must know,
We shall know.

It’s that time of the month again – when all of the evangelically inclined mavens of Readify gather round to have the traditional debate. Despite the fact that they’ve had similar debates for years, they tend to tackle the arguments with gusto, trying to find a new angle of attack from which to sally forth in defence of their staunchly held position. You may (assuming you never read the title of the post :) be wondering what it is that could inspire such fanatical and unswerving devotion. What is it that could polarise an otherwise completely rational group of individuals into opposing camps that each consider the other completely mad?

What is this Lilliputian debate? I’m sure you didn’t need to ask, considering it is symptomatic of the gaping wound in the side of modern software engineering. This flaw in software engineering is the elephant in the room that nobody talks about (although they talk an awful lot about the lack of space).

The traditional debate is, of course:

What’s the point of a database?

And I’m sure that there’s a lot I could say on the topic (there sure was yesterday) but the debate put me in a thoughtful mood. The elephant in the room, the gaping wound in the side of software engineering is just as simply stated:

How do we prove that a design is optimal?

That is the real reason we spend so much of our time rehearsing these architectural arguments, trying to win over the other side. Nobody gets evangelical about something they just know – they only evangelise about things they are not sure about. Most people don’t proclaim to the world that the sun will rise tomorrow. But like me, you may well devote a lot of bandwidth to the idea that the object domain is paramount, not the relational. As an object oriented specialist that is my central creed and highest article of faith. The traditional debate goes on because we just don’t have proof on either side. Both sides have thoroughly convincing arguments, and there is no decision procedure to choose between them.

So why don’t we just resolve it once and for all? The computer science and software engineering fraternity is probably the single largest focussed accumulation of IQ points gathered in the history of mankind. They all focus intensively on issues just like this. Surely it is not beyond them to answer the simple question of whether we should put our business logic into stored procedures or use an ORM product to dynamically generate SQL statements. My initial thought was “We Must Know, We Will Know” or words to that effect. There is nothing that can’t be solved given enough resolve and intelligence. If we have a will to do so, we could probably settle on a definitive way to describe an architecture so that we can decide what is best for a given problem.

Those of you who followed the link at the top of the post will have found references there to David Hilbert, and that should have given you enough of a clue to know that my initial sentiment was probably a pipe dream. If you are still in the dark, I’m referring to Hilbert’s Entscheidungsproblem (or the Decision Problem in English) and I beg you to read Douglas Hofstadter’s magnificent Gödel, Escher, Bach: An Eternal Golden Braid. This book is at the top of my all-time favourites list, and among the million interesting topics it covers, the decision problem is central.

The Decision Problem – a quick detour

One thing you’ll notice about the Entscheidungsproblem and Turing’s Halting Problem is that they are equivalent. They seem to be asking about different things, but at a deeper level the problems are the same. The decision problem asks whether there is a mechanical procedure to determine the truth of any mathematical statement. In Hilbert’s day one might have imagined some procedure that cranked through every derivation from the axioms of mathematical logic until it found a proof of the statement, at which point it returned true. The problem with that brute-force approach is that mathematics allows a continual complexification and simplification of statements – it is non-monotonic. The implication is that even when you have applied every combination of the construction rules to all of the axioms up to a given length, you can’t know whether there are further statements of that length, not yet in your list, that could be reached by repeatedly applying growth and shrinkage rules. That means that even though you may think you have a definitive list of all the true statements of a given length, you may be wrong, so you can never answer false; you can only continue until you find a concrete proof or disproof.

Because of these non-monotonic derivation rules, you will never be sure that no answer from your procedure means an answer of false. You will always have to wait and see. This is the equivalence between the Entscheidungsproblem and Alan Turing’s Halting Problem. If you knew your procedure would not halt, then you would just short-circuit the decision process and immediately answer false. If you knew that the procedure would halt, then you would just let it run and produce whatever true/false answer it came up with – either way, you would have a decision procedure. Unfortunately it’s not that easy, because the halting decision procedure has no overview of the whole of mathematics either, and can’t give an answer to the halting question. Ergo there is no decision procedure either. Besides, Kurt Gödel proved that there are undecidable statements, and Alonzo Church and Turing went on to show that the quest for a decision procedure was doomed to fail: even if you came up with a more sophisticated procedure than the brute-force attack, you would still never get a decision procedure for all of mathematics.
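To make the "wait and see" point concrete, here is a minimal Python sketch of a brute-force proof search. As a stand-in for the axioms of mathematical logic it assumes the MIU system from Gödel, Escher, Bach, which has rules that both lengthen and shorten strings. The function names and the budget parameter are my own illustrative choices, not anything taken from Hilbert or Turing.

```python
from collections import deque


def successors(s):
    """Apply each MIU rule once, everywhere it fits, to the string s."""
    out = set()
    if s.endswith("I"):            # Rule 1: xI -> xIU (growth)
        out.add(s + "U")
    if s.startswith("M"):          # Rule 2: Mx -> Mxx (growth)
        out.add(s + s[1:])
    for i in range(len(s) - 2):    # Rule 3: III -> U (shrinkage)
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):    # Rule 4: UU -> nothing (shrinkage)
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def semi_decide(target, axiom="MI", budget=1_000):
    """Breadth-first search for a derivation of target.

    Returns True if a derivation turns up within the budget, otherwise
    None. It never returns False: because of the shrinkage rules, a
    short target might only be reachable through much longer strings
    that have not been examined yet.
    """
    seen = {axiom}
    queue = deque([axiom])
    examined = 0
    while queue and examined < budget:
        current = queue.popleft()
        examined += 1
        if current == target:
            return True
        for nxt in successors(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # "don't know yet", not "no"


print(semi_decide("MUI"))             # True: MI -> MII -> MIIII -> MUI
print(semi_decide("MU", budget=300))  # None, however large the budget
```

Hofstadter shows that MU is not derivable, but he does it by stepping outside the system and reasoning about it. The search itself can only keep running.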

The Architectural Decision Problem

What has this got to do with deciding on the relative merits of two software designs? Is the issue of deciding between two designs also equivalent to the decision problem? Is it a constraint optimisation problem? Could you enumerate the critical factors, assign a rank to them and then sum the scores for each design? That is exactly what I did in one of my recent posts entitled “The great Domain Model Debate – Solved!” Of course the ‘Solved!’ part was partly tongue-in-cheek – I just provided a decision procedure for readers to distinguish between the competing designs of domain models.
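The kind of decision procedure described above can be sketched in a few lines of Python. The criteria names, weights and scores below are invented purely for illustration (they are not the figures from the earlier post, and they are not a verdict on either approach); a higher score means better on that criterion.

```python
def score_design(scores, weights):
    """Weighted sum of a design's scores over every weighted criterion."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


# Hypothetical weights: how much each criterion matters to us.
weights = {"simplicity": 3, "performance": 2, "maintainability": 4}

# Hypothetical scores out of 5 for two competing designs.
designs = {
    "stored_procedures": {"simplicity": 2, "performance": 5, "maintainability": 2},
    "orm": {"simplicity": 4, "performance": 3, "maintainability": 4},
}

ranking = sorted(
    designs,
    key=lambda name: score_design(designs[name], weights),
    reverse=True,
)
print(ranking)
```

Change the weights and the ranking can flip, which is where the accusations of subjectivity come from.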

One of the criticisms levelled at my offering for this problem was that my weights and scores were too subjective. I maintained that although my heuristic was flawed, it held the key to solving these design issues because there was the hope that there are objective measures of the importance of design criteria for each design, and it was possible to quantify the efficacy of each approach. But I’m beginning to wonder whether that’s really true. Let’s consider the domain model approach for a moment to see how we could quantify those figures.

Imagine that we could enumerate all of the criteria that pertained to the problem. Each represents an aspect of the value that architects place in a good design. In my previous post I considered such things as complexity, density of data storage, performance, maintainability etc. Obviously each of these figures varies in just how subjective it is. Complexity is a measure of how easy it is to understand. One programmer may be totally at home with a design whereas another may be confused. But there are measures of complexity that are objective that we could use. We could use that as an indicator of maintainability – the more complex a design is, the harder it would be to maintain.

This complexity measure would be more fundamental than any mere subjective measure, and would be tightly correlated with the subjective measure. Algorithmic complexity would be directly related to the degree of confusion a given developer would experience when first exposed to the design. Complexity affects our ability to remember the details of the design (as it is employed in a given context) and also our ability to mentally visualise the design and its uses. When we give a subjective measure of something like complexity, it may be due to the fact that we are looking at it from the wrong level. Yes, there is a subjective difference, but that is because of an objective difference that we are responding to.
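As one example of an objective complexity measure, here is a rough Python sketch that approximates McCabe's cyclomatic complexity by counting decision points in a piece of source code. It uses the standard library ast module; the choice of which node types count as decisions is my own simplification, so treat the numbers as indicative rather than as the canonical metric.

```python
import ast

# Node types treated as one decision point each (a simplifying assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def approx_cyclomatic_complexity(source):
    """Return 1 plus the number of decision points found in source."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra branches, not one.
            decisions += len(node.values) - 1
    return 1 + decisions


sample = """
def total_even_positives(xs):
    total = 0
    for x in xs:
        if x > 0 and x % 2 == 0:
            total += x
    return total
"""
print(approx_cyclomatic_complexity(sample))
```

Two developers may disagree about how confusing the sample is, but they will both get the same number out of the function.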

It’s even possible to prove that such variables exist, so long as we are willing to agree that a subjective dislike that is completely whimsical is not worth incorporating into an assessment of a design’s worth. I’m thinking of such knee-jerk reactions as ‘we never use that design here’ or ‘I don’t want to use it because I heard it was no good’. Such opinions, whilst strongly felt, are of no value, because they don’t pertain to the design per se but rather to a free-standing psychological state in the person who has them. The design could still be optimal, but that wouldn’t stop them from having that opinion. Confusion, on the other hand, has its origin in some aspect of the design, and thus should be factored in.

For each subjective criterion that we currently use to judge a design, there must be a set of objective criteria that cause it. If there are none, then we can discount it – it contributes nothing to an objective decision procedure – it is just a prejudice. If there are objective criteria, then we can substitute all occurrences of the subjective criterion in the decision procedure with the set of objective criteria. If we continue this process, we will eventually be left with nothing but objective criteria. At that point are we in a position to choose between two designs?
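The substitution process can be sketched as a small recursive function in Python. Everything in the example data is hypothetical: the criterion names, the causal links between them and the set of things assumed to be directly measurable. The sketch also assumes the causal links contain no cycles.

```python
def objectify(criterion, causes, objective):
    """Replace a criterion with the objective criteria that cause it.

    Returns an empty set when nothing measurable lies behind the
    criterion, in which case it is a prejudice and can be discounted.
    """
    if criterion in objective:
        return {criterion}
    found = set()
    for cause in causes.get(criterion, ()):
        found |= objectify(cause, causes, objective)
    return found


# Hypothetical data: what we assume we can measure directly...
objective = {"cyclomatic_complexity", "lines_of_code", "query_latency_ms"}

# ...and what we assume lies behind each subjective judgement.
causes = {
    "confusing": ["hard_to_visualise", "cyclomatic_complexity"],
    "hard_to_visualise": ["lines_of_code"],
    "we_never_use_that_here": [],
}

print(objectify("confusing", causes, objective))
print(objectify("we_never_use_that_here", causes, objective))
```

The first call yields the two measurable criteria behind the confusion; the second yields an empty set, so that objection drops out of the decision procedure.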

Judging a good design

It still remains to be seen whether we can enumerate all of the objective criteria that account for our experiences with a design, and its performance in production. It also remains for us to work out ways to measure them, and weigh their relative importance over other criteria. We are still in danger of slipping into a world of subjective opinions over what is most important. We should be able to apply some rigour because we’re aiming at a stationary target. Every design is produced to fulfil a set of requirements. Provided those requirements are fulfilled we can assess the design solely in terms of the objective criteria. We can filter out all of the designs that are incapable of meeting the requirements – all the designs that are left are guaranteed to do the job, but some will be better than others. If that requires that we formally specify our designs and requirements then (for the purposes of this argument) so be it. All that matters is that we are able to guarantee that all remaining designs are fit to be used. All that distinguishes them are performance and other quality criteria that can be objectively measured.

Standard practice in software engineering is to reduce a problem to its component parts, and attempt to then compose the system from those components in a way that fulfils the requirements for the system. Clearly there are internal structures to a system, and those structures cannot necessarily be weighed in isolation. There is a context in which parts of a design make sense, and they can only be judged within that context. Often we judge our design patterns as though they were isolated systems on their own. That’s why people sometimes decide to use design patterns before they have even worked out if they are appropriate. The traditional debate is one where we judge the efficacy of a certain type of data access approach in isolation from the system it’s to be used in. I’ve seen salesmen for major software companies do the same thing – their marks have decided they are going to use the product before they’ve worked out why they will do so. I wonder whether the components of our architectural decision procedure can be composed in the same way that our components are.

In the context that they’re to be used, will all sub-designs have a monotonic effect on the quality of the system? Could we represent the quality of our system as a sum of scores of the various sub-designs in the system, like this: (Q1 + Q2 + … + Qn)? That would assume that the quality of the system is the sum of the quality of its parts, which seems a bit naive to me – some sub-designs will work well in combination, others will limit the damage of their neighbours and some will exacerbate problems that would have lain dormant in their absence. How are we to represent the calculus of software quality? Perhaps the answer lies in the architecture itself? If you were to take each unique path through the system, then you could measure the quality of that path as though it was a sequence of operations with no choices or loops involved. You could then sum the quality of each of these paths, weighted by frequency of usage. That would eliminate all subjective bias, and the impact of each sub-design would be proportional to the centrality of its role within the system as a whole. In most systems data access plays a part in pretty much all paths through a system, hence the disproportionate emphasis we place on it in the traditional debates.
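Here is a minimal Python sketch of that path-weighted calculation. The component names, scores and usage frequencies are made up for illustration, and scoring a path as the mean of its components' scores is an assumption of mine: a real calculus would have to capture how components interact along the path.

```python
from statistics import mean


def path_quality(path, component_scores):
    """Assumed rule: a path is as good as the average of its components."""
    return mean(component_scores[name] for name in path)


def system_quality(paths, component_scores):
    """Sum of path qualities, weighted by how often each path is used."""
    total = sum(freq for _, freq in paths)
    weighted = sum(freq * path_quality(path, component_scores)
                   for path, freq in paths)
    return weighted / total


def centrality(paths):
    """Share of all usage that passes through each component."""
    total = sum(freq for _, freq in paths)
    shares = {}
    for path, freq in paths:
        for name in set(path):
            shares[name] = shares.get(name, 0) + freq / total
    return shares


# Hypothetical component scores out of 10.
component_scores = {"ui": 8, "validation": 6, "data_access": 4, "reporting": 7}

# Hypothetical paths through the system with their relative usage counts.
paths = [
    (["ui", "validation", "data_access"], 70),
    (["ui", "data_access"], 20),
    (["reporting", "data_access"], 10),
]

print(system_quality(paths, component_scores))
print(centrality(paths))
```

In this toy data every path touches data_access, so its centrality is the highest of any component and its score drags on the quality of every path.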

Scientific Software Design?

Can we work out what these criteria are? If we could measure every aspect of the system (data that gets created, stored, communicated, the complexity of that data etc) then we have the physical side of the picture – what we still lack is all of those thorny subjective measures that matter. Remember though that these are the subjective measures that can be converted into objective measures. Each of those measures can thus be added to the mix. What’s left? All of the criteria that we don’t know to ask about, and all of the physical measurements that we don’t know how to make, or don’t even know we should make. That’s the elephant in the room. You don’t know what you don’t know. And if you did, then it would pretty immediately fall to some kind of scientific enquiry. But will we be in the same situation as science and mathematics were at the dawn of the 20th Century? Will we, like Lord Kelvin, declare that everything of substance about software architecture is known and all the future holds for us is the task of filling in the gaps?

Are these unknown criteria like the unknown paths through a mathematical derivation? Are they the fatal flaw that unhinges any attempt to assess the quality of a design, or are they the features that turn software engineering into a weird amalgam of mathematics, physics and psychology? There will never be any way for us to unequivocally say that we have found all of the criteria that truly determine the quality of a design. Any criteria that we can think of we can objectify – but it’s the ones we can’t or don’t think of that will undermine our confidence in a design and doom us to traditional debates. Here’s a new way to state Hilbert’s decision problem:

Is there a way to fully enumerate all of the criteria that determine the quality of a software design?

Or to put it another way

Will we know when we know enough to distinguish good designs from bad?

The spirit of the enlightenment is fading. That much is certain. The resurgence of religiosity in all parts of the world is a backward step. It pulls us away from that pioneering spirit that Kant called a maturing of the human spirit. Maturity is no longer needing authority figures to tell us what to think. He was talking about the grand program to roll back the stifling power of the church. In software design we still cling to the idea that there are authority figures that are infallible. When they proclaim a design as sound, then we use it without further analysis. Design patterns are our scriptures, and traditional wisdom the ultimate authority by which we judge our designs. I want to see the day when scientific method is routinely brought to bear on software designs. Only then will we have reached the state of maturity where we can judge each design on its objective merits. I wonder what the Readify Tech List will be like then?

13 comments

  1. >> Is a resume an object?

    LOL

    I’d say that a resume is an object graph, but I suppose there will be a Resume class in there somewhere. Why do you ask? Given up on scanning Word files after all?

  2. Mitch,

    Not so sure, myself. I guess we would just move on to new disputes about things that still haven’t been resolved yet. It just pisses me off that I can’t provide Proof that one design is better than another. Sure I can provide a convincing argument in favour of my view, but then the database zealots can do the same for their view. Meanwhile software engineering isn’t able to move beyond these disputes to more interesting/complex problems…

  3. I ask this question because i find that it does a good job of clarifying what an object is (and isn’t). A resume has no state in a typical website application. It’s a document, plain and simple, and should be treated as such (e.g. entity, dataset, etc … but to call it an object i find the source of much misunderstanding). Give up on parsing word docs, only in my dreams ;)

  4. And what is a ‘document’ but a form of congealed state? Granted the state may not change often – it will change as frequently as a person’s circumstances change. Besides – if it’s not code then it must be ‘state’. The fact that it lives in some weird-ass database (word doc) doesn’t mean it isn’t state. If that’s the case, then sooner or later you’re gonna want to deal with it as an object. If that’s not the case, then there is an “opportunity” for you to add to the capabilities of your system…

  5. Here’s my view of it…

    Objects are information + behavior

    Note: + here implies, ‘inextricably bound to’

    I see resume data as being processed by a job seeker, a recruiter, a database, a parsing engine. The only real behavior or action that is inherently bound to the resume data is CRUD (and even that can be abstracted into a resume editor).

    The fact that information is mutable doesn’t make it an object (Re: our ‘stateless’ digression).

    Where a resume might become an object is if you’re coding up a resume object for a 3D game. Here the resume is something you can pick up and drop off a building. It then ‘floats’ to the street, it also makes a sound when you ‘tear’ it. That’s behavior coupled with the information (weight of the paper, resume content, color, font, page count, tensile strength of paper etc).

    In messaging systems I find the nomenclature much more informative. Here we talk about processes and messages. That’s analogous to objects and data, only here, I find there’s much less confusion about processes and messages since they match reality more naturally. E.g. Resume is sent to a recruiter, who processes the information and then sends an interview invitation to the candidate. The resume itself doesn’t do anything here, it just gets manipulated by the environment. It’s just information.

    My belief is that if we treat everything as objects, we are really missing an important distinction.

  6. I should also clarify that it isn’t really a question of whether or not the behavior should exist. It’s about where, and in what semantic form, it belongs.

    Personally, I prefer business services that act on (and respond with) information, we can call these things; processes, functions, behaviors, logic. Only when i’m forced to (usually by dynamic behavior) does a data entity become a fully fledged object (e.g. DB connection manager, socket manager, event sink etc).

    Whole programming environments are constructed on this ‘functional’ paradigm and some would argue that objects are avoidable altogether! Yaiks, i could get lynched for this kind of talk these days. As if i haven’t stated the point enough, I don’t subscribe to the idea that organizing behaviors in a service layer is somehow more risky or sub-optimal than organizing them into objects. My hunch is that the opposite may be true.

  7. Dude,

    That’s a very interesting point. I am coming at this from the perspective of ontology design (and in that I classify object oriented design as a kind of lightweight version of that more often than not). As such I am primarily interested in modelling what IS. A secondary task to that is the business of working out how to manipulate that thing for the application at hand. Lastly I worry about how I need to modify that model so that it is convenient to manipulate.

    I think that going about the object modelling task in any other order is optimising in advance of getting the system working. The Resume’s resumeness is inherent in the thing, and what you use it for (i.e. dropping off buildings, generating PDFs etc etc) is an external behaviour that can be attached to the object. This is another reason why I prefer the anemic domain model over richer models – a richer model ties the ontology/domain model to the application unnecessarily.

    As for how this relates to this blog post I don’t know ;^} other than that I can’t prove any of this, which is a shame. Until we have a calculus of designs, we will never be able to put such differences of approach to the test.

  8. indeed, it seems i’ve stepped into solution mode on more than one level here ;) but also, i think i’m just highlighting my preference for solutions to specific problems. i’ve never been that good at that pure math kind of thinking of meta-solutions to meta-problems. playing a LOT of chess has taught me that abstract philosophy (general principles), often ends up yielding to concrete solutions to specific problems. or, maybe i’m just trying to avoid the fact that my head doesn’t easily digest your ideas ;)

  9. Perhaps it’s a matter of temperament. I think that an ounce of general is worth a pound of specific. I also think that there is nothing more complicated about general ideas over specific ones. But – they seldom have words of their own to describe them. Nouns are bent out of shape to describe abstractions, and it’s that terminological barrier that often creates confusion. How could an abstraction be more complicated than the specificities that it describes? It could only ever be simpler.

    I suppose the specifics are easier to visualise. But like I said, it’s good if you can find a generality – it saves you from having to deal with specifics. That’s what you do every time you create a class or a design pattern or framework. I suspect that unless you have luckily managed to retain some childlike zen purity of thought, you deal with stereotypes as well. ;-)

  10. It’s an interesting question.

    Sure, i understand that general principles are useful and act as tools that propel us to greater things. I have a running discussion with my brother on this topic of what a good framework should do. In that instance we are questioning whether or not they should challenge a developer (in order to expand their thinking) or to make it easier (and risk encouraging them to stop thinking and go with the status quo). But I digress.

    All i know is this. In chess if you follow classical theory and the general principles that fill 95% of the literature, you can not progress beyond club level (period). In share investing you can not beat the market consistently by following general principles (although some notable investors say they do, in reality they are sophisticated). In software development you don’t build spectacularly efficient search engines using java and hibernate and GOF patterns alone (if at all).

    In chess you must almost completely abandon general principles for the concrete the higher up you go. In investing you must ruthlessly focus on the specific investment at hand and even subtle nuances in its business processes can make or break it. In software, one has to be open to seeing some things as objects, and some things as resources, and some things as functors and some things as whatever-they-are (without getting distracted by generality)

    I’m just putting forward that i have this deep hunch that the secrets to spectacular success in many areas are hidden in very specific and concrete knowledge. But I’m not saying that general principles aren’t useful in these situations or even that they aren’t the actual success factor, they may be in some cases.

    Put simply, my view is that general principles are part of the journey, but are not enough to get us to spectacularly good results, and that ironically when we have used them to get most of the way there… we often have to disregard them.

  11. True. Can’t disagree with that. But… (I will anyway, for the sake of prolonging an enjoyable discussion :)
    Well – the devil is always in the details. But I guess any good abstractions you can find will potentially make your life easier. That said, there are others to worry about – I found this very interesting article from one of the API usability experts at MS: http://brad_abrams.members.winisp.net/Projects/APIDesignPapers/MeasuringAPIUsability.pdf

    I guess I would have to keep my rampant abstractionism in check just as much as you would have to restrain your special-case-ism. :-)

    I have written elsewhere about the misery induced by polymorphism as well – so I am not fully sold on the techniques available in OO languages for doing abstraction. I just hate any kind of technological (or design) lock-in. I want the option to unmake bad decisions at a later date without having to pay a huge price – and I can’t see an easier way to achieve that than some kind of abstraction layer. Just because you are using encapsulation, of course, doesn’t mean that you aren’t exploiting your hidden and very specific knowledge to the full, you’re just insulating unrelated systems from any assumptions you happen to make in doing so. So if I do my job well, I should make it possible for you to create a great result, and for others to come along later and make sense of our results without having to ponder for weeks on end. Not an easy task I spose, but I think a good generalisation ought to be just as easy to understand, if not more so, than a custom solution.
