Work

Easily Bored?

Darren Neimke posted some interesting thoughts today about the way developers lose their drive on a project, and how it’s reflected in SCRUM meetings. He thought that it might be due to the SCRUM meetings themselves. Daniel Crowley-Wilson has another idea – the developers are just bored.

Developers relish challenges and opportunities to do new things and solve novel problems. As Daniel says, about midway through a project, there is little novelty in the problems left to be solved. At the end there are just the soul-destroying finishing touches, which we all know have to be done, but which we hate doing.

I think that ‘good’ developers (Daniel’s phrase) are a particular breed. They are stimulus hungry people. They tend to quickly become immune to the initial piquancy of stimuli, entertainments or whatever interests them this week. They are not likely to remain interested in a domain or technology for long.

Evidence of this trait can be seen in the amount of staff turnover that most software vendors suffer, or the amount of technological churn that developers tend to create. Those are the negatives – there are also positives, like the rampant pace of forward progress. Developers can (given practice & solitude) sustain a high level of attention on a topic for long periods of time. I think this just exacerbates the problem of their easy boredom in the long run.

Because of the two characteristics of easy boredom and manic single-mindedness, Darren’s solution will probably not solve the problem either – the problem is that they require fresh inputs, both in the job and in the SCRUM meetings. Perhaps your best bet, Darren, is to periodically change the format of the SCRUM meetings, and mix up the teams if you can.

Here’s a simple test to see whether a team member falls into this category – ask them some of the following:

  • how many hobbies have they had
  • how often do they change their desktop backgrounds
  • how frequently do they change jobs
  • how many projects do they have on the go, or up their sleeves
  • how many ideas for killer apps have they had (and not followed up)

Is it really impossible to choose between LINQ and Stored Procedures?

For the mathematician there is no Ignorabimus, and, in my opinion, not at all for natural science either. … The true reason why [no one] has succeeded in finding an unsolvable problem is, in my opinion, that there is no unsolvable problem. In contrast to the foolish Ignorabimus, our credo avers:
We must know,
We shall know.

It’s that time of the month again – when all of the evangelically inclined mavens of Readify gather round to have the traditional debate. Despite the fact that they’ve had similar debates for years, they tend to tackle the arguments with gusto, trying to find a new angle of attack from which to sally forth in defence of their staunchly held position. You may (assuming you never read the title of the post :) be wondering what it is that could inspire such fanatical and unswerving devotion. What is it that could polarise an otherwise completely rational group of individuals into opposing camps that each consider the other completely mad?

What is this Lilliputian debate? I’m sure you didn’t need to ask, considering it is symptomatic of the gaping wound in the side of modern software engineering. This flaw in software engineering is the elephant in the room that nobody talks about (although they talk an awful lot about the lack of space).

The traditional debate is, of course:

What’s the point of a database?

And I’m sure that there’s a lot I could say on the topic (there sure was yesterday) but the debate put me in a thoughtful mood. The elephant in the room, the gaping wound in the side of software engineering is just as simply stated:

How do we prove that a design is optimal?

That is the real reason we spend so much of our time rehearsing these architectural arguments, trying to win over the other side. Nobody gets evangelical about something they just know – they only evangelise about things they are not sure about. Most people don’t proclaim to the world that the sun will rise tomorrow. But like me, you may well devote a lot of bandwidth to the idea that the object domain is paramount, not the relational. As an object oriented specialist that is my central creed and highest article of faith. The traditional debate goes on because we just don’t have proof on either side. Both sides have thoroughly convincing arguments, and there is no decision procedure to choose between them.

So why don’t we just resolve it once and for all? The computer science and software engineering fraternity is probably the single largest focussed accumulation of IQ points gathered in the history of mankind. They all focus intensively on issues just like this. Surely it is not beyond them to answer the simple question of whether we should put our business logic into stored procedures or use an ORM product to dynamically generate SQL statements. My initial thought was “We Must Know, We Will Know” or words to that effect. There is nothing that can’t be solved given enough resolve and intelligence. If we have a will to do so, we could probably settle on a definitive way to describe an architecture so that we can decide what is best for a given problem.

Those of you who followed the link at the top of the post will have found references there to David Hilbert, and that should have given you enough of a clue to know that my initial sentiment was probably a pipe dream. If you are still in the dark, I’m referring to Hilbert’s Entscheidungsproblem (or the Decision Problem in English) and I beg you to read Douglas Hofstadter’s magnificent Gödel, Escher, Bach: An Eternal Golden Braid. This book is at the top of my all-time favourites list, and among the million interesting topics it covers, the decision problem is central.

The Decision Problem – a quick detour

One thing you’ll notice about the Entscheidungsproblem and Turing’s Halting Problem is that they are equivalent. They seem to be asking about different things, but at a deeper level the problems are the same. The decision problem asks whether there is a mechanical procedure to determine the truth of any mathematical statement. At the turn of the twentieth century, mathematicians might have imagined some procedure that cranked through every derivation from the axioms of mathematical logic until it found a proof of the statement, and then returned true. The problem with that brute-force approach is that mathematics allows a continual complexification and simplification of statements – it is non-monotonic. The implication is that even after you have applied every combination of the construction rules to all of the axioms up to a given length, you can’t know whether there are new statements of the same length, not already in your list, that could be found by the repeated application of growth and shrinkage rules. That means that even though you may think you have a definitive list of all the true statements of a given length, you may be wrong, so you can never answer false; you can only continue until you find a concrete proof or disproof.

Because of these non-monotonic derivation rules, you will never be sure that no answer from your procedure means an answer of false. You will always have to wait and see. This is the equivalence between the Entscheidungsproblem and Alan Turing’s Halting Problem. If you knew your procedure would not halt, then you would just short-circuit the decision process and immediately answer false. If you knew that the procedure would halt, then you would just let it run and produce whatever true/false answer it came up with – either way, you would have a decision procedure. Unfortunately it’s not that easy, because the halting decision procedure has no overview of the whole of mathematics either, and can’t give an answer to the halting question. Ergo there is no decision procedure either. Besides, Kurt Gödel proved that there were undecidable problems, and the quest for a decision procedure was doomed to fail. He showed that even if you came up with a more sophisticated procedure than the brute force attack, you would still never get a decision procedure for all mathematics.
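To make the brute-force idea concrete, here is a minimal Python sketch of that kind of proof search. It does not use real mathematical logic. As a stand-in it assumes the toy MIU system from Hofstadter's book, which has two rules that lengthen strings and two that shorten them. The names `successors` and `search` and the `max_theorems` budget are illustrative assumptions, not anything from the original argument.

```python
from collections import deque


def successors(s):
    """All strings reachable from s in one step under the four MIU rules."""
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                       # rule 1: xI -> xIU (grows)
    if s.startswith("M"):
        out.add(s + s[1:])                     # rule 2: Mx -> Mxx (grows)
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])   # rule 3: III -> U (shrinks)
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])         # rule 4: UU -> nothing (shrinks)
    return out


def search(target, axiom="MI", max_theorems=20000):
    """Breadth-first proof search with a budget.

    Returns True if target is derived, or None when the budget runs out.
    It never returns False: running out of budget proves nothing.
    """
    if target == axiom:
        return True
    seen = {axiom}
    queue = deque([axiom])
    while queue and len(seen) < max_theorems:
        current = queue.popleft()
        for nxt in successors(current):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


found = search("MUI")                       # derivable: MI -> MII -> MIIII -> MUI
undecided = search("MU", max_theorems=2000) # budget exhausted, no verdict
```

Because the shrinking rules can produce short strings from long ones, listing every theorem found so far up to some length never guarantees the list is complete. That is why the only honest answers from this procedure are "yes" and "not yet".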

The Architectural Decision Problem

What has this got to do with deciding on the relative merits of two software designs? Is the issue of deciding between two designs also equivalent to the decision problem? Is it a constraint optimisation problem? You could enumerate the critical factors, assign a rank to them and then sum the scores for each design? That is exactly what I did in one of my recent posts entitled “The great Domain Model Debate – Solved!” Of course the ‘Solved!‘ part was partly tongue-in-cheek – I just provided a decision procedure for readers to distinguish between the competing designs of domain models.
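The rank-and-sum heuristic described above can be sketched in a few lines of Python. This is only an illustration: the criteria names, the weights and the scores are invented for the example (higher is better in every case), and none of the figures come from the earlier post.

```python
def weighted_score(weights, scores):
    """Sum of weight * score over every criterion named in weights."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_designs(weights, designs):
    """Design names ordered from highest to lowest weighted score."""
    return sorted(
        designs,
        key=lambda name: weighted_score(weights, designs[name]),
        reverse=True,
    )


# Hypothetical figures, chosen only to show the mechanics.
weights = {"simplicity": 3, "performance": 2, "maintainability": 4}
designs = {
    "domain_model": {"simplicity": 2, "performance": 3, "maintainability": 5},
    "stored_procedures": {"simplicity": 4, "performance": 5, "maintainability": 2},
}

first_ranking = rank_designs(weights, designs)

# Nudging a single weight is enough to reverse the verdict.
performance_heavy = dict(weights, performance=4)
second_ranking = rank_designs(performance_heavy, designs)
```

With the first set of weights the domain model scores 32 against 30. Doubling the weight on performance makes it 38 against 40, so the other design wins. The arithmetic is trivial, and all of the argument lives in the weights.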

One of the criticisms levelled at my offering for this problem was that my weights and scores were too subjective. I maintained that although my heuristic was flawed, it held the key to solving these design issues because there was the hope that there are objective measures of the importance of design criteria for each design, and it was possible to quantify the efficacy of each approach. But I’m beginning to wonder whether that’s really true. Let’s consider the domain model approach for a moment to see how we could quantify those figures.

Imagine that we could enumerate all of the criteria that pertained to the problem. Each represents an aspect of the value that architects place in a good design. In my previous post I considered such things as complexity, density of data storage, performance, maintainability etc. Obviously each of these figures varies in just how subjective it is. Complexity is a measure of how easy it is to understand. One programmer may be totally at home with a design whereas another may be confused. But there are measures of complexity that are objective that we could use. We could use that as an indicator of maintainability – the more complex a design is, the harder it would be to maintain.
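As one example of what an objective complexity measure might look like, here is a rough Python sketch that counts decision points in a piece of source code, in the spirit of McCabe's cyclomatic complexity. The choice of measure is an assumption made for illustration, since the post does not name one, and the `branch_complexity` function and its list of node types are a simplification rather than a faithful implementation of any standard tool.

```python
import ast

# Syntax nodes treated as decision points in this simplified count.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.BoolOp)


def branch_complexity(source):
    """1 plus the number of branching constructs found in the source text."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))


straight_line = "def f(x):\n    return x + 1\n"
branched = (
    "def g(x):\n"
    "    if x > 0:\n"
    "        for i in range(x):\n"
    "            pass\n"
    "    return x\n"
)

simple_score = branch_complexity(straight_line)   # no branches
branched_score = branch_complexity(branched)      # one if, one for
```

Two programmers may disagree about how confusing `g` is, but they will both get the same number from the counter. Whether that number tracks their confusion is exactly the open question.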

This complexity measure would be more fundamental than any mere subjective measure, and would be tightly correlated with the subjective measure. Algorithmic complexity would be directly related to the degree of confusion a given developer would experience when first exposed to the design. Complexity affects our ability to remember the details of the design (as it is employed in a given context) and also our ability to mentally visualise the design and its uses. When we give a subjective measure of something like complexity, it may be due to the fact that we are looking at it from the wrong level. Yes, there is a subjective difference, but that is because of an objective difference that we are responding to.

It’s even possible to prove that such variables exist, so long as we are willing to agree that a subjective dislike that is completely whimsical is not worth incorporating into an assessment of a design’s worth. I’m thinking of such knee-jerk reactions as ‘we never use that design here’ or ‘I don’t want to use it because I heard it was no good’. Such opinions, whilst strongly felt, are of no value, because they don’t pertain to the design per se but rather to a free-standing psychological state in the person who has them. The design could still be optimal, but that wouldn’t stop them from having that opinion. Confusion on the other hand has its origin in some aspect of the design, and thus should be factored in.

For each subjective criterion that we currently use to judge a design, there must be a set of objective criteria that cause it. If there are none, then we can discount it – it contributes nothing to an objective decision procedure – it is just a prejudice. If there are objective criteria, then we can substitute all occurrences of the subjective criterion in the decision procedure with the set of objective criteria. If we continue this process, we will eventually be left with nothing but objective criteria. At that point are we in a position to choose between two designs?

Judging a good design

It still remains to be seen whether we can enumerate all of the objective criteria that account for our experiences with a design, and its performance in production. It also remains for us to work out ways to measure them, and weigh their relative importance over other criteria. We are still in danger of slipping into a world of subjective opinions over what is most important. We should be able to apply some rigour because we’re aiming at a stationary target. Every design is produced to fulfil a set of requirements. Provided those requirements are fulfilled we can assess the design solely in terms of the objective criteria. We can filter out all of the designs that are incapable of meeting the requirements – all the designs that are left are guaranteed to do the job, but some will be better than others. If that requires that we formally specify our designs and requirements then (for the purposes of this argument) so be it. All that matters is that we are able to guarantee that all remaining designs are fit to be used. All that distinguishes them are performance and other quality criteria that can be objectively measured.

Standard practice in software engineering is to reduce a problem to its component parts, and then attempt to compose the system from those components in a way that fulfils the requirements for the system. Clearly there are internal structures to a system, and those structures cannot necessarily be weighed in isolation. There is a context in which parts of a design make sense, and they can only be judged within that context. Often we judge our design patterns as though they were isolated systems on their own. That’s why people sometimes decide to use design patterns before they have even worked out if they are appropriate. The traditional debate is one where we judge the efficacy of a certain type of data access approach in isolation from the system it’s to be used in. I’ve seen salesmen for major software companies do the same thing – their marks have decided they are going to use the product before they’ve worked out why they will do so. I wonder whether the components of our architectural decision procedure can be composed in the same way that our components are.

In the context that they’re to be used, will all systems have a monotonic effect on the quality of a system? Could we represent the quality of our system as a sum of scores of the various sub-designs in the system like this? (Q1 + Q2 + … + Qn) That would assume that the quality of the system is the sum of the quality of its parts which seems a bit naive to me – some systems will work well in combination, others will limit the damage of their neighbours and some will exacerbate problems that would have lain dormant in their absence. How are we to represent the calculus of software quality? Perhaps the answer lies in the architecture itself? If you were to measure the quality of each unique path through the system, then you would have a way to measure the quality of that path as though it was a sequence of operations with no choices or loops involved. You could then sum the quality of each of these paths weighted in favour of frequency of usage. That would eliminate all subjective bias and the impact of all sub designs would be proportional to the centrality of its role within the system as a whole. In most systems data access plays a part in pretty much all paths through a system, hence the disproportionate emphasis we place on it in the traditional debates.
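The path-weighted idea can be sketched as follows. Two things here are assumptions made for the sake of a runnable example: the per-component quality figures are invented, and the quality of a path is taken to be that of its weakest component, which is only one of many possible ways to combine them.

```python
def path_quality(path, component_quality):
    """Quality of one path, taken here as its weakest component (an assumption)."""
    return min(component_quality[component] for component in path)


def system_quality(paths, component_quality):
    """Usage-weighted average of path qualities.

    paths is a list of (components_on_path, usage_frequency) pairs.
    """
    total = sum(frequency for _, frequency in paths)
    weighted = sum(
        frequency * path_quality(path, component_quality)
        for path, frequency in paths
    )
    return weighted / total


# Hypothetical figures on a 0..1 scale.
quality = {"ui": 0.9, "rules": 0.8, "data_access": 0.5, "report": 0.7}
paths = [
    (["ui", "rules", "data_access"], 70),
    (["ui", "report", "data_access"], 25),
    (["ui", "rules"], 5),
]

baseline = system_quality(paths, quality)
better_data_access = system_quality(paths, dict(quality, data_access=0.8))
better_reporting = system_quality(paths, dict(quality, report=0.9))
```

In this toy system, improving data access lifts the overall figure from about 0.52 to about 0.78, while improving reporting changes nothing, because data access sits on 95% of the traffic. That is the disproportionate emphasis showing up in the numbers.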

Scientific Software Design?

Can we work out what these criteria are? If we could measure every aspect of the system (data that gets created, stored, communicated, the complexity of that data etc) then we have the physical side of the picture – what we still lack is all of those thorny subjective measures that matter. Remember though that these are the subjective measures that can be converted into objective measures. Each of those measures can thus be added to the mix. What’s left? All of the criteria that we don’t know to ask about, and all of the physical measurements that we don’t know how to make, or don’t even know we should make. That’s the elephant in the room. You don’t know what you don’t know. And if you did, then it would pretty immediately fall to some kind of scientific enquiry. But will we be in the same situation as science and mathematics were at the dawn of the 20th Century? Will we, like Lord Kelvin, declare that everything of substance about software architecture is known and all the future holds for us is the task of filling in the gaps?

Are these unknown criteria like the unknown paths through a mathematical derivation? Are they the fatal flaw that unhinges any attempt to assess the quality of a design, or are they the features that turn software engineering into a weird amalgam of mathematics, physics and psychology? There will never be any way for us to unequivocally say that we have found all of the criteria that truly determine the quality of a design. Any criteria that we can think of we can objectify – but it’s the ones we can’t or don’t think of that will undermine our confidence in a design and doom us to traditional debates. Here’s a new way to state Hilbert’s decision problem:

Is there a way to fully enumerate all of the criteria that determine the quality of a software design?

Or to put it another way

Will we know when we know enough to distinguish good designs from bad?

The spirit of the enlightenment is fading. That much is certain. The resurgence of religiosity in all parts of the world is a backward step. It pulls us away from that pioneering spirit that Kant called a maturing of the human spirit. Maturity is no longer needing authority figures to tell us what to think. He was talking about the grand program to roll back the stifling power of the church. In software design we still cling to the idea that there are authority figures that are infallible. When they proclaim a design as sound, then we use it without further analysis. Design patterns are our scriptures, and traditional wisdom the ultimate authority by which we judge our designs. I want to see the day when scientific method is routinely brought to bear on software designs. Only then will we have reached the state of maturity where we can judge each design on its objective merits. I wonder what the Readify Tech List will be like then?

Agile Programming Gripe

Alfred Thompson recently blogged about an interesting interview given by Bjarne Stroustrup, the inventor of C++, now professor of computer science at Texas A&M University. As ever the interviewers were after his prescription for the Silver Bullet that will save us all from ourselves. Alfred has already boiled down the interview to a couple of interesting sound bites, and I aim to do the same to his post. The quote that particularly caught my attention was Stroustrup’s comment on the ‘pragmatic’ way in which modern software gets developed. When asked whether the solution might be to educate developers more fully, to reward quality, and to criticize sloppiness, Stroustrup points out that it will never work, because:

People reward developers who deliver software that is cheap, buggy, and first.

Let the Flame Wars Begin…

Alec has provided a few of the reasons he hates Microsoft. They all seem like sentimental attachment to other people’s litigation defeats… In honour of that, I have changed the colour of my blog for a day or two…

I too went through an undergraduate phase of anti-MS zealotry, and adopted obscure platforms as a kind of protest (e.g. Linux, Java, etc). Then I went out to work, and realized that as far as the relevance of my skills is concerned, it didn’t matter whether the standards I knew were de jure or de facto, just so long as they were standard. Microsoft made it easier for me to pay the rent, so I am grateful that they were there doing what they do best.

Besides, I would have behaved in exactly the same way if I were in his position, so I can’t pretend to sit on the moral high-ground casting judgments at successful strategies designed to increase shareholder value. That was his job – he did it well.

Apart from that, it was just a little image that popped into my head during a discussion here at Readify – I did qualify my analysis as being a little bit dodgy so – caveat emptor!

Keeping a Developer’s Lab Book

Despite having used a laptop at work for years, I’ve always kept a paper notebook by my side. I depend on these notebooks. I even purchased the whole UK supply of Paperchase’s 500 page, leatherette, squared, rounded notebooks. These beauties are un-dog-ear-able, perfect for UML diagrams, lightweight and relatively inexpensive. The supply in the UK was dwindling when I discovered them in Cambridge. Apparently the Japanese manufactured them, and the Americans were buying whatever stocks were left. I acted fast and bought the outstanding stock from all the major branches in the UK. It only filled a small box, but ought to keep me supplied with notebooks for about the next 15 years. I doubt that with the current screen resolution, tablet PCs are going to topple them in my affections any time soon.

Anyway, I soon found that it’s all very well to have a notebook, but to make good use of it you should treat it as a kind of lab book. When I am stumped by a problem that I have been trying to tackle, I often find that deadline-induced panic can lead me to blindly try every possible solution, one after the other, churning the code up and getting me nowhere. I find that at times like those I can enforce a bit of discipline on myself by using a lab book methodology to state the problem and work through to the solution. Obviously, the kind of information that a developer and a scientist at the bench need to maintain is very different, but the nature of what they do is similar. They march out into the unknown, armed with nothing to defend them but what they already know and some discipline. The timescales that developers are expected to deliver results in are much shorter, which is why they tend to panic and cut corners more often.

I start out by stating the problem. The key thing here is not to state what you think is causing the problem. Don’t say “the XML file is not well formed“, put “I can’t load the config file” or better still “the program won’t start“. A lot of the advice for keeping a lab book stems from laboratory work, where to make an experiment repeatable you must keep notes on what you are going to do and what the outcome was. Similarly, in a programmer’s notebook you need to keep track of what you did and the outcome in order to be able to rule out paths of enquiry. I generally tend to use the following headings:

  1. PROBLEM
  2. KNOWN
  3. IDEAS
  4. TESTS
  5. QUESTIONS

Each of these helps you to keep track of what you know about the problem, what ideas you had, and how they panned out. I’ve used this successfully on both bug solving and design issues.

In the PROBLEM area I make a statement of what the problem is, without making any assumptions about the cause of the problem. That can often lead me to make pretty dumb statements in this section initially. Later on, when you know more you can extend this or revise it with a more accurate statement of the problem. The key thing is not to prejudice the whole problem solving process by ruling out whole lines of enquiry prematurely.

After stating the problem, I make as many entries in the KNOWN section as I can. These will be bare statements of what I can be absolutely sure is true. Generally in the course of diagnosing a bug, I will already have tried a few things, before I resort to the lab book. I take note of these, plus anything else I know, such as requirements or constraints in the case of designs. Looking at these will eventually force you to produce a few ideas. These go in the IDEAS section.

Eventually, whether bug-squashing or designing, you will be faced with a problem and you’ll be at a loss. You need to get your head around the problem. That’s what the KNOWN section helps you to do. Ultimately, and inevitably, you will have an idea (probably lots of them). You write them down in the IDEAS section. This can be any kind of prejudiced statement you like. It is the correct place for statements like “The XML file is not well formed”. It’s a hypothesis that you will need to check out. In the case that ideas come like buses, you might want to note them all at once. Then check each one at a time, or pursue a promising line of enquiry first, then go back to the ideas backlog if you don’t get anywhere. You might also find that while your idea is easy to formulate, it can be hard to test. That’s where the TESTS section comes in.

In TESTS you describe ways to validate the hypothesis you made in the ideas section. You use these to make additions to the KNOWN section. Every time you perform one of the tests you should be able to add something to the KNOWN section. If you weren’t able to add anything to KNOWN, then your test was wasted. Quite often one of these tests will yield the solution you are after. If you are designing, these tests might be in the form of a proof of concept for a design or idiom.

If all else fails you need to start thinking about what you don’t know. The QUESTIONS section allows you to make use of questions as a way to prompt you to add new things to the KNOWN section or to devise new tests to get something to put in the KNOWN section. It is a kick-start for your imagination – I always find that if I get to this stage, my questions never struggle to come out and before long they start turning into ideas.

You can think of it as an algorithm for generating knowledge about a domain. It’s very simple, and not exactly the heavyweight scientific method, but I know that when you get this method out, there isn’t a problem you can’t solve.
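For anyone who would rather keep the lab book in code than on squared paper, the five headings map onto a small data structure. This is a minimal sketch under my own assumptions: the `LabBook` class, its `record_test` method and the sample entries are all hypothetical, and the only rule it enforces is the one stated above, that a test which adds nothing to KNOWN was wasted.

```python
from dataclasses import dataclass, field


@dataclass
class LabBook:
    problem: str                                     # symptom only, no guessed cause
    known: list = field(default_factory=list)        # bare facts you are sure of
    ideas: list = field(default_factory=list)        # hypotheses, however prejudiced
    tests: list = field(default_factory=list)        # (idea, how it was checked)
    questions: list = field(default_factory=list)    # prompts for when you are stuck

    def record_test(self, idea, description, finding):
        """Log a test of an idea and feed its finding into KNOWN."""
        if not finding:
            raise ValueError("a test that adds nothing to KNOWN was wasted")
        self.tests.append((idea, description))
        self.known.append(finding)


book = LabBook(problem="the program won't start")
book.known.append("it started correctly before the last config change")
book.ideas.append("the XML file is not well formed")
book.record_test(
    idea="the XML file is not well formed",
    description="load the config file in an XML parser on its own",
    finding="the config file parses without errors",
)
```

Note that the finding goes into KNOWN even though it rules the idea out. A disproved hypothesis is still knowledge, and it stops you from trying the same thing again at 2am.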

labbook1.PNG
Figure 1. A little flowchart showing how you feed information into the known section.


The Indians are busy

I did a quick search on the new Google trends service, for the word 'algorithm'.

Six of the top 10 cities were in India.

Perhaps I should move to India. They get to use algorithms there!

Either way, it seems a fair indicator that Western developers are writing less low-level code than Indian developers. Either that, or Indian programmers need to refer to the web for help more often. Which do you think it is?

When a brill-o-pad goes bad…

…sometimes you get out your brill-o-pad with the best of intentions and it all goes bad. Nitrogen has had something installed on it that disagrees with it and performance is getting progressively worse. There are a few suspects: SQL Server 2005 CTP, VS.NET 2005 beta 2, and a limitless amount of other crud that oughtn't to be on there. I think that whatever it was, it infected N via the settings migration wizard that XP uses to port My Docs etc between systems. This performance degradation is inherited from its previous setup.

I think this is a form of karmic punishment, for no sooner was Nitrogen reincarnated than it was beset by the ills and sins of its previous life. If only there were a way to enter the enlightened state of having a hardware platform that natively runs an emulation layer that can be saved, rolled back and otherwise fiddled with. In fact if that were the case you could have an easy way to start running VMs on third party machines (you could access them via VNC for instance). That way you could lease a clean installation with some additional storage space where you could put your data. The OS and apps could be reinstantiated for you each time you run the computer, and then connected to the data disk that stores what you were doing last time.

I think that if the price (and performance) was right I would consider running such a virtualised machine. Especially if I could alternate between different OS's depending on my requirements.

Question: what technique could OS manufacturers use to store settings modifications? If you had a transactional file system that allowed you to roll back, you could wipe out changes if they proved to be negative. But what if you only discovered the problem after performing lots of beneficial changes? I wonder how difficult it would be to have an OS that set baselines on the file system so that you could identify a specific set of changes and excise them without removing subsequent changes as well. How much overhead would that require? I suspect it would be enormous.
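As a thought experiment, here is a small Python sketch of the baseline idea applied to a flat dictionary of settings rather than a real file system. Everything in it is hypothetical: the `SettingsStore` class, the changeset labels and the setting names are invented, and it takes the cautious route of refusing to excise a changeset when a later one touched the same keys.

```python
_MISSING = object()  # marks "this key did not exist before the change"


class SettingsStore:
    def __init__(self):
        self.values = {}
        self.journal = []  # list of (label, {key: value_before_change})

    def apply(self, label, changes):
        """Apply a labelled changeset, recording what each key held before."""
        before = {key: self.values.get(key, _MISSING) for key in changes}
        self.values.update(changes)
        self.journal.append((label, before))

    def excise(self, label):
        """Undo one changeset but keep later ones, provided they touch other keys."""
        index = next(i for i, (name, _) in enumerate(self.journal) if name == label)
        _, before = self.journal[index]
        later_keys = set()
        for _, later_before in self.journal[index + 1:]:
            later_keys.update(later_before)
        conflicts = later_keys & set(before)
        if conflicts:
            raise ValueError(f"later changes also touch: {sorted(conflicts)}")
        for key, old_value in before.items():
            if old_value is _MISSING:
                del self.values[key]
            else:
                self.values[key] = old_value
        del self.journal[index]


store = SettingsStore()
store.apply("install-beta", {"service.autostart": True, "cache.mb": 64})
store.apply("tune-editor", {"editor.font": "Consolas"})
store.excise("install-beta")  # the later, beneficial change survives
```

The overhead in this toy is one saved value per changed key. On a real file system the journal would have to hold previous file contents and untangle changes that overlap, which is where the cost would balloon.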

Programming Gem of the day

I think this one requires no disparaging commentary. Its funereal dissatisfactoryness stands as a balefire to all of you who want to write code that is both tenebrous and brittle.

adCol.Add(new
  Advertisement(dr.GetString(0),
    dr.GetDateTime(1),dr.GetString(2),
    dr.GetString(3), dr.GetInt32(4),
    dr.GetBoolean(5), dr.GetFloat(6),
    dr.GetString(7), dr.GetString(8),
    dr.GetString(9),
    dr.GetDateTime(10),
    (int)dr.GetInt16(11),
    (int)dr.GetInt16(12),
    (int)dr.GetInt16(13),
    dr.GetString(14),
    dr.GetString(15),
    dr.GetString(16),
    dr.GetString(17), dr.GetInt32(18),
    dr.GetString(19),
    dr.GetString(20),
    dr.GetString(21).ToCharArray()[0],
    dr.GetString(22).ToCharArray()[0],
    dr.GetString(23)));

This example has been formatted for extra readability – it was on a single line.
I would also like to point out that I had ABSOLUTELY NOTHING TO DO WITH THE PRODUCTION OF THIS PIECE OF CODE!
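For contrast, here is a hedged sketch of the same kind of row-to-object mapping done by column name and keyword argument, so that reordering the query does not silently shuffle the fields. It is written in Python with the standard `sqlite3` module rather than in the C# of the gem, and the table, its three columns and the cut-down `Advertisement` class are invented for the example. The real class clearly has two dozen fields whose meanings I can't guess from ordinals.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Advertisement:
    # Hypothetical fields; the real class is not shown in the post.
    title: str
    price: float
    active: bool


def load_advertisements(conn):
    """Read rows by column name and build objects with keyword arguments."""
    conn.row_factory = sqlite3.Row  # lets each row be indexed by column name
    rows = conn.execute("SELECT title, price, active FROM advertisement")
    return [
        Advertisement(
            title=row["title"],
            price=row["price"],
            active=bool(row["active"]),
        )
        for row in rows
    ]


# Tiny in-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE advertisement (title TEXT, price REAL, active INTEGER)")
conn.execute("INSERT INTO advertisement VALUES ('Bike for sale', 120.0, 1)")
ads = load_advertisements(conn)
```

It is more typing than a row of bare ordinals, but each line says what it is reading, and a renamed or missing column fails loudly instead of quietly landing in the wrong constructor slot.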

And another thing…

Dom suggested that "The distance to our goals always seems further to those realists without sufficient knowledge." My personal experience in software indicates that when you are attempting to solve a difficult problem with many unknowns, you cannot predict the degree of complexity of the problem or solution without actually solving the problem. Good estimation in a software project is an exercise in pessimism. I also find that my pessimism is never enough! At the beginning of a project I am filled with youthful exuberance and optimism that blinds me to the fact that when I am waist deep in complexity (and office politics) my exuberance isn't enough to get me through the project – at that point progress throttles back to the baseline progress afforded by grim determination. This is a very good description of the AI research community, and presumably many other areas of science as well.

A war is equivalent to the deadlines-looming stage of a project where we always seem to pull miraculous rabbits out of our arses. The plateau period of slow progress caused by disillusionment can be seen at work in the AI research community now. Anti-results such as Minsky's proof of the limitations of certain types of neural networks and the failure to make quick progress have muted the youthful exuberance of the AI community and it is now in baseline progress mode.

This probing of the unknown reminds me of Turing's Halting Problem – For certain problems where information is lacking, the only way to work out whether a program will halt is to run it. When a program over-runs you can't know whether it is about to halt or not. So there is a basic undecidability about complex software that is reflected in software projects. This is the "software crisis" we were all taught so much about in university – the reason why more than 50% of major projects go over budget, are late, or get scrapped altogether. We were taught that there is no silver bullet and that the only way to overcome the crisis was to adopt strict programming discipline and make use of automated proof systems to identify items of code where halting and correctness were not possible.
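The over-running program can be shown with a small Python sketch. The `run_with_budget` helper is hypothetical, and the Collatz rule is used only as a convenient example of a process whose running time is hard to guess from its starting value.

```python
def run_with_budget(step, state, max_steps):
    """Call step repeatedly until it returns None (halt) or the budget is spent.

    Returns ("halted", steps_used) or ("unknown", max_steps).
    """
    for used in range(max_steps):
        state = step(state)
        if state is None:
            return ("halted", used + 1)
    return ("unknown", max_steps)


def collatz_step(n):
    """One Collatz step; signals a halt once the value has reached 1."""
    if n == 1:
        return None
    return n // 2 if n % 2 == 0 else 3 * n + 1


quick = run_with_budget(collatz_step, 6, 50)      # finishes well inside the budget
overrun = run_with_budget(collatz_step, 27, 50)   # budget spent, no verdict
patient = run_with_budget(collatz_step, 27, 1000) # same start, bigger budget
```

Starting from 27 the run blows a budget of 50 steps, yet it does finish if you keep paying. From inside the over-run there is no way to tell that case apart from a run that never ends, which is the position a project manager is in when the estimate has been exceeded.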

But we live in a capitalist economy whose driving force is the market. Software engineers are required to maximize functionality and minimize costs, so formal methods and proofs are not an option. Consequently we make conservative estimates about what is possible with a given number of programmers in a fixed time. I see no difference between this working environment and that of scientists who are also driven to produce short term results for their investors with limited resources – they don't even know if there is a solution to the problem – they just know that there always has been before.

So, Dom, I think I'm suggesting that rational pessimism is justified when predicting the time taken to perform a poorly specified task of great complexity, and I can prove it!!!!!!