Month: January 2007

C# Automatic Properties: Semantic changes to help cope with poor syntax?

Steve Eichert and Bart de Smet both posted last November about a new C# feature: Automatic Properties.

The language enhancement will convert code like this in a class definition:

public string MyProperty { get; set; }


Into something like this:

private string myProperty;
public string MyProperty
{
get{return myProperty;}
set{myProperty = value;}
}

I’ve got to admit that I view this with a few misgivings. Primarily, I worry about how to interpret MyProperty in an abstract class or on a virtual property. Will an implementation be generated for an abstract class? Is the code generation skipped if the property is virtual? At first glance, it doesn’t seem to enhance the clarity of the language. Currently { get; set; } has only one meaning – no implementation provided! Now we will have to look at the context of the declaration to know whether it means that at all. What will this do for those porting C# 2 code to C# 3?
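To make the ambiguity concrete, here is a small sketch (the types are mine, purely for illustration) of the three contexts in question:

public interface IHasName
{
    // Today this can only mean "no implementation here" - it is a contract.
    string Name { get; set; }
}

public abstract class NamedThing
{
    // Likewise abstract, so again no implementation is provided.
    public abstract string Name { get; set; }
}

public class Person
{
    // Under the proposal, identical syntax in an ordinary class means
    // "please generate a hidden backing field and a trivial implementation".
    public string Name { get; set; }
}

Same tokens, three different readings, and only the surrounding declaration tells you which one you are looking at.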

Some of the commenters on Steve’s post suggested adding a new keyword to the language so that you could get code like this:

public property string MyProperty;

or

public readonly property string MyProperty;

It’s definitely less ambiguous than the syntax proposed by Anders Hejlsberg. I just wonder why it is necessary at all. What encapsulation does such a feature provide? What are you actually getting? A property provides a degree of encapsulation for a field, which is why people declare properties to wrap fields: it lets you add validation or draw the data from some other source. I have no objection to the C# team providing as much syntactic sugar as they like, but when they change the semantics of the language, they should look before they leap! The only advantage to be gained here is less typing. Hejlsberg must really hate typing! You can’t define a field in an interface, so the empty property anti-pattern is obligatory in C# 2.0. Why enshrine it in the language, though?

Are empty properties another of those half-understood injunctions that developers pay lip service to? (I lump configuration in this category.) After all, the only point of a property is to do something in it beyond turning a private variable into a public one. Why not make fields declarable in interfaces? That way, we’d only have to encapsulate them if they weren’t stored in the class, or if we wanted to do something with them before changing state. It wouldn’t change the semantics of the language either, and it would be backwards compatible.

If I were obliged to add Automatic Properties to the language, I might be tempted to use an attribute to annotate a field instead:

[Property("MyProperty")] private string myField;

Attributes are an excellent way to pass on hints to the compiler. They are more versatile than keywords, and wouldn’t require syntax changes. Keeping the language definition stable would probably please tool vendors too.
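For what it’s worth, here is a sketch of what such an attribute might look like. It is purely hypothetical (no such attribute exists); the compiler, or a pre-build code generator, would be expected to emit the wrapping property for any field that carries it:

using System;

[AttributeUsage(AttributeTargets.Field)]
public class PropertyAttribute : Attribute
{
    private readonly string name;
    private readonly bool readOnly;

    public PropertyAttribute(string name) : this(name, false) { }

    public PropertyAttribute(string name, bool readOnly)
    {
        this.name = name;         // the name of the property to generate
        this.readOnly = readOnly; // true means generate a getter only
    }

    public string Name { get { return name; } }
    public bool ReadOnly { get { return readOnly; } }
}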

Agile – what happens if you don’t think ahead?

Alec, a man of great perspicacity and wisdom, has granted us another Agile pearl of wisdom, all of which is totally correct. There’s no doubt that if you are with The Complicators, you can screw up a product no matter what method you follow.

Let’s just try a thought experiment, though. Imagine you’d been disciplined and hadn’t thought about the future at all, and you’d designed your system solely to work with RJE 2780. I’m assuming that RJE 2780 is nothing like TCP/IP, so the design you produced would dovetail neatly with RJE 2780 but not with TCP/IP. If the product was sufficiently complicated by that stage, your subsequent estimates for adding other protocols would have involved a sizeable refactoring exercise, which, judging by the attitude they took, would probably have led them to shelve the project to reduce expenses anyway. Either way you were screwed.

The middle ground that I tread these days is to acknowledge that this software (if it lives long enough) will be subject to revisions and maintenance that are often easy to predict. Rather than build all the new functionality into the product in version 1, I create extensibility points so that it won’t be expensive when the time inevitably comes. That sort of design isn’t n-th degree analysis paralysis, nor is it the head-in-the-sand avoidance of Agile; it is doing just enough design to mitigate future catastrophes. As a complicator in remission myself, I’d much prefer to do the n-th degree thing, but I resist it in the name of professionalism (and bitch privately to the wife instead).

More on the Agile Debate

Alec recently replied to one of my Agile diatribes with a very interesting set of comments. His point is that no project will succeed if it is badly run. I’d be the first to admit that in some cases, I should probably have stood my ground or found a more stable compromise. But we have to admit that many project managers are likely to be less savvy or less principled than Alec.

The reason we learn as we go along in an Agile project is that we never asked enough questions to begin with! The whole point of an up-front, intensive design phase is to bring as many of those ‘doh!’ moments ahead of the development phase as we can. Catching a design (or requirements) flaw midway through a project is definitely better than catching it after delivery, but catching it during the design process is better still. Everything changes around us because those who ought to have given more thought to the requirements have postponed thinking (indefinitely?). What can reasonably be delayed? My original gripe is that one man’s reason is another man’s madness.

I’m not suggesting that we drop Agile and go back to waterfall models (although maybe we should for big projects). All I really want is that we feel obligated to think before we act. Is that so much to ask?

The Great Domain Model Debate – Solved!

In almost every version 1.0 system I design, I end up endlessly rehearsing the pros and cons of different implementations of the domain model (or lack of one). It’s getting so tedious that I recently decided to answer the question to my own satisfaction. I produced a spreadsheet with as many design factors as I could think of, and the different models that I have worked with or considered over the years. I attached a weight to each design factor and then assigned a score to each model for each factor based on how well I thought it performed. I then summed the weighted scores to produce an overall score for each model. I was glad to see that the Anemic Domain Model won, and not surprised to see that performance, intelligibility and strong typing won out over cross-platform support and publishability.
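For the curious, the calculation itself is trivial. Here is a rough sketch of the weighted-sum idea; the factor names, weights and scores below are illustrative only, not the actual figures from my spreadsheet:

using System;
using System.Collections.Generic;

class ModelScorer
{
    static void Main()
    {
        // Weight of each design factor (illustrative values only).
        Dictionary<string, double> weights = new Dictionary<string, double>();
        weights["Performance"] = 0.9;
        weights["Strong typing"] = 0.8;
        weights["Cross-platform support"] = 0.3;

        // Each model's 0..1 score per factor (again, illustrative).
        Dictionary<string, Dictionary<string, double>> models =
            new Dictionary<string, Dictionary<string, double>>();

        Dictionary<string, double> anemic = new Dictionary<string, double>();
        anemic["Performance"] = 0.9;
        anemic["Strong typing"] = 1.0;
        anemic["Cross-platform support"] = 0.4;
        models["Anemic Domain Model"] = anemic;

        Dictionary<string, double> untyped = new Dictionary<string, double>();
        untyped["Performance"] = 0.5;
        untyped["Strong typing"] = 0.0;
        untyped["Cross-platform support"] = 0.8;
        models["Untyped associative container"] = untyped;

        // Overall score = sum over all factors of (weight x score).
        foreach (KeyValuePair<string, Dictionary<string, double>> model in models)
        {
            double total = 0.0;
            foreach (KeyValuePair<string, double> factor in weights)
                total += factor.Value * model.Value[factor.Key];
            Console.WriteLine("{0}: {1:F2}", model.Key, total);
        }
    }
}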

I included a few models that I wouldn’t even dream of using, such as typed and untyped associative containers and raw XmlDocuments, for the sake of not giving in (too much) to my own bias. As a matter of fact, typed associative containers (i.e. typed wrappers around Hashtables) scored better than plain old DataSets or raw XmlDocuments. These last two I have seen actively promoted by teams of Microsoft architects who ought to know better. Also unsurprisingly, the worst score came from the untyped associative container (i.e. a Hashtable or TreeSet or whatever). Nevertheless, this model is employed by a disproportionate number of lazy designers who can’t be bothered to design a decent object model. It is a particularly popular solution for configuration systems, and it has even been enshrined as the solution of choice in the .NET framework! I guess I should take this chance to reiterate my call for designers and developers to avoid this model at all costs – it has absolutely no redeeming features!
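To be clear about what I mean by the two variants, here is a rough illustration (the OrderData type and its properties are invented for the example). The untyped container is a bare Hashtable keyed by strings; the typed version wraps the same storage behind strongly typed properties so the compiler can catch misspelt keys and wrong casts:

using System;
using System.Collections;

// Typed associative container: still a Hashtable underneath, but the surface
// is strongly typed, so wrong keys and wrong value types will not compile.
public class OrderData
{
    private readonly Hashtable values = new Hashtable();

    public string CustomerName
    {
        get { return (string)values["CustomerName"]; }
        set { values["CustomerName"] = value; }
    }

    public decimal Total
    {
        get { return (decimal)values["Total"]; }
        set { values["Total"] = value; }
    }
}

class Demo
{
    static void Main()
    {
        // Untyped associative container: every read is a cast-and-hope.
        Hashtable order = new Hashtable();
        order["CustomerName"] = "Acme Ltd";
        order["Total"] = 125.50m;
        object total = order["Totl"]; // typo compiles fine; silently yields null at runtime

        // The typed wrapper turns the same class of mistake into a compile error.
        OrderData typed = new OrderData();
        typed.CustomerName = "Acme Ltd";
        typed.Total = 125.50m;
        Console.WriteLine("{0}: {1}", typed.CustomerName, typed.Total);
        Console.WriteLine(total == null ? "untyped lookup silently failed" : total.ToString());
    }
}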

I also included what I have called Transactional DTOs, a design that was mentioned to me as a serious proposal by Mitch Denny. I have never used this approach, so I may not have done it justice. Even so, it scored highly, coming in just behind typed data sets, mainly because I couldn’t find a way to base a system solely on them. As a result they score poorly on searchability under current and future APIs. If they were paired with an Anemic Domain Model, the system might be very elegant. I hope readers will write in to report on good examples of this design in use. How about it, Mitch? Have you written about this approach in any depth? I have never tried Naked Objects before either, and my understanding is based on articles that I read a few years ago. Things may have moved on quite a bit. Check out this article for more.

Each cell in Figure 1 represents my assessment (from 0 to 1) of the strength of the given implementation technique for each design factor. As I said before, some of these are off-the-cuff judgements, made without as much exposure as I have had to the tried-and-true idioms. They are to a certain degree subjective, but I think I’ve been fair. The weights I expect to vary from project to project, but experience tells me you should flout them at your peril! The scores in Figure 2 are the weighted scores of each idiom for each design factor, and the totals at the bottom represent the overall quality of each model.

Figure 1. Weights on each design factor, and strength of each model type for that factor.

Figure 2. Scores of each idiom, based on the sum of the weighted scores.

As you can see, the Anemic Domain Model scored best because of its performance, strong typing, good support for relationships, encapsulation, and simplicity, despite the fact that it underperforms on cross-platform support, publishability and transactions. If you have specific non-functional requirements, you might need to adjust the weights in Figure 1 to reassess the model you use. The chart doesn’t take into account other design considerations that might boost the score of certain models, such as IDE support, non-standard API support, developer familiarity and published reference applications. Nor does it try to assess the cost of changing a model on legacy code, which is always likely to be prohibitive. The fact that it’s prohibitive means that you have to get it right first time. It also seems to imply that someone at Microsoft did the same exercise and finally realized that some of their sales pitches were not made in the best interests of their clients! That would explain why they have started making the transition to ORM technologies and Language Integrated Query (LINQ). The benefits are quite clear when you tabulate them.

Lastly, I wonder whether quantitative assessments like this can be applied to other semi-religious debates in software engineering. Can you think of any? I’m not going to consider linguistic debates – I’m thinking more of design issues, such as deciding between different display/navigation patterns, or deciding whether to use stored procedures, dynamic queries, CLR stored procedures or what have you. What do you agonize over most? Perhaps there is a systematic way for you to choose in future? Why don’t you take some idiom that you have a lot of experience with and give it the same treatment? I’ll create a link from this page to yours. Perhaps if enough people produced models like this, we could create a ready-reckoner for choosing designs.


I Hope Noah Webster is Turning in his Grave

A little-known linguistic mangling:

‘Aluminium’ was not how Humphry Davy initially named the element. First he called it ‘Alumium’, then he called it ‘Aluminum’. So, is ‘Aluminium’ a case of reverse defrancophonificationism?


Defrancophonificationism is of course a francophonificated coinage itself. Rather like the Euro… ;^}

Nondeterministic Finite Automaton (NDFA) in C#

Download the source: Example 1.

Sad to say, but my holidays are over and I’m back to work. I tried pretty hard to keep my hands away from the laptop while I was off, but I got itchy fingers towards the end, so I had a stab at implementing a non-deterministic finite automaton (NDFA). I implemented it mostly as an excuse to play with the C5 collections library. As it turned out, the class was relatively easy to implement as a deterministic finite automaton (DFA), but it required a bit more finesse to extend it to the general case of the NDFA. Anyhow, I got it working OK. Here’s how you might use it:

   1:  NDFA<QState, char, string> ndfa = new NDFA<QState, char, string>();
   2:  ndfa.AllStates.AddAll(new QState[] { QState.err, QState.q0, QState.q1, QState.q2, QState.q3 });
   3:  ndfa.AcceptStates.AddAll(new QState[] { QState.q3 });
   4:  ndfa.StartState = QState.q0;
   5:  ndfa.ErrorState = QState.err;
   6:  ndfa.SetStateComparer(new QStateComparer<QState>());
   7:  ndfa.SetErrorHandler(delegate { Debug.WriteLine("Error State Entered"); });
   8:   
   9:  ndfa.TransitionTable.Add(new Rec<QState, char>(QState.q0, 'a'), QState.q1);
  10:  ndfa.TransitionTable.Add(new Rec<QState, char>(QState.q0, 'a'), QState.q2);
  11:  ndfa.TransitionTable.Add(new Rec<QState, char>(QState.q1, 'b'), QState.q3);
  12:  ndfa.TransitionTable.Add(new Rec<QState, char>(QState.q2, 'b'), QState.q3);
  13:   
  14:  TransitionFunction<QState, char, string> func =
  15:      delegate(INdfa<QState, char, string> idfa, QState q, QState qn, char i)
  16:      {
  17:          if (idfa.IsErrorState)
  18:              return "Error Occurred.";
  19:          return
  20:              string.Format("Transitioned from {0} to {1} because of input '{2}' ({3})", q,
  21:                            qn, i, idfa.IsInAcceptState ? "Accept State" : "Non-Accept State");
  22:      };
  23:   
  24:  ndfa.TransitionFunctions.Add(new Rec<QState, QState>(QState.q0, QState.q1), func);
  25:  ndfa.TransitionFunctions.Add(new Rec<QState, QState>(QState.q0, QState.q2), func);
  26:  ndfa.TransitionFunctions.Add(new Rec<QState, QState>(QState.q1, QState.q3), func);
  27:  ndfa.TransitionFunctions.Add(new Rec<QState, QState>(QState.q2, QState.q3), func);
  28:   
  29:  foreach (string output in ndfa.ProcessInput("ab".ToCharArray()))
  30:  {
  31:      Debug.WriteLine(output);
  32:  }

Example 1: Using the NDFA

This sample implements a simple state machine that diverges into two states and then converges back into a single accepting state.

Being a generic class, it can work just as well with chars, ints or enums for the state. My example above uses a simple enum called QState, plus a comparer that allows states to be stored in an ordered tree collection for quick state transitions:

public enum QState : int
{
    err,
    q0,
    q1,
    q2,
    q3
}
Example 2. The states used by the NDFA
 
The Rec<A,B> class is a record class (a tuple) defined in C5 for use with associative containers such as dictionaries. I based my comparer on Rec<Q,Q> because I needed it to order the transition table, which stores the one-to-many mappings from state to state.
 
public class QStateComparer<Q> : IComparer<Rec<Q, Q>>
{
    public int Compare(Rec<Q, Q> x, Rec<Q, Q> y)
    {
        int a = (13 * Convert.ToInt32(x.X1)) + Convert.ToInt32(x.X2);
        int b = (13 * Convert.ToInt32(y.X1)) + Convert.ToInt32(y.X2);
        return a - b;
    }
}

Example 3. A comparer to allow QState to be used with the C5 TreeSet, HashBag and HashDictionary collections.

In Example 1, line 14, I use an anonymous delegate to create a ‘transition function’. Sorry for the confusing terminology: formally, the transition function is the function that determines which state to move to next. In my case, though, I have augmented the NDFA so that a delegate is invoked as each transition is made, which lets the NDFA do useful work as it goes. The function defined on line 14 simply reports what happened to cause the transition, without doing anything else.
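For anyone curious about the ‘bit more finesse’ the nondeterminism required, the essential trick is to track the set of states the machine could currently be in, and to map that whole set forward on each input symbol. The sketch below uses plain framework collections and made-up names rather than the C5 types and the delegate plumbing of the real class, so treat it as an outline of the idea only:

using System;
using System.Collections.Generic;

// A stripped-down sketch of the nondeterministic step only; the downloadable
// class adds error states, per-transition delegates and the C5 collections.
public class TinyNdfa<Q, I>
{
    // (state, input) -> all the states that pair can lead to.
    private readonly Dictionary<KeyValuePair<Q, I>, List<Q>> transitions =
        new Dictionary<KeyValuePair<Q, I>, List<Q>>();

    public Q StartState;
    public readonly List<Q> AcceptStates = new List<Q>();

    public void AddTransition(Q from, I input, Q to)
    {
        KeyValuePair<Q, I> key = new KeyValuePair<Q, I>(from, input);
        List<Q> targets;
        if (!transitions.TryGetValue(key, out targets))
            transitions[key] = targets = new List<Q>();
        targets.Add(to);
    }

    // True if any state reachable after consuming the whole input is accepting.
    public bool Accepts(IEnumerable<I> input)
    {
        List<Q> current = new List<Q>();
        current.Add(StartState);

        foreach (I symbol in input)
        {
            List<Q> next = new List<Q>();
            foreach (Q state in current)
            {
                List<Q> targets;
                if (transitions.TryGetValue(new KeyValuePair<Q, I>(state, symbol), out targets))
                    foreach (Q target in targets)
                        if (!next.Contains(target))
                            next.Add(target);
            }
            current = next; // an empty set here plays the role of the error state
        }

        foreach (Q state in current)
            if (AcceptStates.Contains(state))
                return true;
        return false;
    }
}

Fed the same machine as Example 1 (q0 splits on ‘a’ into q1 and q2, and both converge on ‘b’ into the accepting q3), Accepts("ab") returns true.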

New LINQ Wiki Site Started

Troy Magennis has launched a wiki for LINQ, called Hooked on LINQ.

Please visit it, and enrich it! It might become a great source of information in the future, but only if we use it now. At the moment it needs content, and if you have written on the topic, please go there and at the very least link back to what you have written.