Month: November 2006

How would I spend my $100?

Joe Duffy is busy at work writing a book about concurrency in .NET. He asks how we would spend our $100, given a choice from a set of topics available. Well, I’d have to spend mine like this:

$5 – (0) Parallel algorithms (Comp Sci., technology agnostic)
$5 – (1) Architecting large scale concurrent programs
$5 – (2) Windows concurrency internals
$5 – (3) CLR concurrency internals
$5 – (4) Windows (Win32) concurrency API best practices
$5 – (5) CLR concurrency API best practices
$5 – (6) Client-side concurrency
$5 – (7) Server-side concurrency
$5 – (8) Reusable concurrency primitives (building and using them)
$5 – (9) Performance and scalability
$50 – (A) Debugging

And here’s why. Because debugging concurrent systems is Ruddy Hard!!!!! I wish I had $200 :-)

2c/

Automated parallelization of Sequential Systems

I recently came across the Dryad project at MS Research. It is concerned with the development of automated systems to deploy sequential programs across parallel platforms. This is an interesting and topical issue at the moment because the advent of multi-core processors poses problems for developers. C#, at the very least, has no inherent support for parallelization in the language. Developers have to explicitly code a system with threading or grid computing in mind. Coding in such a way is generally pretty hard, and fraught with peril for the unwary programmer, so any system that can ease the work required is likely to find its way into the .NET platform in the near future.

My impression from Dryad is that to make use of multi-core processors we may have to change the way we design systems to allow automated parallelization. The Dryad project talks about a multi-level description of a system that incorporates declarations of the system’s dataflow as well as the primitive operations to be performed on the data. It seems to be targeted more at the massively computationally intensive operations traditionally performed in giant vector processors. What does it mean for those of us who develop distributed applications on web farms? I’m not sure, but I think the trend will be the hot topic of the next few years.

At Tech Ed Sydney 2006, this issue was highlighted by Joel Pobar – he pointed out that the declarative elements of technologies like LINQ and F# provide a window of opportunity for the runtime to incorporate structures that will make the life of systems like Dryad easier. An interesting trend is the gradual productisation of automated analysis and proof systems. The coolest thing I’ve seen lately is the Spec# project, again from MS Research, which augments the C# IDE with various kinds of static analysis and automated theorem provers. For a while, I was a researcher on the RAPIER project at the Parallel Application Centre at the University of Southampton, UK. The project sought to integrate design, development, simulation and theorem-proving tools for the development of real-time systems. I can’t wait till such tools become the conventional static error checkers of IDEs like Visual Studio. It is discouraging that such tools are still not in the mainstream after 10 years of progress, during which time software systems have become exponentially more complex.
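
To make that concrete, here’s a minimal sketch in the C# 3.0 LINQ syntax (purely illustrative; the “readings” collection is just a stand-in I’ve made up). The point is that the query only declares what result is wanted and says nothing about how the work is scheduled, which is exactly the gap a runtime, or a system like Dryad, could exploit to spread the work across cores or machines.

using System;
using System.Collections.Generic;
using System.Linq;

class DeclarativeDataflow
{
    static void Main()
    {
        // A stand-in data source; in a real system this might be a
        // partitionable file or a distributed data set.
        IEnumerable<int> readings = Enumerable.Range(1, 1000);

        // No loops and no threads here: just a declaration of the dataflow.
        // Nothing in this query pins it to a single core.
        int sumOfEvenSquares =
            (from r in readings
             where r % 2 == 0
             select r * r).Sum();

        Console.WriteLine(sumOfEvenSquares);
    }
}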

No indestructible Lucite Shipit Awards any More?

I have to admit that my impression of what life must be like inside of Microsoft stems mostly from reading Microserfs by Douglas Coupland. As a result, I was disappointed to discover that when stuff gets shipped these days, they give each other cakes instead of indestructible Lucite plaques.

Here’s the proof:

My guess is they probably didn’t tie it to the back of a car and drive around the block either.

Boo.com to reincorporate: one for you Aggy

It appears that Boo has reincarnated to drag us down, once again. Of course, it won’t be possible without the help of Aggy Finn who was with them at the time*.

This is how he escaped last time:

If the contractor wages go down to 10GBP/hr again, Aggy, I shall be blaming you!!!!!

* Needless to say, most of this is a pack of lies. Especially the bit about Aggy being responsible.

Tax on British Stupidity gets Hiked

According to this BBC report, the UK economy is losing billions to those Nigerian scams. Even though I don’t need to explain which scams to those of you who read this – they are proverbial – there are unspeakably avaricious and gullible cretins out there who still send their money into the Nigerian black economy.

I suppose if these people are moronic enough to think that the son/daughter of the Nigerian Minister for transport/embezzlement/finance would single them out from the billions of other candidates to share their booty with, then they deserve what they get.

Besides, it’s more sociable than playing on slot machines, or having your mobile stolen – you get to exchange emails with these nice but desperate scions of once great African families.

Excellent Article on the Concepts behind C# 3.0

Tomas Petricek has written a very interesting article on the new concepts behind C# 3.0 (here). It shows the origin of many of the functional programming features found in C# 3.0 in functional languages like F#. Having explored a little of the code that backs up the functional programming aspects, I understand that although the extensions run on the basic features of C# 2.0, there is a huge amount of C# code required to deliver the functional paradigm to C#. Most of that code provides complex code generation, type inference and declarative programming support.

In the first section, on first class function support, I found on closer inspection (within LINQ at least) that these first class functions are actually delivered through DynamicMethod in System.Reflection.Emit. If you disassemble the code, you’ll see that the relationship between imperative and functional programming in C# is through ‘runtime support‘. The functional programming extensions are a runtime extension to the CLR that generates code to fulfill declarative requirements. That is, there’s no radical new paradigm at work in the core code, but the way it’s exposed will simplify things so much that it might as well be called a new paradigm.
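
As a small illustration of that runtime support (a sketch of my own, not a dissection of the actual LINQ internals), the same lambda can be handed to the compiler as an ordinary delegate or as an expression tree, and in the latter case executable code is only generated when you ask for it at runtime:

using System;
using System.Linq.Expressions;

class FirstClassFunctions
{
    static void Main()
    {
        // Compiled up front by the C# compiler into an ordinary method plus a delegate.
        Func<int, int> squareDelegate = x => x * x;

        // Kept as data: a tree describing the computation rather than the computation itself.
        Expression<Func<int, int>> squareExpression = x => x * x;

        // Runtime code generation happens here.
        Func<int, int> compiled = squareExpression.Compile();

        Console.WriteLine(squareDelegate(6));   // 36
        Console.WriteLine(compiled(7));         // 49
    }
}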

Well worth a read.

A Speculative Notation

 

About 6 years ago I read a book by Edward de Bono, modestly called the de Bono Code Book. His idea was to devise a representation for thoughts and statements that went beyond the simple propositions of predicate logic. Initially, I was intrigued by the idea that I could learn to represent ideas in a way that was terse and unambiguous, and that enhanced my ability to reason on paper. I quickly became disillusioned with the notation because it was both arbitrary and incomprehensible; there was no obvious way (as far as I could make out) of representing an arbitrary thought. It was geared more to making statements about internal emotional states than to giving a way to represent any idea. I decided that it might be a futile effort to even try, but I had a go at producing one for myself.

De Bono’s notation used a string of digits each enhancing the meaning of preceding digits. Codes would be represented as a string of digits read from left to right that started with the general universe of discourse and elaborated upon the speaker’s feelings in that context. I felt it was impossible to create subtle distinctions in this notation, and even harder to see the similarities in related statements. I also felt that it ought to be possible to treat the notation as a kind of programming language of thought. I came up with a little notation that grew into something not unlike an ontological markup language.

The notation has an internal structure that is invariant: [ subject | verb | object ]. The structure of all utterances follows this format, except in the case where the subject is yourself (and you are writing to yourself), where it can be omitted if necessary. I augmented that notation by allowing the adornment of the three components of a statement with ‘glyphs‘ that modified the meaning of the nouns and verbs of the statement. Glyphs could indicate certainty of belief or probability, as in a modal logic, as well as placing the elements in space and time. I quickly needed to add set definitions and existential quantifiers, since many language utterances end up defining sets and their properties. Don’t believe me? Look at the previous sentence: “many language utterances end up defining sets and their properties“. I’m defining a property on the set of all utterances in a natural language! It’s not surprising that set theory was paramount among the early attempts at creating universal languages.

For the sake of consistency in my set definitions, I just borrowed notations from propositional and predicate logic, since they were terse, unambiguous and widely adopted. I was influenced by the formal methods courses on Z at university, where we nerdily tried to find ways to tell jokes in Z. I even managed to formally specify the song My Old Man’s a Dustman by Lonnie Donegan. Like I said, it was a fun but nerdy pastime. We found ourselves doing that when we studied object oriented languages too. So very soon I introduced set membership and whole-part relationships using the object oriented ‘dot’ notation. The end result was a mélange of mathematics, object oriented notation, set theory and regular expressions. It had extensions to allow it to be used easily in the arena of human thought, where there are no certainties and thought has declarative and imperative modes.

What follows is a little explanation of the system that I posted on Everything2 back in 2000. It describes the notation and how it would be used. It also alludes to a non-formal bootstrapped specification of the notation in a subset of the notation. I still haven’t come up with a good name for it, so at the time I dubbed it ‘N‘ for Notation, for want of a better name. In subsequent years there has been a lot of activity at W3C on the topic of ontologies. RDF arose as a popular representation of ontologies that could be expressed in a machine-processable form. RDF also follows a 3-tuple (triple) format, with relationships defined in a similar way to this notation. Where the two deviate is that my notation is designed to be produced and consumed by humans. It seeks to pile as much meaning into a proposition as possible; it doesn’t use multiple statements to elaborate the original relationship in space, time, etc. You could unravel statements in this notation into RDF, and I’ve often thought about creating parsers to do just that. I’ve even produced formal grammars for the language.
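
As a taste of what such a parser might look like, here’s a hypothetical sketch in C#: it only splits a bare ASCII statement of the form [ subject | verb | object ] into a 3-tuple, which would be the first step towards unravelling statements into RDF-style triples. The NTriple type is made up for illustration, and glyph handling is left out entirely.

using System;

class NTriple
{
    public string Subject;
    public string Verb;
    public string Object;

    // Parse a bare ASCII statement of the form [subject | verb | object],
    // ignoring glyphs and set definitions for the sake of the sketch.
    public static NTriple Parse(string statement)
    {
        string body = statement.Trim().TrimStart('[').TrimEnd(']');
        string[] parts = body.Split('|');
        if (parts.Length != 3)
            throw new FormatException("Expected [subject | verb | object]");

        return new NTriple
        {
            Subject = parts[0].Trim(),
            Verb = parts[1].Trim(),
            Object = parts[2].Trim()
        };
    }
}

class Demo
{
    static void Main()
    {
        NTriple t = NTriple.Parse("[I | drive_p | car]");
        Console.WriteLine("{0} / {1} / {2}", t.Subject, t.Verb, t.Object);
        // Prints: I / drive_p / car
    }
}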

I’ll post a full example, plus a bootstrapped spec in my next post. Meanwhile, this will give you a little taste of what it’s like. And if you can think of a good name for it, please leave a comment.

 

A notational system that combines features from linguistics, logic and object oriented programming to allow complex expressions to be uttered about everyday things.

Use is made of structures and glyphs that allow the gross structure of a sentence to remain constant while the specific context and meaning of the utterance is modified by the glyphs. Such constancy of form allows the meaning to be easily captured, without the complexity required by the syntactic transformations of languages like English.

Uses:

  • Speculative thought processes
  • Semi/Formal program analysis (Formal Methods)
  • Cleanroom Software Engineering Intended Functions
  • Dense notetaking
  • Semantic analysis of Natural Language Utterances
  • bootstrapping itself

Glyphs and Structures

The structures of N allow you to specify the causal structure of a thought, and then adorn its components with glyphs that specify more precisely the relationship between the bare content of the causal sequence described in the structure and the utterer of the sentence.

Formats exist for writing the notation à la mathematics using TeX (still very crude) and for using plain ASCII to communicate in a format such as this:

[N.structure_* | describe | causal sequence_*]

[N.glyph_* | adorn | N.structure.entity_*]

[N.glyph_* | modify_? | A.N.semantic]

 

Justification

Structures are employed to encode the causal structures that underlie utterances used in non-poetic (i.e. factual or working) language. Although linguistics distinguishes four broad types of utterance (phatic communion, informative, imperative, interrogative) according to the purposes to which they are put, it is possible to ignore phatic communion, since it acts more like a protocol hand-shaking system than a means of communicating. N exploits the regularity in the format of utterances in natural languages. That regularity is the grammar(s) of the language. I think that the grammar of a language reflects a mental model of causation in the brain of the speaker. All descriptive utterances in a natural language describe either static or dynamic situations. A dynamic state of affairs is the transition from one state of a system to another. A static description describes the structure and properties of a system at a specific point in time. “Now” is the default time for most utterances. In general usage a static description would just be called a description; a dynamic description would be referred to as a description of an event.

As in English, N is able to render both static and dynamic descriptions, even within the same sentence. The structures used to describe events have three parts. They reflect the mental model that we use to recognize events. The structure contains a subject, verb and object. Natural languages vary in the order in which these entities come within a sentence. Many languages, English included, allow the ordering of these entities to vary. Or, to put it another way: English, the order of entities to vary will allow. N enforces a single order for entities in each utterance. The result is that it requires less work to interpret an utterance in N.

English, and all other natural languages, employ syntactic and morphemic transformations to the structure of a sentence, in order to specify details about the time, place, plurality, performer, and subject of an event. In so doing the gross structure of the sentence can be modified radically. For example, consider the sentence “I will drive the car.” and the sentence “I drove the car“. The transformation means that, to a casual glance, the sentences are different. On closer inspection, one can plainly see that they are the same sentence, modified to place the event in the future or past respectively. Similarly, “I am driving the car” incorporates new words and structures to indicate that the event is occurring in the present. The main transformations are to the verb “drive“. The verb is altered to indicate the tense of its operation. To adapt the tense to English sentence structure other syntactic changes must be incorporated. Both N and natural languages specify the role of the performer and subject of an event, but N does so explicitly, whereas natural languages often only give clues.

Dual Purpose Language

Frequently glyphs are used to clarify the relationship between the entities referred to in the bare content of the structure and the speaker of the utterance at the time and place that they uttered it. Language frequently serves two purposes then. The first is to describe an event, and the second is to describe the speaker’s relationship to it. That relationship can be spatio-temporal, moral, psychological etc.

Constant Structure Aids Gestalt Understanding

N is, in the formal presentation below, referred to as “N”. N uses glyphs (or graphical symbols) to adorn the basic present-tense form of a verb to indicate the tense modification. By doing so it does not require structural modifications to indicate the change in meaning of the sentence. For example the three driving examples given above would be represented in N as:

[I | drive_p | car] == “I drove the car.”

[I | drive_n | car] == “I am driving the car.”

[I | drive_f | car] == “I will drive the car.”

Speculativeness Aids Critical Thought

N has the advantage that, with the use of glyphs, one is able to progressively adorn the sentence with more meaning without changing the basic structure at all. N does not allow the writer to adorn a structure element with contradictory glyphs. In some cases one is able to give a visual representation to an English sentence structure that is not representable without additional glyphs of its own! Consider the following example.

[We | produce_f | Documentation] == “We will produce the documentation.”

 

might be adorned with the not-sure (non-normative) glyph, which modifies the sentence to indicate that the item adorned is unsure in our minds:

[We_? | produce_f | Documentation] == “Will ‘we produce the documentation.”

[We | \future{produce_?} | Documentation] == “Will we ‘produce the documentation.”

[We | produce_f | Documentation_?] == “Will we produce the ‘documentation.”

 

Under normal circumstances (N aside) one seldom inserts a linguist’s prime accent (‘) glyph into a sentence to stress a specific word. A prime accent requires active interpretation since its meaning can often be implicit and vague. There is no specific meaning to the prime accent either, so adornment with a prime will not enable the reader to comprehend what the prime signifies at a glance. So, as an alternative one has to adorn the whole sentence with an explanation:

[We_? | produce_f | Documentation] == “Is it us that will produce the documentation?”

[We | produce_f? | Documentation] == “Will we produce the documentation, or not produce it?”

[We | produce_f | Documentation_?] == “Will we produce the documentation, or something else?”

 

Clearly, the sentence has had to be expanded, modified, and adorned with the ‘?’ glyph anyway! Therefore it is safe to conclude that using glyphs is potentially much more powerful in what it can represent in a given space. It has the potential to allow one to represent something that might otherwise have been left ambiguous, which is a critical failure in all formal documents as well as other forms of communication. N may prove to be useful for, say, SMS messages, where space is at such a premium that one already compresses speech beyond intelligible limits ;^}

An additional benefit of this adornment of static structures is that one is quickly able to view the “hidden” assumptions in a sentence. As in the example above, one can look at the basic units of meaning and their adornments and decide quickly whether one is completely sure that they should be there, and what part they play in the statement or instruction represented. Additional adornments allow the use of normative statements like “should we produce the documentation”. In this situation, as well, one can see the need to pull the sentence apart to signify which thing is open to choice.

For these reasons, I was tempted to call N a “speculative notation”, since it allows its user to quickly root out the nature of their own misunderstandings and address them with some sort of thought process. If the user is of a logical bent then they could quite easily use N within a formal proof framework. It has the benefit, though, that it is able to talk cleanly about complex human issues as well.

More Flexible Determiners Support Richer Thought Streams

When dealing with issues of non-trivial complexity, it is seldom possible to elucidate a train of thought without referring to more than one premise or sub-conclusion. The English language allows for just one level of “transferred subject”. By transferred subject I mean one that was previously the subject or object of a sentence. The transferred subject has to have been the subject or object of the immediately previous sentence. For example, consider the following sentences.

“N allows any number of determiners, over any scope. It also allows them to be modified in the same way as variables.”

The first sentence is easy:

[N | allow | determiner_* == d]

 

The second might be rendered with a structure such as the following:

[it | modify | them]=>[it.determiners =~ variables]

 

Obviously, the solution to disambiguating the English determiner ‘it’ is to substitute N for ‘it’ and the determiners (the set d defined above) for ‘them’.

[N | modify | determiner]=>[N.determiners =~ variables]

 

This system works in exactly the same way as the determiners of English (the, that, it, them, etc). But consider what level of expansion would be required for the following group of statements

[consumer == C | receive_n | data]@local_host

[producer == P | transmit_p | data]@remote_host

[{P & C}.connection = "peer to peer connection"] == PPP

[{P & C & PPP} | provide_f | OSI.layer one]

 

These propositions (whether accurate or not ;) form the structure of a set of statements about the basic elements of a TCP/IP connection. The final statement refers to the set of producer, consumer and peer to peer connection as the subjects of its statement. The obvious translation would be something like “they provide an OSI layer”. But the previous statement is likely to have used “they” to refer to the set of P and C, so to use “they” again when a new entity has been defined would confuse the situation; therefore the speaker is unlikely to use it. They will probably have to refer to all of the entities in the set explicitly, by enumerating them. The determiners “them” and “that” refer to the objects of previous sentences; the determiners “he”, “she”, “they” and “it” refer to the subjects of previous sentences.

It can be seen that such variable-like determiners allow the simplification of arguments, as well as greater brevity and specificity when referring to a previous entity. In addition, the determiners allow something like

[bob | drive_p == D | Car]

 

Which we can then augment with a following statement

[D == [drive = {unsafe & reckless & too fast}]]

 

English does not allow the use of determiners to refer to verbs within previous sentences, without explicitly referring to the verbs as an object, which would confuse the intent of the sentence altogether.

Keeping a Developer’s Lab Book

Despite having used a laptop at work for years, I’ve always kept a paper notebook by my side. I depend on these notebooks. I even purchased the whole UK supply of Paperchase’s 500-page, leatherette, squared, rounded notebooks. These beauties are un-dog-ear-able, perfect for UML diagrams, lightweight and relatively inexpensive. The supply in the UK was dwindling when I discovered them in Cambridge. Apparently they were manufactured in Japan, and the Americans were buying whatever stocks were left. I acted fast and bought the outstanding stock from all the major branches in the UK. It only filled a small box, but it ought to keep me supplied with notebooks for about the next 15 years. I doubt that, with current screen resolutions, tablet PCs are going to topple them in my affections any time soon.

Anyway, I soon found that it’s all very well to have a notebook, but to make good use of it you should treat it as a kind of lab book. When I am stumped by a problem that I have been trying to tackle, I often find that deadline-induced panic can lead me to blindly try every possible solution, one after the other, churning the code up and getting me nowhere. I find that at times like those I can enforce a bit of discipline on myself by using a lab book methodology to state the problem and work through to the solution. Obviously, the kind of information that a developer and a scientist at the bench need to maintain is very different, but the nature of what they do is similar. They march out into the unknown, armed with nothing to defend them but what they already know and some discipline. The timescales that developers are expected to deliver results in are much shorter, which is why they tend to panic and cut corners more often.

I start out by stating the problem. The key thing here is not to state what you think is causing the problem. Don’t say “the XML file is not well formed“; put “I can’t load the config file” or, better still, “the program won’t start“. A lot of the advice for keeping a lab book stems from laboratory work, where to make an experiment repeatable you must keep notes on what you are going to do and what the outcome was. Similarly, in a programmer’s notebook you need to keep track of what you did and the outcome, in order to be able to rule out paths of enquiry. I generally tend to use the following headings:

  1. PROBLEM
  2. KNOWN
  3. IDEAS
  4. TESTS
  5. QUESTIONS

Each of these helps you to keep track of what you know about the problem, what ideas you had, and how they panned out. I’ve used this successfully on both bug solving and design issues.

In the PROBLEM area I make a statement of what the problem is, without making any assumptions about the cause of the problem. That can often lead me to make pretty dumb statements in this section initially. Later on, when you know more you can extend this or revise it with a more accurate statement of the problem. The key thing is not to prejudice the whole problem solving process by ruling out whole lines of enquiry prematurely.

After stating the problem, I make as many entries in the KNOWN section as I can. These will be bare statements of what I can be absolutely sure is true. Generally in the course of diagnosing a bug, I will already have tried a few things, before I resort to the lab book. I take note of these, plus anything else I know, such as requirements or constraints in the case of designs. Looking at these will eventually force you to produce a few ideas. These go in the IDEAS section.

Eventually, whether bug-squashing or designing, you will be faced with a problem and you’ll be at a loss. You need to get your head around the problem. That’s what the KNOWN section helps you to do. Ultimately, and inevitably, you will have an idea (probably lots of them). You write them down in the IDEAS section. This can be any kind of prejudiced statement you like. It is the correct place for statements like “The XML file is not well formed”. It’s a hypothesis that you will need to check out. In the case that ideas come like buses, you might want to note them all at once. Then check each one at a time, or pursue a promising line of enquiry first, then go back to the ideas backlog if you don’t get anywhere. You might also find that while your idea is easy to formulate, it can be hard to test. That’s where the TESTS section comes in.

In TESTS you describe ways to validate the hypothesis you made in the ideas section. You use these to make additions to the KNOWN section. Every time you perform one of the tests you should be able to add something to the KNOWN section. If you weren’t able to add anything to KNOWN, then your test was wasted. Quite often one of these tests will yield the solution you are after. If you are designing, these tests might be in the form of a proof of concept for a design or idiom.

If all else fails you need to start thinking about what you don’t know. The QUESTIONS section lets you use questions to prompt new entries in the KNOWN section, or to devise new tests that will yield something to put there. It is a kick-start for your imagination – I always find that if I get to this stage my questions never struggle to come out, and before long they start turning into ideas.

You can think of it as an algorithm for generating knowledge about a domain. It’s very simple, and not exactly the heavyweight scientific method, but I know that when you get this method out, there isn’t a problem you can’t solve.

labbook1.PNG
Figure 1. A little flowchart showing how you feed information into the known section.
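
If you prefer code to flowcharts, here’s the same loop as a purely illustrative sketch in C#. The LabBook type is made up for the example; the point it captures is that every test is expected to feed the KNOWN section.

using System;
using System.Collections.Generic;

class LabBook
{
    public string Problem;
    public List<string> Known = new List<string>();
    public List<string> Ideas = new List<string>();
    public List<string> Tests = new List<string>();
    public List<string> Questions = new List<string>();

    // Record a test and its outcome: if a test doesn't add a fact to KNOWN,
    // it was a wasted test.
    public void RecordTest(string test, string outcome)
    {
        Tests.Add(test);
        Known.Add(outcome);
    }
}

class Demo
{
    static void Main()
    {
        var book = new LabBook { Problem = "The program won't start" };
        book.Known.Add("It ran yesterday, before the config change");
        book.Ideas.Add("The config file is not well formed");
        book.RecordTest("Open the config file in an XML editor",
                        "The config file parses cleanly, so the XML is fine");
        book.Questions.Add("What else changed since yesterday?");
        Console.WriteLine("Facts known so far: {0}", book.Known.Count);
    }
}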


Do legislators have a performance related pay scheme, or what?

If you needed a more graphic demonstration that legislators have too much time on their hands, read this. Apparently, trading standards people in the UK have forced a name change on a Welsh product called Dragon Sausages. Now, anyone but a complete imbecile (such as a trading standards officer, perhaps?) knows that a dragon features on the Welsh flag, and all Welsh people will know this, so they will be in no confusion about the nature of the product. Not so the Trading Standards Office, apparently.

There are three explanations I can think of: 1. the Trading Standards Office is peopled by morons; 2. they think the general population are morons who still believe the world is populated by fantastical beasts; or 3. they have a performance-related pay scheme under which they have to produce a steady stream of rules for the rest of us to live by.

Any one of these alternatives is scary in the extreme. I think it may be a combination of all three.