Domain Modeling and Ontology Engineering


The semantic web is poised to influence us in ways that will be as radical as the early days of the Internet and World Wide Web. For software developers it will involve a paradigm shift, bringing new ways of thinking about the problems that we solve and, more importantly, new bags of tricks to play with.

One of the current favourite ways to add value to an existing system is through the application of data mining. Amazon is a great example of the power of data mining; it can offer you recommendations based on a statistical model of purchasing behaviour that are pretty accurate. It looks at what the other purchasers of a book bought, and uses that as a guide to make further recommendations.

What if it were able to make suggestions like this: We recommend that you also buy book XYZ because it discusses the same topics but in more depth. That kind of recommendation would be incredible. You would have faith in a recommendation like that, because it wasn’t tainted by the thermal noise of purchaser behaviour. I don’t know why, but every time I go shopping for books on computer science, Amazon keeps recommending that I buy Star Trek books. It just so happens that programmers are suckers for schlock sci-fi books, so there is always at least one offering amongst the CompSci selections.

The kind of domain understanding I described above is made possible through the application of Ontology Engineering. Ontology Engineering is nothing new – it has been around for years in one form or another. What makes it new and exciting for me is the work being done by the W3C on semantic web technologies. Tim Berners-Lee has not been resting on his laurels since he invented the World Wide Web. He and his team have been producing a connected set of specifications for the representation, exchange and use of domain models and rules (plus a lot else besides). This excites me, not least because I first got into Computer Science through an interest in philosophy. About 22 years ago, in a Sunday supplement newspaper a correspondent wrote about the wonderful new science of Artificial Intelligence. He described it as a playground of philosophers where for the first time hypotheses about the nature of mind and reality could be made manifest and subjected to the rigours of scientific investigation. That blew my mind – and I have never looked back.

Which brings us to the present day. Ontology engineering involves the production of ontologies, each of which is an abstract model of some domain. This is exactly what software developers do for a living, but with a difference. The Resource Description Framework (RDF) and the Web Ontology Language (OWL) are designed to be published and consumed across the web. They are not procedural languages – they describe a domain and its rules in such a way that inference engines can reason about the domain and draw conclusions. In essence the semantic web brings a clean, standardised, web-enabled and rich language in which we can share expert systems. The magnitude of what this means is not clear yet but I suspect that it will change everything.

The same imperatives that drove the formulation of standards like OWL and RDF are at work in the object domain. A class definition is only meaningful in the sense that it carries data and its name has some meaning to a programmer. There is no inherent meaning in an object graph that can allow an independent software system to draw conclusions from it. Even the natural language labels we apply to classes can be vague or ambiguous. Large systems in complex industries need a way to add meaning to an existing system without breaking backwards compatibility. Semantic web applications will be of great value to the developer community because they will allow us to inject intelligence into our systems.

The current Web 2.0 drive to add value to the user experience will eventually call for more intelligence than can practically be got from our massive OO systems. A market-driven search for competitiveness will drive the software development community to more fully embrace the semantic web as the only easy way to add intelligence to unwieldy systems.

In many systems the sheer complexity of the problem domain has led software designers to throw up their hands in disgust, and opt for data structures that are catch-all buckets of data. Previously, I have referred to them as untyped associative containers because more often than not the data ends up in a hash table or equivalent data structure. For the developer, the untyped associative container is pure evil on many levels – not least from performance, readability, and type-safety angles. Early attempts to create industry mark-up languages foundered on the same rocks. What was lacking was a common conceptual framework in which to describe an industry. That problem is addressed by ontologies.
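
To make the complaint concrete, here is a minimal sketch in Python of the difference between a catch-all bucket and a typed model. The Booking class, its fields and the sample values are all made up for illustration; they do not come from any real system.

```python
from dataclasses import dataclass

# Untyped associative container: any key, any value, nothing is checked.
# Note the misspelt key and the string where a number belongs (both deliberate).
booking_bag = {"hotel": "Cecil Hotel", "nights": "three", "arival": "2007-05-01"}


@dataclass(frozen=True)
class Booking:
    """Hypothetical typed equivalent of the bag above."""
    hotel: str
    nights: int
    arrival: str

    def __post_init__(self):
        # Reject values that the bag would have swallowed silently.
        if not isinstance(self.nights, int) or self.nights < 1:
            raise ValueError(f"nights must be a positive integer, got {self.nights!r}")


def booking_from_bag(bag):
    """Convert a bag into a Booking, rejecting unknown or missing keys."""
    expected = {"hotel", "nights", "arrival"}
    unknown = set(bag) - expected
    missing = expected - set(bag)
    if unknown or missing:
        raise KeyError(f"unknown keys {sorted(unknown)}, missing keys {sorted(missing)}")
    return Booking(**bag)
```

Calling booking_from_bag(booking_bag) fails straight away on the misspelt key, whereas code reading the raw bag would only discover the problem much later, if at all. A typed class still says nothing about what a booking means to another system, which is the gap an ontology is meant to fill.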

In future, we will produce our relational and object oriented models as a side effect of the production of an ontology – the ontology may well be the repository of the intellectual property of an enterprise, and will be stored and processed by dedicated reasoners able to gather insights about users and their needs. Semantically aware systems will inevitably out-compete the inflexible systems that we are currently working with because they will be able to react to the user in a way that seems natural.

I’m currently working on an extended article about using semantic web technologies with .NET. As part of that effort I produced a little ontology in the N3 notation to model what makes people tick. The ontology will be used by a reasoner in the travel and itinerary planning domain.

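# Prefix declarations. The binding of the empty prefix is an assumption.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <#> .

# Core classes
:Person a owl:Class .
:Need a owl:Class .
:PeriodicNeed rdfs:subClassOf :Need .
:Satisfier a owl:Class .

# A person has needs
:need rdfs:domain :Person ;
    rdfs:range :Need .

# Kinds of need
:Rest rdfs:subClassOf :Need .
:Hunger rdfs:subClassOf :Need .
:StimulousHunger rdfs:subClassOf :Need .

# A satisfier satisfies a need
:satisfies rdfs:domain :Satisfier ;
    rdfs:range :Need .

# Concrete satisfiers and the needs they address
:Sleep a owl:Class ;
    rdfs:subClassOf :Satisfier ;
    :satisfies :Rest .
:Eating a owl:Class ;
    rdfs:subClassOf :Satisfier ;
    :satisfies :Hunger .
:Tourism a owl:Class ;
    rdfs:subClassOf :Satisfier ;
    :satisfies :StimulousHunger .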
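
To give a feel for how a program might consume statements like these, here is a rough Python sketch that treats the ontology as a set of subject, predicate, object triples and queries it by pattern. It is a hand transcription of the main statements with the prefixes dropped, not the output of a real N3 parser, and the match and satisfiers_of helpers are names I have invented for the example.

```python
# Hand-transcribed triples from the ontology (prefixes dropped for brevity).
TRIPLES = {
    ("PeriodicNeed", "subClassOf", "Need"),
    ("Rest", "subClassOf", "Need"),
    ("Hunger", "subClassOf", "Need"),
    ("StimulousHunger", "subClassOf", "Need"),
    ("Sleep", "subClassOf", "Satisfier"),
    ("Eating", "subClassOf", "Satisfier"),
    ("Tourism", "subClassOf", "Satisfier"),
    ("Sleep", "satisfies", "Rest"),
    ("Eating", "satisfies", "Hunger"),
    ("Tourism", "satisfies", "StimulousHunger"),
}


def match(pattern, triples=TRIPLES):
    """Return, in sorted order, the triples matching a (subject, predicate, object)
    pattern. None in any position acts as a wildcard."""
    return sorted(
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


def satisfiers_of(need):
    """Names of everything stated to satisfy the given need."""
    return [subject for subject, _, _ in match((None, "satisfies", need))]
```

For instance, satisfiers_of("Hunger") looks up every statement whose predicate is satisfies and whose object is Hunger, and returns the subjects. A real reasoner does far more than pattern matching, but this is the shape of the data it works on.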

In the travel industry, all travel agents – even online ones – are routed through centralised bureaus that give flight times, take bookings etc. The only way that an online travel agency can distinguish itself is by being smarter and easier to use. They are tackling the latter problem these days with AJAX, but they have yet to find effective ways to be smarter. An ontology that understands people a bit better is going to help them target their offerings more ‘delicately’. I don’t know about you, but I hate portal sites that bombard you with countless sales pitches on the one page: endless checkboxes for extra services, and links to product partners that you might need something from. As the web becomes more interconnected, this is going to become more and more irritating. The systems must be able to understand that the last thing a user wants after a 28-hour flight is a guided tour of London, or tickets to the planetarium.

The example ontology above is a simple kind of upper ontology. It describes the world in the abstract to provide a kind of foundation off which to build more specific lower ontologies. This one just happens to model a kind of Freudian drive mechanism to describe how people’s wants and desires change over time (although the changing over time bit isn’t included in this example). Services can be tied to this upper ontology easily – restaurants provide Eating, which is a satisfier for hunger. Garfunkle’s restaurant (a type of Restaurant) is less than 200 metres from the Cecil Hotel (a type of Hotel that provides sleeping facilities, a satisfier of the need to rest) where you have a booking. Because all of these facts are subject to rules of inference, the inference engines can deduce that you may want to make a booking to eat at the hotel when you arrive, since it will have been 5 hours since you last satisfied your hunger.
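
The chain of deductions in that example can be sketched as a toy forward-chaining loop in Python. This is only an illustration under stated assumptions: the type, provides and canSatisfy predicates, the two rules and the recurrence thresholds are all invented for the sketch, and the 200 metre distance is ignored. A real inference engine would work from the published ontology and rules instead.

```python
# Facts from the example, written as (subject, predicate, object) triples.
FACTS = {
    ("Garfunkle's", "type", "Restaurant"),
    ("Restaurant", "provides", "Eating"),
    ("Cecil Hotel", "type", "Hotel"),
    ("Hotel", "provides", "Sleep"),
    ("Eating", "satisfies", "Hunger"),
    ("Sleep", "satisfies", "Rest"),
}

# Hypothetical thresholds: hours after which a need is assumed to recur.
RECURS_AFTER_HOURS = {"Hunger": 4, "Rest": 16}


def infer(facts):
    """Apply two rules repeatedly until no new facts appear:
    1. x type C,     C provides S  =>  x provides S
    2. x provides S, S satisfies N =>  x canSatisfy N
    """
    known = set(facts)
    while True:
        new = set()
        for x, pred, obj in known:
            if pred == "type":
                new |= {(x, "provides", s) for c, p, s in known
                        if p == "provides" and c == obj}
            elif pred == "provides":
                new |= {(x, "canSatisfy", n) for s, p, n in known
                        if p == "satisfies" and s == obj}
        if new <= known:
            return known
        known |= new


def suggestions(hours_since_satisfied, facts=FACTS):
    """Named places that can satisfy a need which is due again."""
    known = infer(facts)
    individuals = {x for x, p, _ in known if p == "type"}
    due = {need for need, hours in hours_since_satisfied.items()
           if hours >= RECURS_AFTER_HOURS.get(need, float("inf"))}
    return sorted(x for x, p, n in known
                  if p == "canSatisfy" and n in due and x in individuals)
```

With five hours since the last meal and only two since the last sleep, suggestions({"Hunger": 5, "Rest": 2}) singles out Garfunkle's and leaves the hotel bed alone. Nobody ever stated that Garfunkle's can satisfy hunger; that fact falls out of the rules.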

The design of upper ontologies is frowned upon mightily in the relational and object oriented worlds – it smacks of over-engineering. For the first time we are seeing a new paradigm that will reward deeper analysis. I look forward to that day.


4 comments

  1. Thanks for this excellent introduction to ontology. I used to think about it a lot more in the context of modeling workflow dependencies, where data inputs and outputs were annotated with application characteristics and rules for how they could fit together. However, I found that the approach led to much “fine tuning” of the annotations and rules. I think that the users were in the end happier to remove or wildcard many of the annotations, and let incorrect or mistimed input cause a downstream application to crash and burn, and just deal with the results later.

    As an example of what I’m talking about, application A might produce output OA, application B might require input OA, and B might further require attributes set on OA like application version. And users would get around it by saying ‘*’ for application version, thus making the pair (A,B) into an “untyped associative” relationship of sorts. These annotations were usually set in a user script which would be processed to figure out the workflow and execute the required applications.

    I tried taking the pain away by abstracting the common dependencies and rules into a configuration file, which is where I think the ontology is in all this. I used to call this file “the ontology”. But it required a lot of maintenance. A lot. Of maintenance.

    Did I mention that it required a lot of maintenance?

    Most of the problem was that the users didn’t understand how to maintain the file, but perhaps I was approaching the problem in the wrong way. I wasn’t using an ontology or rules “engine” of any kind; maybe it would have been better to do so. Certainly your post has gotten me thinking about it again, and thank you for that!

  2. ggraham412,

    I’m glad you found the post useful, although you shouldn’t take what I say about ontology engineering too seriously – this post was long on sentiment and short on details!

    I’ve written extensively about configuration in the past (see the configuration tab above for links). I’ve got to admit that I’ve given some thought recently to how I could use an ontology to give a more intelligent description of a system and how it should work. I have great hopes of Microsoft’s DSI (Dynamic Systems Initiative) for providing us with a detailed description of the runtime environment of a system, and how it should behave. These are what I have called policy settings, and I’m still not convinced that they should be externalised, although a database-backed ontology is still preferable to the kind of untyped associative container that comes as default in environments like Java and .NET. One of the marks of progress over the last 10 years is the retreat of unstructured data stores in enterprise applications. Configuration files are the last bastion of this kind of ‘I dunno what to do with this’ stuff.

    Perhaps your configuration-based ontology was high-maintenance because it was not subject to the same rigours of analysis that you would have given it if it were an object model inside the system? I’m just guessing, of course. I have certainly learned over the years that ‘users’ cannot be trusted to obey rules of good form, or to deduce the spirit of a design. In fact, it’s pretty rare for other developers to follow a complex scheme without iron-clad rules of some sort. Perhaps you might have been better able to give them the control they needed through an administration console, where you could enforce the rules of good form with validation etc.

    The thing I was aiming at with this post was to introduce my readers to the idea that an upper ontology can be beneficial. Upper ontologies are really abstract, in the sense that they often describe abstract concepts or intangibles that are seldom modelled in an object oriented domain model. My little model in this post is not really that deep from a philosophical standpoint. But from an application development perspective it goes many levels deeper than a normal systems designer would bother with. When was the last time you saw a system that modeled the state of mind of the user?

    It’s becoming fashionable to do this sort of thing now though – emotional computers and systems are coming out of research labs with increasing regularity. Researchers are acknowledging that a task-focussed model produced by your average programmer does not mesh very well with the users it is written for. I hope that deeper ontologies can help with that, and they could well be one of the most facile ways that we as developers can exploit these ideas. An ontology could form the basis of a good object oriented design, and address the shortcomings in the user experience at the same time.

    That would be refreshing, eh?

    Andrew
