Domain Modeling and Ontology Engineering
The semantic web is poised to influence us in ways that will be as radical as the early days of the Internet and World Wide Web. For software developers it will involve a paradigm shift, bringing new ways of thinking about the problems that we solve, and more-importantly bringing us new bags of tricks to play with.
One of the current favourite ways to add value to an existing system is through the application of data mining. Amazon is a great example of the power of data mining; it can offer you recommendations based on a statistical model of purchasing behaviour that are pretty accurate. It looks at what the other purchasers of a book bought, and uses that as a guide to make further recommendations.
What if it were able to make suggestions like this: We recommend that you also buy book XYZ because it discusses the same topics but in more depth. That kind of recommendation would be incredible. You would have faith in a recommendation like that, because it wasn’t tainted by the thermal noise of purchaser behaviour. I don’t know why, but every time I go shopping for books on computer science, Amazon keeps recommending that I buy Star Trek books. It just so happens that programmers are suckers for schlock sci-fi books, so there is always at least one offering amongst the CompSci selections.
The kind of domain understanding I described above is made possible through the application of Ontology Engineering. Ontology Engineering is nothing new – it has been around for years in one form or another. What makes it new and exciting for me is the work being done by the W3C on semantic web technologies. Tim Berners-Lee has not been resting on his laurels since he invented the World Wide Web. He and his team have been producing a connected set of specifications for the representation, exchange and use of domain models and rules (plus a lot else besides). This excites me, not least because I first got into Computer Science through an interest in philosophy. About 22 years ago, in a Sunday supplement newspaper a correspondent wrote about the wonderful new science of Artificial Intelligence. He described it as a playground of philosophers where for the first time hypotheses about the nature of mind and reality could be made manifest and subjected to the rigours of scientific investigation. That blew my mind – and I have never looked back.
Which brings us to the present day. Ontology engineering involves the production of ontologies, which are an abstract model of some domain. This is exactly what software developers do for a living, but with a difference. The Resource Description Framework (RDF) and the Web Ontology Language (OWL) are designed to be published and consumed across the web. They are not procedural languages – they describe a domain and its rules in such a way that inference engines can reason about the domain and draw conclusions. In essence the semantic web brings a clean, standardised, web enabled and rich language in which we can share expert systems. The magnitude of what this means is not clear yet but I suspect that it will change everything.
The same imperatives that drove the formulation of standards like OWL and RDF are at work in the object domain. A class definition is only meaningful in the sense that it carries data and its name has some meaning to a programmer. There is no inherent meaning in an object graph that can allow an independent software system to draw conclusions from it. Even the natural language labels we apply to classes can be vague or ambiguous. Large systems in complex industries need a way to add meaning to an existing system without breaking backwards compatibility. Semantic web applications will be of great value to the developer community because they will allow us to inject intelligence into our systems.
The current Web2.0 drive to add value to the user experience will eventually call for more intelligence than can practically be got from our massive OO systems. A market-driven search for competitiveness will drive the software development community to more fully embrace the semantic web as the only easy way to add intelligence to unwieldy systems.
In many systems the sheer complexity of the problem domain has led software designers to throw up their hands in disgust, and opt for data structures that are catch-all buckets of data. Previously, I have referred to them as untyped associative containers because more often than not the data ends up in a hash table or equivalent data structure. For the developer, the untyped associative container is pure evil on many levels – not least from performance, readability, and type-safety angles. Early attempts to create industry mark-up languages foundered on the same rocks. What was lacking was a common conceptual framework in which to describe an industry. That problem is addressed by ontologies.
In future, we will produce our relational and object oriented models as a side effect of the production of an ontology – the ontology may well be the repository of the intellectual property of an enterprise, and will be stored and processed by dedicated reasoners able to make gather insights about users and their needs. Semantically aware systems will inevitably out-compete the inflexible systems that we are currently working with because they will be able to react to the user in a way that seems natural.
I’m currently working on an extended article about using semantic web technologies with .NET. As part of that effort I produced a little ontology in the N3 notation to model what makes people tick. The ontology will be used by a reasoner in the travel and itinerary planning domain.
erson a owl:Class .
:Need a owl:Class .
eriodicNeed rdfs:subClassOf :Need .
:Satisfier a owl:Class .
:need rdfs:domain erson;
rdfs:range :Need .
:Rest rdfs:subClassOf :Need .
:Hunger rdfs:subClassOf :Need .
:StimulousHunger rdfs:subClassOf :Need .
:satisfies rdfs:domain :Satisfier;
rdfs:range :Need .
:Sleep a :Class;
rdfs:subClassOf :Satisfier ;
:satisfies :Rest .
:Eating a :Class;
:satisfies :Hunger .
:Tourism a :Class;
:satisfies :StimulousHunger .
In the travel industry, all travel agents – even online ones – are routed through centralised bureaus that give flight times, take bookings etc. The only way that an online travel agency can distinguish themselves is if they are more smart and easier to use. They are tackling the later problem these days with AJAX, but they have yet to find effective ways to be more smart. An ontology that understands people a bit better is going to help them target their offerings more ‘delicately’. I don’t know about you, but I have portal sites that provide you with countless sales pitches on the one page. Endless checkboxes for extra services, and links to product partners that you might need something from. As the web becomes more interconnected, this is going to become more and more irritating. The systems must be able to understand that the last thing a user wants after a 28 hour flight is a guided tour of London, or tickets to the planetarium.
The example ontology above is a simple kind of upper ontology. It describes the world in the abstract to provide a kind of foundation off which to build more specific lower ontologies. This one just happens to model a kind of Freudian drive mechanism to describe how people’s wants and desires change over time (although the changing over time bit isn’t included in this example). Services can be tied to this upper ontology easily – restaurants provide Eating, which is a satisfier for hunger. Garfunkle’s restaurant (a type of Restaurant) is less than 200 metres from the Cecil Hotel (a type of Hotel that provides sleeping facilities, a satisfier of the need to rest) where you have a booking. Because all of these facts are subject to rules of inference, the inference engines can deduce that you may want to make a booking to eat at the hotel when you arrive, since it will have been 5 hours since you last satisfied your hunger.
The design of upper ontologies is frowned upon mightily in the relational and object oriented worlds – it smacks of over-engineering. For the first time we are seeing a new paradigm that will reward deeper analysis. I look forward to that day