Computer Science

I’ve included a lot of stuff here that doesn’t really deserve to be here, such as programming and Web 2.0. But it’s how I think about the whole thing, so I guess it counts.

Automata-Based Programming With Petri Nets – Part 1

Petri Nets are extremely powerful and expressive, but they are not as widely used in the software development community as deterministic state machines. That’s a pity – they allow us to solve problems beyond the reach of conventional state machines. This is the first in a mini-series on software development with Petri Nets. All of the code for a full feature-complete Petri Net library is available online at GitHub. You’re welcome to take a copy, play with it and use it in your own projects. Code for this and subsequent articles can be found at http://github.com/aabs/PetriNets.


Functional Programming in C# – Higher-Order Functions

  1. Functional Programming – Is it worth your time?
  2. Functional Programming in C# – Higher-Order Functions

This is the second in a series on the basics of functional programming using C#. My topic today is one I touched on last time, when I described the rights and privileges of a function as a first class citizen. I’m going to explore Higher-Order Functions this time. Higher-Order Functions are functions that themselves take or return functions. Meta-functions, if you like.
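To make the definition concrete, here’s a trivial example of my own (not part of the series code): Twice takes a function and hands back a new function that applies it two times.

static Func<int, int> Twice(Func<int, int> f)
{
    // the returned lambda closes over f and applies it twice
    return x => f(f(x));
}

// Twice(n => n + 3)(10) == 16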

As I explained last time, my programming heritage is firmly in the object-oriented camp. For me, the construction, composition and manipulation of composite data structures is second nature. A higher-order function is the equivalent from the functional paradigm. You can compose, order and recurse a tree of functions in just the same way as you manipulate your data. I’m going to describe a few of the techniques for doing that using an example of pretty printing some source code for display on a web site.

I’ve just finished a little project at Readify allowing us to conduct code reviews whenever an interesting code change gets checked into our TFS servers. A key feature of that is pretty-printing the source before rendering it. Obviously, if you’re displaying XHTML source on an XHTML page, your browser will get confused pretty quickly unless you take steps to HTML-escape the markup that might corrupt the display. The examples I’ll show highlight the difference between the procedural and functional approaches.

This example shows a fairly typical implementation that takes a file that’s been split into lines:

public static string[] RenderLinesProcedural(string[] lines)
{
    for (int i = 0; i < lines.Count(); i++)
    {
      lines[i] = EscapeLine(lines[i]);
    }
    return lines;
}

public static string EscapeLine(string line)
{
  Debug.WriteLine("converting " + line);
  return line.Replace("  ", "&nbsp;&nbsp;")
      .Replace("\t", "&nbsp;&nbsp;")
      .Replace("<", "&lt;")
      .Replace(">", "&gt;");
}

There are a few things worth noticing here. In C#, strings are immutable. That means that whenever you think you are changing a string, you’re not: in the background, the CLR is constructing a modified copy of the string for you. The array of strings, on the other hand, is not immutable, so a legitimate procedural approach is to make an in-place modification of the original collection and pass it back. The EscapeLine method repeatedly makes modified copies of the line string, passing back the last copy.

Despite C# not being a pure functional programming language[1], it’s still doing a lot of copying in this little example. My early impression was that pure functional programming (where all values are immutable) would be inefficient because of all the copying going on. Yet here is a common-or-garden object-oriented language that uses exactly the same approach to managing data, and we all use it without a qualm. In case you didn’t know, StringBuilder is what you should be using if you need to make in-place modifications to strings.
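As an aside, here’s a minimal sketch (mine, not from the original example) of the same escaping done with a StringBuilder, which mutates a single buffer instead of allocating a new string for every Replace call:

public static string EscapeLineWithBuilder(string line)
{
    // StringBuilder.Replace edits the buffer in place and returns it for chaining
    var sb = new StringBuilder(line);
    sb.Replace("  ", "&nbsp;&nbsp;")
      .Replace("\t", "&nbsp;&nbsp;")
      .Replace("<", "&lt;")
      .Replace(">", "&gt;");
    return sb.ToString();
}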

Let’s run the procedural code and record what happens:

private static void TestProcedural()
{
   string[] originalLines = new string[] { "<head>", "</head>" };
   Debug.WriteLine("Converting the lines");
   IEnumerable<string> convertedStrings = RenderLinesProcedural(originalLines);
   Debug.WriteLine("Converted the lines?");

   foreach (string s in convertedStrings)
   {
      Debug.WriteLine(s);
   }
}

Here’s the output:

[Debug output: every "converting ..." line is written before "Converted the lines?"]

As you can see, the lines all got converted before we even got to the “Converted the lines?” statement. That’s called ‘Eager Evaluation’, and it certainly has its place in some applications. Now let’s use Higher-Order Functions:

public static IEnumerable<string> RenderLinesFunctional(IEnumerable<string> lines)
{
    return lines.Map(s => EscapeString(s));
}

static IEnumerable<R> Map<T, R>(this IEnumerable<T> seq, Func<T, R> f)
{
   foreach (var t in seq)
     yield return f(t);
}

static string EscapeString(string s)
{
   Debug.WriteLine("converting " + s);
   return s.Replace("  ", "&nbsp;&nbsp;")
     .Replace("\t", "&nbsp;&nbsp;")
     .Replace("<", "&lt;")
     .Replace(">", "&gt;");
}

private static void TestFunctional()
{
   string[] originalLines = new string[] { "<head>", "</head>" };
   Debug.WriteLine("Converting the lines");
   IEnumerable<string> convertedStrings = RenderLinesFunctional(originalLines);
   Debug.WriteLine("Converted the lines?");

   foreach (string s in convertedStrings)
   {
      Debug.WriteLine(s);
   }
}

This time the output looks different:

[Debug output: "Converted the lines?" is written before any "converting ..." lines]

At the time the “Converted the lines?” statement gets run, the lines have not yet been converted. This is called ‘Lazy Evaluation’ [2], and it’s a powerful weapon in the functional armamentarium. For the simple array I’m showing here the technique looks like overkill, but imagine that you were using a paged control on a big TFS installation like Readify’s TFSNow. You might have countless code reviews going on. If you rendered every line of code in all the files being viewed, you would waste both processor and bandwidth needlessly.

So what did I do to change the way this program worked so fundamentally? The main thing was to opt for the IEnumerable<string> interface, which gave me the scope to provide an alternative implementation of the collection. In the procedural example, the return type was a string array, so I was bound to create and populate the array before returning from the function. That’s a point worth highlighting: use iterators as return types where possible – they allow you to mix paradigms. Converting the return type to IEnumerable<string> is not enough on its own, though: I could change the signature of RenderLinesProcedural to return an iterator, but its body would still use eager evaluation.

The next thing I did was use the Map function to return a functional iterator rather than a concrete object graph, as was done in the procedural example. I created Map here to demonstrate that there was no funny LINQ business going on in the background; in most cases I would use the Enumerable.Select() extension method from LINQ to do the same thing. Map is a function that is common in functional programming: it allows the lazy transformation of a stream or collection into something more useful. Map is the crux of the transformation – it allows you to insert a function into the simple process of iterating a collection.
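For comparison, here’s the same rendering written against LINQ’s built-in Select – just a sketch, assuming the EscapeString method above and a using System.Linq directive; the method name is mine:

public static IEnumerable<string> RenderLinesWithSelect(IEnumerable<string> lines)
{
    // Select, like Map, defers calling EscapeString until the sequence is enumerated
    return lines.Select(EscapeString);
}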

Map is a Higher-Order Function: it accepts a function as a parameter and applies it to a collection on demand. Eventually you will need to deal with raw data – such as when you bind it to a GridView. Till that point you can hold off on committing resources that may not get used. Map is not the only HOF that we can use in this scenario. We’re repeatedly calling String.Replace in our functions. Perhaps we can generalise the idea of repeatedly calling a function with different parameters.

static Func<T, T> On<T>(this Func<T, T> f, Func<T, T> g)
{
    return t => g(f(t));
}

This method encapsulates the idea of composing functions. I’m creating a function that returns the result of applying the inner function to an input value of type T, and then applying the outer function to that result. In normal mathematical notation this would be represented by the notation “g o f”, meaning apply f first, then g. Composition is a key way of building up more complex functions. It’s the linked list of the functional world – well, it would be if the functional world were denied normal data structures, which it isn’t. :P

Notice that I’m using an extension method here, to make it nicer to deal with functions in your code. The next example is just a test method to introduce the new technique.

private static void TestComposition()
{
    var seq1 = new int[] { 1, 3, 5, 7, 11, 13, 19 };
    var g = ((Func<int, int>)(a => a + 2)).On(b => b * b).On(c => c + 1);
    foreach (var i in seq1.Map(g))
    {
      Debug.WriteLine(i);
    }
}

TestComposition uses the ‘On’ extension to compose functions into more complex functions. The actual function is not really that important, the point is that I packaged up a group of functions to be applied in order to an input value and then stored that function for later use. You might think that that’s no big deal, since the function could be achieved by even the most trivial procedure. But this is dynamically composing functions – think about what you could do with dynamically composable functions that don’t require complex control logic to make them work properly. Our next example shows how this can be applied to escaping strings for display on a web page.

void TestComposition2()
{
   var strTest = @"<html><body>hello world</body></html>";
   string[][] replacements = new[]
     {
       new[]{"&", "&amp;"},
       new[]{"  ", "&nbsp;&nbsp;"},
       new[]{"\t", "&nbsp;&nbsp;"},
       new[]{"<", "&lt;"},
       new[]{">", "&gt;"}
     };

  Func<string, string> f = x => x;
  foreach (string[] strings in replacements)
  {
     var s0 = strings[0];
     var s1 = strings[1];
     f = f.On(s => s.Replace(s0, s1));
  }

  Debug.WriteLine(f(strTest));
}

This procedure is again doing something quite significant – it’s taking a data structure and using that to guide the construction of a function that performs some data-driven processing on other data. Imagine that you took this from config data or a database somewhere. The function that gets composed is a fast, directly executable, encapsulated, interface free, type safe, dynamically generated unit of functionality. It has many of the benefits of the Gang Of Four Strategy Pattern[3].
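To make that concrete, here’s a sketch of the same composition driven by configuration. The "from=>to" line format and the BuildReplacer name are my own inventions for illustration, and it assumes the On extension method defined earlier:

static Func<string, string> BuildReplacer(IEnumerable<string> configLines)
{
    // start from the identity function and compose one Replace per config entry
    Func<string, string> f = x => x;
    foreach (var line in configLines)
    {
        var parts = line.Split(new[] { "=>" }, StringSplitOptions.None);
        var oldValue = parts[0];
        var newValue = parts[1];
        f = f.On(s => s.Replace(oldValue, newValue));
    }
    return f;
}

// e.g. BuildReplacer(new[] { "<=>&lt;", ">=>&gt;" })("<b>") yields "&lt;b&gt;"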

The techniques I’ve shown in this post demonstrate some of the power of the functional paradigm. I described how you can combine higher order functions with iterators to give a form of lazy evaluation. I also showed how you can compose functions to build up fast customised functions that can be data-driven. I’ve also shown a simple implementation of the common Map method that allows a function to be applied to each of the elements of a collection. Lastly I provided a generic implementation of a function composition mechanism that allows you to build up complex functions within a domain.

Next time I’ll introduce the concept of closure, which we’ve seen here at work in the ‘On’ composition function.

Some references:

1. Wikipedia: Pure Functions

2. Wikipedia: Lazy Evaluation

3. Wikipedia: Strategy Pattern

The Future of Search

The future of search lies in finding ways to bypass search engines altogether. Where information exists locally to make sense of what you’re after, it will be used to create better searches. Data mining will be used to form a picture of which results go together, and what meanings a user attaches to a word. Unambiguous entry points to the web, like spatial search, will provide quicker routes to data for compatible hand-held devices.

When the mass of data finally overwhelms the conventional search engines, something will have to give. If we want our data out there being used, we will have to transition our content over to semantic web technologies. These technologies provide solutions to developers seeking smarter software, and provide search engines with a way to give better results. New user interface idioms will have to be invented, but in the end the web will be a better place to live.

Crisis? What Crisis?

Search engine technology faces a pressing challenge: to quickly yield relevant results from the avalanche of data we are producing. It’s not enough to find the data eventually – search engines must find the right data immediately. They must deduce what ‘right’ means for a query, and that requires a level of understanding they don’t currently have. Every three years the data we produce doubles – in the next three years we will have three times as much data as we created during the whole of history up to 2007. By 2010 search engines will have to work three times harder than today, or be three times less discriminating. Surely, incremental fixes to search algorithms can’t keep up?

Perhaps ‘avalanche’ is not the right term – ‘lahar’ is more appropriate. A lahar is a colossal down-pouring of mud and water. Along with the water comes a variety of unwelcome dross. How do you distinguish worthwhile content from dross? Sometimes it is not easy. The continued success of Nigerian con-artists seems to show that many people naively believe everything they read. Deducing authority, accuracy and relevance is hard to automate. Search engines assume that if many others link to a page then it must be valuable and so rank it higher. That approach cannot hold up under the relentless lahar of dubious data.

When working on the web you have to filter data manually. As the web grows, your role in the search filtration process will grow all-consuming, unless a revolutionary change is brought about. Search engines that rely on reputation face the dilution of the very data they work with. Web users can’t keep up with (or even find) the flow of worthy data, so the worthy data isn’t linked to, and the ranking algorithms the search engines use become useless as a judge of truth. Tim Berners-Lee has identified ‘trust’ assessment as a major component of his vision for Web 3.0. It’s hard to see how current search engines can contribute to his vision.

The difficulty with performing searches based on natural language search terms is that natural language allows many meanings for words, homonyms, synonyms and a wealth of nuance for each individual word in a search. There are twelve synonyms of the word ‘search’. Can they all be used interchangeably? The answer is no, they all mean something slightly different, and so probably should return a different result set. The space of all possible meanings grows enormously as the complexity of the search increases. The average relevance of each search result goes down relative to the complexity of a query. Automatically clarifying a search query is a challenge that will not be solved by 2010, but just stating the problem points to some interesting solutions based on present day technologies.

Why is search important?

Consider the trend – we are conducting more and more of our professional and private lives online. We have the communications protocols to allow any connected individual to converse with any other. We have the means to share our thoughts with everyone else in the world who cares to know them. As a computer programmer, I spend a significant portion of my working life searching for information relevant to the problem I am trying to solve. Things that prevent me from achieving my goals are therefore a direct cost to my clients and to me. They reduce my productivity and increase the cost to my clients of getting new software developed. It is hard to overestimate the cost to an economy of search engine inefficiencies.

Inefficient search technologies engender cognitive dissonance – just at the point that we need information we are forced to context switch into a battle of wits with a search engine. It prevents us from entering the flow. The knock-on effect of stopping us from being able to smoothly transition between the data we have and the data we need is another economic cost. It reduces our job satisfaction, forces us to perform below peak and frustrates us. Our morale is weakened by the poor search technologies we have to work with!

Semantic Search

So much for the problems. What of the solutions? How can search be improved to lessen or eliminate the costs to our productivity and fun? Well, the solution exists in three parts – preparing a query, performing it and presenting the results. In addition we need to solve the problem of how to prepare (or reveal) data so that search engines can query it. The issue of data and query formats falls broadly within the category of “semantic web” technologies. Next-generation data formats are undoubtedly a necessary component of future search, but they are not sufficient – semantic web query languages are beyond the reach of everyday users.

I have endeavored to bring semantic search into the domain of the commercial programmer with my LinqToRdf project. What I have done, though, is to replace one formal language with another. For semantic search to become a commercial reality on the desktop, we must devise user interfaces that hide or simplify the searching experience. Google already provides ways to build complex queries that filter my results better, yet for most searches (of which I must perform dozens every day) I stick to keyword searches. Most users will never use the advanced features of an engine like Google. To reach them, we need a way to produce semantically meaningful queries to match the semantically meaningful data that will be out there on the web. How can we do that? I think the answer lies in context.

Assuming that the search engine doesn’t understand a word that you say when you type in a sequence of search keywords, it makes sense that the relevance of the results will decline as the number of possible misinterpretations increases. The English language, or any other natural language for that matter, is full of so many nuances and alternate meanings for each significant word that the space of possible interpretations for each word increases dramatically. As a consequence, the chances of a search engine being able to deliver relevant results are bound to decline at a similar rate. In fact, the problem is worse than that. The rate of decline of relevance is going to be the product of the probabilities of getting the sense of each word right for each word in the query. If there are five ‘senses’ to each of two words in a search string, then the chances of you getting the right result are going to be one in twenty five. A long shot. There are various heuristics that can be used to shorten the odds. But in the end, you’re never going to consistently get decent results out of a machine that doesn’t understand you or the documents that it has indexed. Enter the semantic web.

The semantic web is all about providing a means to be completely unambiguous about what you are talking about. The markup languages devised by the W3C and others are aimed at allowing content writers to provide clear and unambiguous meanings to terms. Specifically, they allow one to distinguish between the uses of a word by attaching a URL to each sense; because that URL is unique, it gives a unique sense to the search term. Obviously, if terms are unique then queries against them can also be unique by referencing the term of interest. The adoption rate for semantic web technologies has not so far been meteoric, and I think that is as much to do with tool support as with any innate pessimism about reaching a consensus on how to model your data.

I think that HTML would have taken much longer to reach saturation point if there had been no WYSIWYG editors available. What is needed to make data on the web (and the queries about them) more meaningful is a revolutionary new interface to the semantic web schemas that are available for querying.

Perhaps what is required is a predictive-lookup extension to the good old textbox: an IntelliSense-style lookup that, as you type a term, fetches the available senses of that term from the search engine. Examples of this are on display in the text entry boxes of Freebase. These senses can be derived from ontologies known to the search engine. If the context of what you’re working on allows it, the right ontology type can be preselected as you perform a search. Just as context isolates the specific meaning of a word in a sentence or a query, so too does it provide a way for future search tools to generate intelligible queries.

The price of being able to make meaningful queries and get meaningful answers is that someone somewhere has to compose their data in a meaningful way. Meaning is a tricky word – here I mean complex structured data designed to adequately describe some domain. In other words someone has to write and populate an ontology for each domain that the users want to ask questions about. It’s painstaking, specialized work that not just anyone can do. Not even a computer scientist – whilst they may have the required analysis and design skills, they don’t have the domain knowledge or the data. Hence the pace of forward progress has been slow as those with the knowledge are unaware of the value of an ontology or the methods to produce and exploit it.

Compare this to the modus operandi of the big search companies. Without fail they all use some variant on full-text indexing. It should be fairly clear why as well – they require no understanding of the domain of a document, nor do their users get any guarantees of relevance in the result sets. Users aren’t even particularly surprised when they get spurious results. It just goes with the territory.

Companies that hope or expect to maintain a monopoly in the search space have to use a mechanism that provides broad coverage across any domain, even if that breadth is at the expense of accuracy or meaningfulness. Clearly, the semantic web and monolithic search engines are incompatible. Not surprising then that for the likes of Microsoft and Google the semantic web is not on their radar. They can’t do it. They haven’t got the skills, time, money or incentive to do it.

If the semantic web is to get much of a toehold in the world of search engines it is going to have to be as a confederation of small search engines produced by specialized groups that are formed and run by domain experts. In a few short years Wikipedia has come to rival the likes of Encyclopedia Britannica. The value of its crowd-sourced content is obvious. This amazing resource came about through the distributed efforts of thousands across the web, with no thought of profit. Likewise, it will be a democratized, decentralized, grass-roots movement that will yield up the meaningful information we all need to get a better web experience.

One promising direction in structured data is being taken by Metaweb (founded by Danny Hillis, of Thinking Machines fame) whose Freebase collaborative database is just becoming available to the general public. It uses types in a way similar to OWL, and has a very simple and intuitive search interface. It provides a straightforward means for creating new types and adding data based on them – critical for its data to flourish. My own experience in creating types to describe Algorithms and Data Structures was encouraging. I seeded the database with a few entries for the most common examples and was able to link those entries to textual descriptions found in Wikipedia easily. Within a few weeks the entries had been fleshed out and many more obscure algorithms had been added by others. If this group of Freebase types continues to gain attention, it may become a definitive list of algorithms on the web acting as a main entry point for those wishing to research potential solutions to their programming problems.

Perhaps islands of high-quality structured information may become the favored web entry points. Users may trace links out of an entry in Freebase to areas where HTML anchors are all that they need to follow. Perhaps structured data can be used to disambiguate queries on conventional search engines – there may be many homonymous entries in Freebase, and reference to the specific Freebase type may help the search engine to filter its own results more effectively.

This crowd-sourced approach to creating structured data on the web is obviously going to help in the creation of isolated areas of excellence like Freebase and Wikipedia. Can it help to displace the mass of unstructured data on the web? I don’t think so. Our exponential outpour of data will mostly be in the form of video and audio. Without sophisticated image and voice recognition systems, search engines of any type will not be able to index them. Metadata in RDF or OWL format is being used to describe the binary data out there on the web, but that is equivalent to only indexing the titles of documents while ignoring their content.

Microsoft and Google both seem ambivalent towards the Semantic Web. Unofficially, they claim that they don’t believe in the semantic web, but in recent months Microsoft has released media management products that are based on the W3C’s RDF and OWL standards. They claim that semantic web technologies overcome problems they could not solve using conventional techniques.

Embedded Search Tools

At the moment search engines are stand-alone. They are cut off from the tools we are using them to gather data for. In my case, I spend most of my day inside a programmer’s IDE (integrated development environment) – like a glorified text editor for source code. When I’m in there I’m constantly faced with little puzzles to do with the program I’m writing. When I need a piece of reference information I have to come out of the IDE, call up a browser and perform the search there before switching back to what I was doing. That context switching is distracting, and I’d prefer the whole experience to be smoother. In addition, there is a whole bunch of contextual information that exists inside the application that is (potentially) germane to the search I’m doing.

Embedding the internet search facilities inside of general purpose applications such as my IDE provides a wealth of information to the search engine with which to automatically filter the results or to target the search. This extra context enhances the accuracy of results but in the final accounting it is just extending the lifespan of the free text search paradigm, without properly bridging the gap between meaningful and meaningless queries and data. To go beyond the limiting technologies in use at the moment we will have to search for new representations of data and use those as the context in which we ask our questions.

For example, the current documentation that we write for our source code is derived from HTML and contains links to other classes and systems that we use. If the documentation that came with code was derived from an ontology, and it linked to other ontologies describing the systems our code uses and interacts with, then the documentation could be mined as a source of context information with which to produce a semantic search query.

Imagine that you are an engineer and are producing a quote for one of your clients. Imagine that you needed to put the price for a gross of sprockets into the invoice. To do that you need the price from one of your supplier’s online catalogues. The fact that your suppliers are referenced in previous invoices should allow an embedded search engine to query the supplier’s catalogue (an ontology) using semantic web technologies. The context is derived from a mix of internal knowledge of what task you are performing, and what pattern of relationships you have with suppliers that helps you to fulfil orders.

Embedded search engines will rely on the pervasiveness of structured data both inside and outside of the enterprise. Without it, they will have just as hard a time making sense of a query as any general-purpose search engine.

Statistical Models and Search

A viable short term solution to the problem of injecting meaning into a search query and matching it with the meaning in a document is to exploit data mining. Data mining can be used to help make predictions of the meaning that is likely to be attached to a term by a user. In just the same way that Amazon looks at what books a user has, and what books they click on, and how they purchase books, so too can a modern search engine gather data about what you search for, and what results you clicked on from what was offered. In essence, there is very little difference between the task performed by the Amazon web site and that performed by an Internet search engine.

A store like Amazon is a vast collection of data items that can be searched against. Each user that searches has a particular ‘best result’ in mind, and when they find it they select it (by buying it, viewing it or adding it to a wish list or shopping cart). A search engine is used in the same way – in this case the results are given away, but the dynamics are the same: users determine from the metadata provided whether a given result is a close enough match for what they were after. If it is, they will click on it. Each time a user clicks on a link, they are providing information back to the search engine that can be used to tune that user’s searches in future. Search engines can build models to prioritize the results that are offered the next time that user (or one like them) searches.

Needless to say, the data generated in this kind of system would be truly gigantic. As the internet grows, this metadata will grow at an exponential rate. The only way to avoid this kind of exponential curve is to increase the information density of the data on the net, or to increase the hit rate and the efficiency of the process of finding results. Again, the practicalities of dealing with data sets as big as the web point to the need for some fundamental revolution in data storage and search.

Geospatial Search

When you perform a search on the internet you are diving headfirst into a wide sea of data. The data you are after is like an island in that sea, but you might dive in a long way from where it is. The place where you dive in is critical to your success, and geospatial search may provide the solution by providing a natural and unambiguous point of entry into the data space. Much of the data that we place on the web has a natural home location, and that can be used to bring relevant data to the fore.

Rolf Landauer once said ‘information is physical’. Information cannot be had about anything other than physical things, and it cannot be represented on anything other than physical objects. Information itself is the structure of the universe. Although we seldom think of it that way – search technology is really an extension of the senses. It allows us to peer into other places and times. All information is tied to specific geospatial and temporal locations. Mostly, we are not in the habit of taking much notice of these ambient contexts that are often the only concrete links between facts in the world.

Some years ago, I was sat daydreaming in a coffee shop near where I lived. While looking out of the window at some old shops across the road I wondered what would happen if I was able to call up the information related to those shops just because I was sat near them. Each shop was inhabited by a business, each business was but one of a long line of inhabitants, and each inhabitant would have a huge network of public and private relationships that in some sense defines them as an entity.

I imagined that I had a GPS-enabled device that made use of a protocol similar to DNS, the domain name service. DNS divides web sites into a partly spatial and partly logical or commercial hierarchy. The process of searching for the web site of one of those businesses would involve sending a query up the line from the domain I was on, until the query got to a DNS server that knew how to route it to the DNS servers of the host of the website I was after. A spatial query engine would add another level of searching over the top of DNS, mapping spatio-temporal coordinates onto the conventional DNS naming system.

I wondered whether a means of search based on geographical location might not be more sensible. Imagine that the search results were prioritized according to physical and temporal distance from where and when I was. This would be an alternative name resolution strategy that would give me the locations on record for the shops across the road. From there I could navigate the local patterns of the publicly accessible relations for the entities that I had found.

If I requested the spatial search, the first result on the list should be the coffee shop I was in (plus any other companies or individuals on other floors of the building). Following that would come any entities nearby, in order of distance. Perhaps I could point my handheld device at the shop (say it’s an art gallery) and a digital compass might deduce the direction I was pointing in and filter the results to just that sector, up to a certain distance. The gallery is a business, and as such it is (in the UK at least) obliged by law to file its accounts with the government organization called Companies House. These links should be accessible, since the gallery would (by virtue of renting space in the real world) have a claim on the virtual space of the search engine. All that is required to get the company accounts would be for them to be online and indexed to a specific company that has a registry entry in the spatial search engine.

I then imagined that getting the data for the gallery opened up a path to a whole bunch of other information that flowed out into other sites. Perhaps images were available for all the works of art on sale. Perhaps from there I could navigate to the sites of the artists that I liked. Perhaps I liked one of the pictures the gallery had on display. Perhaps I finished my coffee, and crossed the road and took a look? This kind of entry point might make a difference to the number of passers-by that went into the gallery. The spatial search entry point to the web gives way to conventional web browsing, without any of the intermediate head scratching.

This spatial search can of course tie in both with the new generation of GPS enabled multi-purpose mobile devices that are now coming onto the market. It could also dovetail nicely with existing spatial search engines such as Google Maps or Live Earth. Spatial search creates a direct and meaningful structure in the virtual space that easily correlates to our real space. Spatial addresses may well provide a more powerful entry-point into the mass of data that we are trying to negotiate than a mere keyword search. The problem with text indexing and keyword searches is that they carry too many possible meanings for what you were after, and the relevance of the results decreases in proportion to the number of possible interpretations. GPS at least is precise and unambiguous, and soon to be available in every phone…

Object Modeling is Vocabulary Design

Andrew Cantos raised some interesting philosophical points in reply to my partially tongue-in-cheek post The Great Domain Model Debate – Solved the other day. As ever, my short reply turned into a blog post, and this is it. Andrew’s point was that there is a metaphorical link between objects in a domain model and elementary particles in some physical system. The ability of these elements to take part in the wider system is often a function of their sheer simplicity rather than their being loaded with complex properties. He used oxygen as an example of something that can take part in many reactions, but which does not define or characterize those reactions. I extended the metaphor to observe that the same holds true when comparing Anemic Domain Models with their Rich brethren.

I like his metaphor. The metaphor I tend to use when I think about this issue is related to human languages. Words are poor carriers of meaning on their own, in the same sense that rich objects are poor carriers of business functionality. A word’s specific value comes within the dynamic context of a sentence; that is, its exact meaning and value can only be resolved when it is composed together with others in a richer context.

Likewise, the same happens in an OO system – the analogue of the ‘sentence’ here is the thread of execution, or the transaction script, or what have you. They give meaning to the data carried by the anemic object. Without that context the object is worthless. What an RDM seeks to do is carry with the object the full set of possible contexts. It also seeks to restrict that set of contexts to a manageable size.

I can sympathize with that aim – ADMs do little to guarantee that they get used right; RDMs do. However, I think that, as with a science, an enterprise system needs to have a commonly agreed shared vocabulary. With that, a greater richness of communication becomes possible. If, however, you were restricted in the ways you could use these words, you might have greater precision, but communication would become a chore, and you probably wouldn’t bother.

You can extend this whole ‘enterprise vocabulary’ metaphor even further. If you look at a typical, poorly coded or governed system, you will often see a situation where there is a multitude of little DTOs, all of which contain roughly the same data, but just those fields that are needed to service a given page or screen. This is analogous to the situation in an immature science where there are many words freighted with slightly different meanings. Confusion results when a speaker intends something different from what the listener understands. So too in software, where the lack of a commonly agreed object model serves to add confusion to the development process, and to increase the likelihood of errors creeping into a system.

What does this imply? It seems to me that the right approach (assuming the metaphor holds true) is that there ought to be a well-defined, definitive and widely shared object model within an organization. All systems should use it, and the organization should mint new classes with great care and forethought. This of course ties in with the efforts of various groups in the Semantic Web area, who are attempting to do just that in domains as widely flung as life sciences and disaster relief. The fact that the efforts are inter-organizational means that there will be less tolerance for poorly advised ‘pragmatism’.

Which can only be a good thing in the long run. Right?

LinqToRdf – Designing a Query Provider

When I started implementing the SPARQL support in LINQ to RDF, I decided that I needed to implement as many of the standard query operators as possible. SPARQL is a very rich query language that bears a passing syntactic resemblance to SQL, so it didn’t seem unreasonable to expect most of the operators of LINQ to SQL to be usable with SPARQL. With this post I thought I’d pass on a few design notes that I have gathered during the work to date on the SPARQL query provider.

The different components of a LINQ query get converted into successive calls to your query class. My class is called SparqlQuery<T>. If you had a query like this:

[TestMethod]
public void SparqlQueryOrdered()
{
    string urlToRemoteSparqlEndpoint = @"http://someUri";
    TripleStore ts = new TripleStore();
    ts.EndpointUri = urlToRemoteSparqlEndpoint;
    ts.QueryType = QueryType.RemoteSparqlStore;
    IRdfQuery<Track> qry = new RDF(ts).ForType<Track>(); 
    var q = from t in qry
        where t.Year == 2006 &&
        t.GenreName == "History 5 | Fall 2006 | UC Berkeley" 
        orderby t.FileLocation
        select new {t.Title, t.FileLocation};
    foreach(var track in q){
        Trace.WriteLine(track.Title + ": " + track.FileLocation);
    }        
}

This would roughly equate to the following code, using the extension method syntax:

[TestMethod]
public void SparqlQueryOrdered()
{
    ParameterExpression t;
    string urlToRemoteSparqlEndpoint = @"http://someUri";
    TripleStore ts = new TripleStore();
    ts.EndpointUri = urlToRemoteSparqlEndpoint;
    ts.QueryType = QueryType.RemoteSparqlStore;
    var q = new RDF(ts).ForType<Track>()
        .Where<Track>(/*create expression tree*/)
        .OrderBy<Track, string>(/*create expression tree*/)
        .Select(/*create expression tree*/);
    foreach (var track in q)
    {
        Trace.WriteLine(track.Title + ": " + track.FileLocation);
    }
}

The Where, OrderBy and Select calls shown above are the succession of calls to the query’s CreateQuery method. That might not be immediately obvious from looking at the code – in fact it’s downright unobvious! There’s compiler magic going on here that you don’t see. Anyway, what happens is that you end up getting a succession of calls into your IQueryable<T>.CreateQuery() method, and that’s what we are mostly concerned with when creating a query provider.

The last time I blogged about the CreateQuery method, I gave a method with a switch statement that identified the origin of the call (i.e. Where, OrderBy, Select or whatever) and dispatched the call off to be immediately processed. I now realise that that is not the best way to do it. If I try to create my SPARQL query in one pass, I will not have much of a chance to perform optimization or disambiguation. If I generate the projection before I know what fields are important, I would probably end up having to get everything back and filter on receipt of all the data. I think Bart De Smet was faced with that problem in LINQ to LDAP (LDAP doesn’t support projections), so he had to get everything back. SPARQL does support projections, and that means I can’t generate the SPARQL query string until after I know what to get back from the Select call.

My solution to this is to keep all the calls into CreateQuery in a dictionary so that I can use them all together in the call to GetEnumerator. That gives me the chance to do any amount of analysis on the expression trees I’ve got stored before I convert them into a SPARQL query. The CreateQuery method now looks like this:

protected Dictionary<string, MethodCallExpression> expressions;
public IQueryable<S> CreateQuery<S>(Expression expression)
{
    SparqlQuery<S> newQuery = CloneQueryForNewType<S>();

    MethodCallExpression call = expression as MethodCallExpression;
    if (call != null)
    {
        Expressions[call.Method.Name] = call;
    }
    return newQuery;
}

This approach helps because it makes it much simpler to start adding the other query operators.

I’ve also been doing a fair bit of tidying up as I go along. My GetEnumerator method now reflects the major grammatical components of the SPARQL spec for SELECT queries.

private void CreateQuery(StringBuilder sb)
{
    if(Expressions.ContainsKey("Where"))
    {
        // first parse the where expression to get the list of parameters to/from the query.
        StringBuilder sbTmp = new StringBuilder();
        ParseQuery(Expressions["Where"].Parameters[1], sbTmp);
        //sbTmp now contains the FILTER clause so save it somewhere useful.
        Query = sbTmp.ToString();
        // now store the parameters where they can be used later on.
        queryGraphParameters.AddAll(Parser.Parameters);
    }
    CreateProlog(sb);
    CreateDataSetClause(sb);
    CreateProjection(sb);
    CreateWhereClause(sb);
    CreateSolutionModifier(sb);
}

The if block checks whether the query has a Where expression. If it does, it parses it, creating the FILTER expression, and in the process gathering some valuable information about which members of T were referenced in the where clause. This information is useful for other tasks, so it gets done in advance of creating the WHERE clause.
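To give a flavour of how that stored information might get used further down the pipeline, here’s a guess at a projection builder. The SelectedMemberNames helper is hypothetical, and I’m assuming queryGraphParameters holds the member names referenced by the query – this is not the library’s actual code:

private void CreateProjection(StringBuilder sb)
{
    sb.Append("SELECT ");
    if (Expressions.ContainsKey("Select"))
    {
        // project only the members the anonymous type asked for
        foreach (string name in SelectedMemberNames())   // hypothetical helper
            sb.Append("?" + name + " ");
    }
    else
    {
        // no explicit projection: ask for everything referenced in the query
        foreach (string name in queryGraphParameters)
            sb.Append("?" + name + " ");
    }
    sb.AppendLine();
}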

Creating A LINQ Query Provider

As promised last time, I have extended the query mechanism of my little application with a LINQ Query Provider. I based my initial design on the method published by Bart De Smet, but have extended that framework, cleaned it up and tied it in with the original object deserialiser for SemWeb (a semantic web library by Joshua Tauberer).

In this post I’ll give you some edited highlights of what was involved. You may recall that last post I provided some unit tests that I was working with. For the sake of initial simplicity (and to make it easy to produce queries with SemWeb’s GraphMatch algorithm) I restricted my query language to make use of conjunction and equality. Here’s the unit test that I worked with to drive the development process. What I produced last time was a simple scanner that went through my podcasts extracting metadata and creating objects of type Track.

[TestMethod]
public void QueryWithProjection()
{
    CreateMemoryStore();
    IRdfQuery<Track> qry = new RdfContext(store).ForType<Track>();
    var q = from t in qry
            where t.Year == 2006 &&
                  t.GenreName == "History 5 | Fall 2006 | UC Berkeley"
            select new {t.Title, t.FileLocation};
    foreach(var track in q){
        Trace.WriteLine(track.Title + ": " + track.FileLocation);
    }
}

This method queries the Tracks collection in an in-memory triple store loaded from a file in N3 format. It searches for any UC Berkeley podcasts produced in 2006, and performs a projection to create a new anonymous type containing the title and location of each file.

I took a leaf from the book of LINQ to SQL to create the query object. In LINQ to SQL you indicate the type you are working with using a Table<T> class. In my query context class, you identify the type you are working with using a ForType<T>() method. This method instantiates a query object for you, and (in future) will act as an object registry to keep track of object updates.

The RdfContext class is very simple:

public class RdfContext : IRdfContext
{
    public Store Store
    {
        get { return store; }
        set { store = value; }
    }
    protected Store store;

    public RdfContext(Store store)
    {
        this.store = store;
    }

    public void AcceptChanges()
    {
        throw new NotImplementedException();
    }

    public IRdfQuery<T> ForType<T>()
    {
        return new RdfN3Query<T>(store);
    }
}

As you can see, it is pretty bare at the moment. It maintains a reference to the store, and instantiates query objects for you. But in future this would be the place to create transactional support, and perhaps maintain connections to triple stores. By and large, though, this class will be pretty simple in comparison to the query class that is to follow.

I won’t repeat all of what Bart De Smet said in his excellent series of articles on the production of LINQ to LDAP. I’ll confine myself to this implementation, and how it works. So we have to start by creating our Query Object:

public class RdfN3Query<T> : IRdfQuery<T>
{
    public RdfN3Query(Store store)
    {
        this.store = store;
        this.originalType = typeof (T);
        parser = new ExpressionNodeParser<T>();
    }

First it stores a reference to the triple store for later use. In a more real-world implementation this might be a URL or connection string. But for the sake of this implementation, we can be happy with the memory store that is used in the unit test. Next we keep a record of the original type that is being queried against. This is important because later on you may also be dealing with a new anonymous type created by the projection, which will not have any of the Owl*Attribute classes with which to work out URLs for properties and to perform deserialisation.
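For anyone who hasn’t seen the earlier posts, here’s roughly what a mapped class might look like. The attribute names and URIs below are hypothetical stand-ins for the Owl*Attribute classes mentioned above, defined inline purely so the sketch is self-contained – they are not the real LinqToRdf types:

// Hypothetical attributes, invented for this sketch only.
[AttributeUsage(AttributeTargets.Class)]
class OwlClassAttribute : Attribute
{
    public OwlClassAttribute(string uri) { Uri = uri; }
    public string Uri { get; private set; }
}

[AttributeUsage(AttributeTargets.Property)]
class OwlPropertyAttribute : Attribute
{
    public OwlPropertyAttribute(string uri) { Uri = uri; }
    public string Uri { get; private set; }
}

// A Track class decorated so that a property name can be mapped to a
// predicate URI during query generation and deserialisation (URIs invented).
[OwlClass("http://example.org/music#Track")]
class Track
{
    [OwlProperty("http://example.org/music#title")]
    public string Title { get; set; }

    [OwlProperty("http://example.org/music#year")]
    public int Year { get; set; }

    [OwlProperty("http://example.org/music#genreName")]
    public string GenreName { get; set; }

    [OwlProperty("http://example.org/music#fileLocation")]
    public string FileLocation { get; set; }
}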

The two most important methods in IQueryable<T> are CreateQuery and GetEnumerator. CreateQuery is the place where LINQ feeds you the expression tree that it has built from your initial query. You must parse this expression tree and store the resultant query somewhere for later use. I created a string called query to keep that in, and created a class called ExpressionNodeParser to walk the expression tree to build the query string. This is equivalent to the stage where the SQL SELECT query gets created in DLINQ. My CreateQuery looks like this:

public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
{
    RdfN3Query<TElement> newQuery = new RdfN3Query<TElement>(store);
    newQuery.OriginalType = originalType;
    newQuery.Project = project;
    newQuery.Properties = properties;
    newQuery.Query = Query;
    newQuery.Logger = logger;
    newQuery.Parser = new ExpressionNodeParser<TElement>(new
        StringBuilder(parser.StringBuilder.ToString()));
    MethodCallExpression call = expression as MethodCallExpression;
    if (call != null)
    {
        switch (call.Method.Name)
        {
            case "Where":
                Log("Processing the where expression");
                newQuery.BuildQuery(call.Parameters[1]);
                break;
            case "Select":
                Log("Processing the select expression");
                newQuery.BuildProjection(call);
                break;
        }
    }
    return newQuery;
}

You create a new query because you may be doing a projection, in which case the type you are enumerating over will not be the original type that you put into ForType<T>(). Instead it may be the anonymous type from the projection. You transfer the vital information over to the new query object, and then handle the expression that has been passed in. I am handling two methods here: Where and Select. There are others I could handle, such as OrderBy or Take, but that will have to be for a future post.

Where is the part where the expression representing the query is passed in. Select is passed the tree representing the projection (if there is one). The work is passed off to BuildQuery and BuildProjection accordingly. These names were gratefully stolen from LINQ to LDAP.

BuildQuery in LINQ to LDAP is a fairly complicated affair, but in LINQ to RDF I have pared it right down to the bone.

private void BuildQuery(Expression q)
{
    StringBuilder sb = new StringBuilder();
    ParseQuery(q, sb);
    Query = Parser.StringBuilder.ToString();
    Trace.WriteLine(Query);
}

We create a StringBuilder that can be passed down into the recursive descent tree walker to gather the fragments of the query as each expression gets parsed. The result is then stored in the Query property of the query object. BuildProjection looks like this:

private void BuildProjection(Expression expression)
{
    LambdaExpression le = ((MethodCallExpression)expression).Parameters[1] as
        LambdaExpression;
    if (le == null)
        throw new ApplicationException("Incompatible expression type found when building a projection");
    project = le.Compile();
    MemberInitExpression mie = le.Body as MemberInitExpression;
    if (mie != null)
        foreach (Binding b in mie.Bindings)
            FindProperties(b);
    else
        foreach (PropertyInfo i in originalType.GetProperties())
            properties.Add(i.Name, null);
}

Much of it is taken directly from LINQ to LDAP. I have adapted it slightly because I am targeting the May 2007 CTP of LINQ. I’ve done this only because I have to use VS 2005 during the day, so I can’t use the March 2007 version of Orcas.

ParseQuery is used by BuildQuery to handle the walking of the expression tree. Again that is very simple since most of the work is now done in ExpressionNodeParser. It looks like this:

private void ParseQuery(Expression expression, StringBuilder sb)
{
    Parser.Dispatch(expression);
}

Parser.Dispatch is a gigantic switch statement that passes off the expression tree to handler methods:

public void Dispatch(Expression expression)
{
    switch (expression.NodeType)
    {
        case ExpressionType.Add:
            Add(expression);
            break;
        case ExpressionType.AddChecked:
            AddChecked(expression);
            break;
        case ExpressionType.And:
            And(expression);
            break;
        case ExpressionType.AndAlso:
            AndAlso(expression);
            //...

Each handler method then handles the root of the expression tree, breaking it up and passing on what it can’t handle itself. For example, the method AndAlso just takes the left and right side of the operator and recursively dispatches them:

public void AndAlso(Expression e)
{
    BinaryExpression be = e as BinaryExpression;
    if (be != null)
    {
        Dispatch(be.Left);
        Dispatch(be.Right);
    }
}

The equality operator is the only operator that currently gets any special effort.

public void EQ(Expression e)
{
    BinaryExpression be = e as BinaryExpression;
    if (be != null)
    {
        MemberExpression me = be.Left as MemberExpression;
        ConstantExpression ce = be.Right as ConstantExpression;
        QueryAppend(tripleFormatStringLiteral,
            InstancePlaceholderName,
            OwlClassSupertype.GetPropertyUri(typeof(T),
                me.Member.Name),
            ce.Value.ToString());
    }
    MethodCallExpression mce = e as MethodCallExpression;
    if (mce != null && mce.Method.Name == "op_Equality")
    {
        MemberExpression me = mce.Parameters[0] as MemberExpression;
        ConstantExpression ce = mce.Parameters[1] as ConstantExpression;
        QueryAppend(tripleFormatStringLiteral,
            InstancePlaceholderName,
            OwlClassSupertype.GetPropertyUri(typeof(T),
                me.Member.Name),
            ce.Value.ToString());
    }
}

The equality expression can be formed either through the use of a binary expression with NodeType.EQ or as a MethodCallExpression on op_Equality for type string. If the handler for the MethodCallExpression spots op_Equality it passes the expression off to the EQ method for it to render instead. EQ therefore needs to spot which type of Node it’s dealing with to know how to get the left and right sides of the operation. In a BinaryExpression there are Right and Left properties, whereas in a MethodCallExpression these will be found in a Parameters collection. In our example they get the same treatment.

You’ll note that we assume that the left operand is a MemberExpression and the right is a ConstantExpression. That allows us to form clauses like this:

where t.Year == 2006

but it would fail on all of the following:

where t.Name.ToUpper() == "SOME STRING"
where t.Name == t.Other
where t.Year.ToString() == "2006"

Each of these cases will have to be handled individually, so the number of cases we need to handle can grow. As Bart De Smet pointed out, some of the operations might have to be performed after retrieval of the results since semantic web query languages are unlikely to have complex string manipulation and arithmetic functions. Or at least, not yet.
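A sketch of that fallback – run a broader query, then apply the untranslatable part of the predicate in memory with a compiled lambda. The names here are illustrative (and assume a using System.Linq.Expressions directive), not part of the current implementation:

static IEnumerable<Track> FilterClientSide(
    IEnumerable<Track> broadResults,
    Expression<Func<Track, bool>> residualPredicate)
{
    // compile the part of the where clause that couldn't become a triple pattern
    Func<Track, bool> predicate = residualPredicate.Compile();
    foreach (Track track in broadResults)
        if (predicate(track))
            yield return track;
}

// e.g. FilterClientSide(results, t => t.Name.ToUpper() == "SOME STRING")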

The QueryAppend forms an N3 Triple out of its parameters and appends it to the StringBuilder that was passed to the Parser initially. At the end of the recursive tree walk, this string builder is harvested and preprocessed to make it ready to pass to the triple store. In my previous post I described an ObjectDeserialisationsSink that was passed to SemWeb during the query process to harvest the results. This has been reused to gather the results of the query from within our query.
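As a rough indication of the shape of those triples (this format string is an assumption of mine, not the library’s actual tripleFormatStringLiteral, and the URI is invented):

// Invented for illustration; the real constant may differ.
const string tripleFormatStringLiteral = "?{0} <{1}> \"{2}\" .";

// QueryAppend(tripleFormatStringLiteral, "track",
//             "http://example.org/music#year", "2006")
// would append the pattern:
//   ?track <http://example.org/music#year> "2006" .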

I mentioned earlier that the GetEnumerator method was important to IQueryable. An IQueryable is a class that can defer execution of its query till someone attempts to enumerate its results. Since that’s done using GetEnumerator the query must be performed in GetEnumerator. My implementation of GetEnumerator looks like this:

IEnumerator<T> IEnumerable<T>.GetEnumerator()
{
    if (result != null)
        return result.GetEnumerator();
    query = ConstructQuery();
    PrepareQueryAndConnection();
    PresentQuery(query);
    return result.GetEnumerator();
}

result is the List<TElement> variable where I cache the results for later use. What that means is that the query only gets run once. Next time the GetEnumerator gets called, result is returned directly. This reduces the cost of repeatedly enumerating the same query. Currently the methods ConstructQuery, PrepareQueryAndConnection, and PresentQuery are all fairly simple affairs that exist more as placeholders so that I can reuse much of this code for a LINQ to SPARQL implementation that is to follow.

As you’ve probably worked out, there is a huge amount of detail that has to be attended to, but the basic concepts are simple. The reason why more people haven’t written LINQ query providers before now is simply the fact that there is no documentation about how to do it. When you try, though, you may find it easier than you thought.

There is a great deal more to do to LINQ to RDF before it is ready for production use, but as a proof of concept that semantic web technologies can be brought into the mainstream it serves well. The reason we use ORM systems such as LINQ to SQL is to help us overcome the impedance mismatch that exists between the object and relational domains. An equally large mismatch exists between the object and semantic domains. Tools like LINQ to RDF will have to overcome that mismatch in order to be used outside of basic domain models.

Using RDF and C# to create an MP3 Manager – Part 3

Last time I hurriedly showed you how you can perform the next step of converting a triple store into an ORM system of sorts. The purpose of all this activity, and the reason I left off blogging about LINQ was that I am working on a system to allow me to use LINQ with a triple store and reasoner. The benefit of doing so is that we should have a lot of the benefits of the modern relational world with the power and connectivity of the new world of the semantic web. In this post I shall outline the steps required to start working with LINQ to RDF (as I’ve chosen to call it through lack of imagination).

I’ve been using test driven development throughout, so I already have a few ‘integration’ unit tests to show you:

[TestMethod]
public void Query()
{
  string urlToRemoteSparqlEndpoint = "http://localhost/MyMusicService/SparqlQuery.ashx";
  RdfContext<Track> ctx = new RdfSparqlContext<Track>(urlToRemoteSparqlEndpoint);
  var titles = from t in ctx
    where t.Year > 1998 &&
    t.GenreName == "Ambient Techno" ||
    t.GenreName == "Chillout"
    select t.Title;
  foreach(string title in titles)
    Console.WriteLine(title);
}

In English, this means that rather than manipulating a local triple store, I want the RdfSparqlContext to compose a SPARQL query and present it to the query endpoint found at location urlToRemoteSparqlEndpoint. I then want it to deserialise the results returned and store them in titles. This is a nice mixture of new features from .NET 3.5, combined with some of the features I’ve already developed for object deserialisation.

With the example above I am concerned more with the querying aspect of LINQ (it being a query language n’ all!) but that is not much use to me in the world of transactional web development where I find myself mired for the moment, so we need full CRUD behaviour from this system. Here’s a unit test for object update.

[TestMethod]
public void Update()
{
  string urlToRemoteSparqlEndpoint = @"http://localhost/MyMusicService/SparqlQuery.ashx";
  RdfContext<Track> ctx = new RdfSparqlContext<Track>(urlToRemoteSparqlEndpoint);
  var q = from t in ctx
    where t.Year > 1998 &&
    t.GenreName == "Ambient Techno" ||
    t.GenreName == "Chillout"
    select t;
  foreach (Track t in q)
    t.Rating = 5;
  ctx.AcceptChanges();
}

Here, I’m getting a bunch of objects back from the triple store, modifying their Rating property and then asking for those changes to be stored. This follows the usage patterns for LINQ to SQL.
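
AcceptChanges implies some change-tracking machinery that I haven’t shown here. As a purely hypothetical sketch of the idea (none of these names exist in the real code), the context could snapshot property values as objects are handed out, then diff them when AcceptChanges is called, so that only the modified values need to be rewritten in the triple store:

using System.Collections.Generic;
using System.Reflection;

// Purely hypothetical sketch of the change tracking that AcceptChanges implies:
// snapshot property values as objects are handed out, then diff them later so
// that only the modified values need to be rewritten in the triple store.
public class ChangeTracker<T>
{
    private readonly Dictionary<T, Dictionary<string, object>> originals =
        new Dictionary<T, Dictionary<string, object>>();

    // Called for each object as it is deserialised and handed to the caller.
    public void Track(T instance)
    {
        Dictionary<string, object> snapshot = new Dictionary<string, object>();
        foreach (PropertyInfo pi in typeof(T).GetProperties())
            snapshot[pi.Name] = pi.GetValue(instance, null);
        originals[instance] = snapshot;
    }

    // Called from AcceptChanges: the names returned are the properties whose
    // triples would need to be retracted and re-asserted.
    public IEnumerable<string> DirtyProperties(T instance)
    {
        foreach (PropertyInfo pi in typeof(T).GetProperties())
        {
            object was = originals[instance][pi.Name];
            object now = pi.GetValue(instance, null);
            if (!Equals(was, now))
                yield return pi.Name;
        }
    }
}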

Satisfying these unit tests (plus ones for the rest of the CRUD behaviour, with support for N3, in-memory and RDBMS-based local triple stores) is what I’m aiming to complete – eventually. Here’s the general scheme for implementing querying using LINQ. It assumes that RDF data structures are taken unmodified from the SemWeb library.

  • Create a Query Object (I guess this would be our RdfContext class above)
  • Implement IQueryable<T> on it.
  • When the enumerable is requested, convert the stored expression tree into the target language
  • Present it to whatever store or endpoint is available,
  • Deserialise the results into objects
  • Yield the results (probably as they are being deserialised)

This is a very broad outline of the tasks. I’ll explain in a lot more depth in subsequent posts, as I tackle each step.
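
To make the outline a little more concrete, here is a skeletal sketch of the plumbing those steps imply, written against the IQueryable<T>/IQueryProvider split found in current versions of .NET. Every name here (TripleStoreQuery, TripleStoreQueryProvider, the stubbed-out translation and execution steps) is a placeholder rather than the real LINQ to RDF API:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Skeletal sketch of a LINQ query object and its provider; all names are placeholders.
public class TripleStoreQuery<T> : IQueryable<T>
{
    private readonly TripleStoreQueryProvider provider;
    private readonly Expression expression;

    public TripleStoreQuery(TripleStoreQueryProvider provider)
    {
        this.provider = provider;
        this.expression = Expression.Constant(this);
    }

    public TripleStoreQuery(TripleStoreQueryProvider provider, Expression expression)
    {
        this.provider = provider;
        this.expression = expression;
    }

    public Type ElementType { get { return typeof(T); } }
    public Expression Expression { get { return expression; } }
    public IQueryProvider Provider { get { return provider; } }

    // Deferred execution: nothing happens until somebody enumerates.
    public IEnumerator<T> GetEnumerator()
    {
        return provider.Execute<IEnumerable<T>>(expression).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public class TripleStoreQueryProvider : IQueryProvider
{
    public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
    {
        return new TripleStoreQuery<TElement>(this, expression);
    }

    public IQueryable CreateQuery(Expression expression)
    {
        // The non-generic path is rarely hit by query syntax; omitted in this sketch.
        throw new NotImplementedException();
    }

    public TResult Execute<TResult>(Expression expression)
    {
        // 1. Convert the stored expression tree into the target language (N3/SPARQL).
        // 2. Present it to whatever store or endpoint is available.
        // 3. Deserialise the results into objects and yield them.
        throw new NotImplementedException("Translation and execution are left to the real provider.");
    }

    public object Execute(Expression expression)
    {
        return Execute<object>(expression);
    }
}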

Using RDF and C# to create an MP3 Manager – Part 2

I’ve been off the air for a week or two – I’ve been hard at work on the final stages of a project at work that will go live next week. I’ve been on this project for almost 6 months now, and next week I’ll get a well earned rest. What that means is I get to do some dedicated Professional Development (PD) time which I have opted to devote to Semantic Web technologies. That was a hard sell to the folks at Readify, what with Silverlight and .NET 3 there to be worked on. I think I persuaded them that consultancies without SW skills will be at a disadvantage in years to come.

Anyway, enough of that – onto the subject of the post, which is the next stage of my mini-series about using semantic web technologies in the production of a little MP3 file manager.

At the end of the last post we had a simple mechanism for serialising objects into a triple store, with a set of services for extracting relevant information out of an object and tying it to predicates defined in an ontology. In this post I will show you the other end of the process. We need to be able to query against the triple store and get a collection of objects back.

The query I’ll show you is very simple, since the main task for this post is object deserialisation. Once we can shuttle objects in and out of the triple store, we can focus on beefing up the query process.

Querying the triple store

For this example I just got a list of artists for the user and allowed them to select one. That artist was then fed into a graph match query in SemWeb, to bring back all of the tracks whose artist matches the one chosen.

The query works in the usual way – get a connection to the data store, create a query, present it and reap the result for conversion to objects:

private IList<Track> DoSearch()
{
  MemoryStore ms = Store.TripleStore;
  ObjectDeserialiserQuerySink<Track> sink = new ObjectDeserialiserQuerySink<Track>();
  string qry = CreateQueryForArtist(artists[0].Trim());
  Query query = new GraphMatch(new N3Reader(new StringReader(qry)));
  query.Run(ms, sink);
  return tracksFound = sink.DeserialisedObjects;
}

We’ll get on to the ObjectDeserialiserQuerySink in a short while. The process of creating the query is really easy, given the simple reflection facilities I created last time. I’m using the N3 format for the queries for the sake of simplicity – we could just as easily have used SPARQL. We start with a prefix string to give us a namespace to work with, then we enumerate the persistent properties of the Track type. For each property we insert a triple meaning “whatever track is selected – get this property as well”. Lastly, we add the artist name as a known fact, allowing us to specify exactly which tracks we are talking about.

private static string CreateQueryForArtist(string artistName)
{
  string queryFmt = "@prefix m: <http://aabs.purl.org/ontologies/2007/04/music#> .\n";
  foreach (PropertyInfo info in OwlClassSupertype.GetAllPersistentProperties(typeof(Track)))
  {
    queryFmt += string.Format("?track <{0}> ?{1} .\n", OwlClassSupertype.GetPropertyUri(typeof(Track), info.Name), info.Name);
  }
  queryFmt += string.Format("?track <{0}> \"{1}\" .\n", OwlClassSupertype.GetPropertyUri(typeof(Track), "ArtistName"), artistName);
  return queryFmt;
}

Having created a string representation of the query we’re after, we pass it to a GraphMatch object, which is a kind of query where you specify a graph that acts as a prototype for the structure of the results desired. I also created a simple class called ObjectDeserialiserQuerySink:

public class ObjectDeserialiserQuerySink<T> : QueryResultSink where T : OwlClassSupertype, new()
{
    public List<T> DeserialisedObjects
    {
        get { return deserialisedObjects; }
    }
    private List<T> deserialisedObjects = new List<T>();

    public ObjectDeserialiserQuerySink()
    {
    }

    public override bool Add(VariableBindings result)
    {
        T t = new T();
        foreach (PropertyInfo pi in OwlClassSupertype.GetAllPersistentProperties(typeof(T)))
        {
            try
            {
                string vn = OwlClassSupertype.GetPropertyUri(typeof(T), pi.Name).Split('#')[1];
                string vVal = result[pi.Name].ToString();
                pi.SetValue(t, Convert.ChangeType(vVal, pi.PropertyType), null);
            }
            catch (Exception e)
            {
                Debug.WriteLine(e);
                return false;
            }
        }
        DeserialisedObjects.Add(t);
        return true;
    }
}

For each match that the reasoner is able to find, a call gets made to the Add method of the deserialiser with a set of VariableBindings. Each of the variable bindings corresponds to solutions of the free variables defined in the query. Since we generated the query out of the persistent properties on the Track type the free variables matched will also correspond to the persistent properties of the type. What that means is that it is a straightforward job to deserialise a set of VariableBindings into an object.

That’s it. We now have a simple triple store that we can serialise objects into and out of, with an easy persistence mechanism. But there’s a lot more to do. Of the full CRUD behaviour I have implemented Create and Retrieve. That leaves Update and Delete. As we saw in one of my previous posts, that will be a mainly manual, programmatic task, since semantic web ontologies are to a certain extent static. What that means is that they model a domain as a never-changing body of knowledge about which we may deduce more facts, but from which we can’t unmake (delete) knowledge.

The static nature of ontologies seems like a bit of a handicap to one who deals more often than not with transactional data, since it means we need more than one mechanism for dealing with data: deductive reasoning, and transactional processing. In the examples I have given so far I have been dealing with in-memory triple stores, where the SemWeb API is the only easy means of updating and deleting data. When we are dealing with a relational database as our triple store, we will have the option to exploit SQL as another tool for managing data.
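
In the meantime, an “update” against an in-memory SemWeb store amounts to retracting the old statement and asserting a new one by hand. A sketch of that, using the same SemWeb primitives shown elsewhere in these posts, might look like the following (the rating property URI is just an illustration, not something defined in the ontology):

using SemWeb;

// Sketch of a manual 'update': retract the old statement, assert the new one.
public static class TripleStoreUpdates
{
    public static void UpdateRating(MemoryStore store, string trackUri, string ratingPropertyUri, int newRating)
    {
        Entity track = new Entity(trackUri);
        Entity rating = new Entity(ratingPropertyUri);

        // Remove whatever rating is currently asserted for this track...
        foreach (Resource oldValue in store.SelectObjects(track, rating))
            store.Remove(new Statement(track, rating, oldValue));

        // ...and assert the new value in its place.
        store.Add(new Statement(track, rating, new Literal(newRating.ToString())));
    }
}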


Using RDF and C# to create an MP3 Manager – Part 1

This article follows on from the previous post about semantic web applications in C#. I’ll be using the SemWeb framework again, but this time I chose to demonstrate the capabilities of RDF by producing a simple MP3 file manager. I haven’t completed it yet, and I’ll be working on that over the next few days to show you just how easy RDF/OWL is to work with in C# these days.

The program is pretty simple – I was inspired to write it by the RDF-izers web site, where you can find conversion tools for producing RDF from a variety of different data sources. While I was playing with LINQ I produced a simple file tagging system: I scanned the file system, extracting as much metadata as I could from the files I found, and added it to a tag database that I kept in SQL Server. Well, this isn’t much different. I just extract ID3 metadata tags from the MP3 files I find and store them in Track objects. I then wrote a simple conversion system to extract RDF URIs from the objects I’d created, for insertion into an in-memory triple store. All up it took about 3-4 hours, including finding a suitable API for ID3 reading. I won’t show (unless demanded to) the code for the test harness or for iterating the file system. Instead I’ll show you the code I wrote for persisting objects to an RDF store.

First up, we have the Track class. I’ve removed the vanilla implementation of the properties for the sake of brevity.

[OntologyBaseUri("file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/")]
[OwlClass("Track", true)]
public class Track : OwlInstanceSupertype
{
    [OwlProperty("title", true)]
    public string Title /* ... */

    [OwlProperty("artistName", true)]
    public string ArtistName /* ... */

    [OwlProperty("albumName", true)]
    public string AlbumName /* ... */

    [OwlProperty("year", true)]
    public string Year /* ... */

    [OwlProperty("genreName", true)]
    public string GenreName /* ... */

    [OwlProperty("comment", true)]
    public string Comment /* ... */

    [OwlProperty("fileLocation", true)]
    public string FileLocation /* ... */

    private string title;
    private string artistName;
    private string albumName;
    private string year;
    private string genreName;
    private string comment;
    private string fileLocation;

    public Track(TagHandler th, string fileLocation)
    {
        this.fileLocation = fileLocation;
        title = th.Track;
        artistName = th.Artist;
        albumName = th.Album;
        year = th.Year;
        genreName = th.Genere;
        comment = th.Comment;
    }
}

Nothing of note here except for the presence of a few all-important attributes which are used to give the persistence engine clues about how to generate URIs for the class, its properties and their values. Obviously this is a rudimentary implementation, so we don’t have lots of extra information about XSD types and versions etc. But for the sake of this illustration I’m sure you get the idea: we can do for RDF pretty much what LINQ to SQL does for relational databases.

The attribute classes are also very simple:

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Property)]
public class OwlResourceSupertypeAttribute : Attribute
{
    private readonly string uri;
    private readonly bool isRelativeUri;

    public string Uri
    {
        get { return uri; }
    }

    public bool IsRelativeUri
    {
        get { return isRelativeUri; }
    }

    public OwlResourceSupertypeAttribute(string uri)
        : this(uri, false) { }

    public OwlResourceSupertypeAttribute(string uri, bool isRelativeUri)
    {
        this.uri = uri;
        this.isRelativeUri = isRelativeUri;
    }
}

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct | AttributeTargets.Property)]
public class OwlClassAttribute : OwlResourceSupertypeAttribute
{
    public OwlClassAttribute(string uri)
        : base(uri, false) { }

    public OwlClassAttribute(string uri, bool isRelativeUri)
        : base(uri, isRelativeUri) { }
}

[AttributeUsage(AttributeTargets.Property)]
public class OwlPropertyAttribute : OwlResourceSupertypeAttribute
{
    public OwlPropertyAttribute(string uri)
        : base(uri, false) { }

    public OwlPropertyAttribute(string uri, bool isRelativeUri)
        : base(uri, isRelativeUri) { }
}

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public class OntologyBaseUriAttribute : Attribute
{
    private string baseUri;

    public string BaseUri
    {
        get { return baseUri; }
    }

    public OntologyBaseUriAttribute(string baseUri)
    {
        this.baseUri = baseUri;
    }
}

OwlResourceSupertypeAttribute is the supertype of any attribute that can be related to a resource in an ontology – that is, anything that has a URI. As such it has a Uri property, and in addition an IsRelativeUri property which indicates whether the URI is relative to a base URI defined elsewhere. Although I haven’t implemented my solution that way yet, this is intended to allow the resources to reference a base namespace definition in the triple store or in an RDF file. The OwlClassAttribute extends OwlResourceSupertypeAttribute, restricting its usage to classes or structs. You use this (or the parent type if you want) to indicate the OWL class URI that the type will be persisted to. So for the Track class we have an OWL class of “Track”. In an ontology that Track will be relative to some URI, which I have defined using the OntologyBaseUriAttribute. That attribute defines the base URI that the class and property URIs are relative to in this example (i.e. “file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/”).

For each of the properties of the class Track I have defined another subclass of OwlResourceSupertypeAttribute called OwlPropertyAttribute, restricted solely to properties. Another simplification I have introduced is that I am not distinguishing between ObjectProperties and DatatypeProperties, as OWL does. That would not be hard to add, and I’m sure I’ll have to over the next few days.
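
These attributes are all the persistence and query code needs in order to resolve full URIs for a type and its properties. I haven’t shown the resolution helper itself, but a hedged sketch of how something like GetPropertyUri (used in the query-generation and deserialisation code earlier) might combine the attributes could look like this – the real implementation may well differ:

using System;
using System.Reflection;

// Hedged sketch of how a helper like GetPropertyUri might resolve a full
// property URI from the attributes shown above; the real implementation may differ.
public static class UriResolution
{
    public static string GetPropertyUri(Type type, string propertyName)
    {
        PropertyInfo pi = type.GetProperty(propertyName);
        OwlPropertyAttribute propertyAttribute =
            (OwlPropertyAttribute)pi.GetCustomAttributes(typeof(OwlPropertyAttribute), true)[0];

        if (!propertyAttribute.IsRelativeUri)
            return propertyAttribute.Uri;

        // Relative URIs get prepended with the ontology base URI declared on the class.
        OntologyBaseUriAttribute baseAttribute =
            (OntologyBaseUriAttribute)type.GetCustomAttributes(typeof(OntologyBaseUriAttribute), true)[0];
        return baseAttribute.BaseUri + propertyAttribute.Uri;
    }
}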

So, I have now annotated my class to tell the persistence engine how to produce statements that I can add to the triple store. These annotations can be read by the engine and used to construct appropriate URIs for statements. We still need a way to construct instances in the ontology. I’ve done that in a very simple way – I just keep a counter in the scanner, and I create an instance URI out of the Class Uri by adding the counter to the end. So the first instance will be “file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/Track_1” and so on. This is simple, but would need to be improved upon for any serious applications.
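
For what it’s worth, the counter scheme is only a few lines. A sketch of what a GenTrackName-style helper might look like follows (the real GenTrackName used in the next snippet may differ, and a hash of the file location would be a more robust choice for anything serious):

// Hypothetical sketch of the counter-based instance URI scheme described above.
private static int trackCounter;

private static string GenTrackName(Track t)
{
    // The track itself isn't used by this naive scheme - only the counter is.
    trackCounter++;
    return "file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/Track_" + trackCounter;
}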

Next I need to reflect over an instance of class Track to get a set of statements that I can add to the triple store. For this I have exploited the extension method feature of C# 3.5 (May CTP) which allows me to write code like this:
foreach (Track t in GetAllTracks(txtFrom.Text))
{
    t.InstanceUri = GenTrackName(t);
    store.Add(t);
}

The triple store is called store, and GetAllTracks is an iterator that filters the files under the directory indicated by whatever is in txtFrom.Text. GenTrackName creates the instance URI for the track instance. I could have used a more sophisticated scheme using hashes of the track location or some such, but I was in a rush ;-). The code for the persistence engine is easy as well:

public static class MemoryStoreExtensions
{
    public static void Add(this MemoryStore ms, OwlInstanceSupertype oc)
    {
        Debug.WriteLine(oc.ToString());
        Type t = oc.GetType();
        PropertyInfo[] pia = t.GetProperties();
        foreach (PropertyInfo pi in pia)
        {
            if (IsPersistentProperty(pi))
            {
                AddPropertyToStore(oc, pi, ms);
            }
        }
    }

    private static bool IsPersistentProperty(PropertyInfo pi)
    {
        return pi.GetCustomAttributes(typeof(OwlPropertyAttribute), true).Length > 0;
    }

    private static void AddPropertyToStore(OwlInstanceSupertype track, PropertyInfo pi, MemoryStore ms)
    {
        Add(track.InstanceUri, track.GetPropertyUri(pi.Name), pi.GetValue(track, null).ToString(), ms);
    }

    public static void Add(string s, string p, string o, MemoryStore ms)
    {
        if (!Empty(s) && !Empty(p) && !Empty(o))
            ms.Add(new Statement(new Entity(s), new Entity(p), new Literal(o)));
    }

    private static bool Empty(string s)
    {
        return (s == null || s.Length == 0);
    }
}

Add is the extension method that iterates the properties on the class OwlInstanceSupertype, which is the supertype of all classes that can be persisted to the store. As you can see, it gets all the properties and checks each one to see whether it has the OwlPropertyAttribute. If it does, the property gets persisted using AddPropertyToStore. AddPropertyToStore creates URIs for the subject (the track instance in the store), the predicate (the property on class Track) and the object (a string literal containing the value of the property). That statement gets added by the second Add method, which just mirrors the Add API on the MemoryStore itself.

And that’s it. Almost. The quick and dirty ontology I defined for the music tracks looks like this:

@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix daml: <http://www.daml.org/2001/03/daml+oil#> .
@prefix log:  <http://www.w3.org/2000/10/swap/log#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsdt: <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/> .

:ProducerOfMusic a owl:Class.
:SellerOfMusic a owl:Class.
:NamedThing a owl:Class.
:TemporalThing a owl:Class.
:Person a owl:Class;
owl:subClassOf :NamedThing.
:Musician owl:subClassOf :ProducerOfMusic, :Person.
:Band a :ProducerOfMusic.
:Studio a :SellerOfMusic, :NamedThing.
:Label = :Studio.
:Music a owl:Class.
:Album a :NamedThing.
:Track a :NamedThing.
:Song a :NamedThing.
:Mp3File a owl:Class.
:Genre a :NamedThing.
:Style = :Genre.
:title
    rdfs:domain :Track;
    rdfs:range xsdt:string.
:artistName
    rdfs:domain :Track;
    rdfs:range xsdt:string.
:albumName
    rdfs:domain :Track;
    rdfs:range xsdt:string.
:year
    rdfs:domain :Album;
    rdfs:range xsdt:integer.
:genreName
    rdfs:domain :Track;
    rdfs:range xsdt:string.
:comment
    rdfs:domain :Track;
    rdfs:range xsdt:string.
:isTrackOn
    rdfs:domain :Track;
    rdfs:range :Album.
:fileLocation
    rdfs:domain :Track;
    rdfs:range xsdt:string.

When I run it over my podcasts, the output persisted to N3 looks like this:

<file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/Track_1>
    <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/title> "History 5 | Fall 2006 | UC Berkeley" ;
    <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/artistName> "Thomas Laqueur" ;
    <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/albumName> "History 5 | Fall 2006 | UC Berkeley" ;
    <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/year> "2006" ;
    <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/genreName> "History 5 | Fall 2006 | UC Berkeley" ;
    <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/comment> " (C) Copyright 2006, UC Regents" ;
    <file:///C:/dev/prototypes/semantic-web/src/Mp3ToRdf/fileLocation> "C:\\Users\\andrew.matthews\\Music\\hist5_20060829.mp3" .

You can see how the URIs have been constructed from the base URI, and the properties are all attached to instance Track_1. Next on the list will probably be a bit of cleaning up to use a prefix rather than this longhand URI notation, then I’ll show you how to query the store to slice and dice your music collections every which way.

A simple semantic web application in C#

The latest update of the SemWeb library from Josh Tauberer includes a C# implementation of the Euler reasoner. This reasoner is able to go beyond simplistic RDFS reasoning – being able to navigate the class and property relationships – to make use of rules. The ontology I’ve been using to get used to coding in the framework models a simple state machine. The ontology couldn’t be simpler. Here’s an N3 file that declares the classes and their relationships.


@prefix daml: <http://www.daml.org/2001/03/daml+oil#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <http://aabs.purl.org/ontologies/2007/04/states/states.rdf#> .

#Classes
:State a owl:Class;
daml:comment "states the system can be in";
daml:disjointUnionOf ( :S1 :S2 :S3 ).

:InputToken a owl:Class;
daml:comment "inputs to the system";
daml:disjointUnionOf ( :INil :I1 :I2 ).

:Machine a owl:Class.
:System a owl:Class.

#properties
:isInState
rdfs:domain :Machine;
rdfs:range :State;
owl:cardinality "1".

:hasInput
rdfs:domain :System;
rdfs:range :InputToken;
owl:cardinality "1".

#Instances
:Machine1
a :Machine;
:isInState :S1.

:This a :System;
:hasInput :INil.

As with any deterministic finite state machine, there are two key classes at work here: :State and :InputToken. :State is a disjoint union of :S1, :S2 and :S3. That means that an :S1 is not an :S2 or an :S3. If you don’t specify such a disjunction, the reasoner cannot assume it: just because Machine1 is in state :S1 doesn’t mean it isn’t potentially in state :S2 as well. You have to tell it that an :S1 is not an :S2. Pedantry is all-important in ontology design, and while I have gained a fair measure of it over the years as a defensive programmer, I was struck by how little semantic support I get from the programming languages I use by comparison. OWL provides you with a rich palette to work with, though with less conventional tooling support. It is quite liberating to be designing class libraries in OWL rather than in an OO language. Kind of like when you go from DOS to Bash.

Anyway, the rules for this little ontology define a transition table for the state machine:


@prefix log:  <http://www.w3.org/2000/10/swap/log#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix :     <http://aabs.purl.org/ontologies/2007/04/states/states.rdf#> .

# (:S1, :I1) ~> (:S1, :INil)
{ :Machine1 :isInState :S1. :This :hasInput :I1. }
=>
{ :Machine1 :isInState :S1. :This :hasInput :INil. }.

# (:S1, :I2) ~> (:S2, :INil)
{ :Machine1 :isInState :S1. :This :hasInput :I2. }
=>
{ :Machine1 :isInState :S2. :This :hasInput :INil. }.

# (:S1, :I3) ~> (:S3, :INil)
{ :Machine1 :isInState :S1. :This :hasInput :I3. }
=>
{ :Machine1 :isInState :S3. :This :hasInput :INil.}.

I got into problems initially, since I thought about the problem from an imperative programming perspective. I designed it like I was assigning values to variables. That’s the wrong approach – treat this as adding facts to what is already known. So, rather than saying if X, then do Y, think of it as if I know X, then I also know that Y. The program to work with it looks like this:


internal class Program
{
    private static readonly string ontologyLocation =
        @"C:\dev\prototypes\semantic-web\ontologies\20074\states\";

    private static string baseUri = @"file:///C:/dev/prototypes/semantic-web/ontologies/2007/04/states/states.rdf#";
    private static MemoryStore store = new MemoryStore();
    private static Entity Machine1 = new Entity(baseUri + "Machine1");
    private static Entity Input1 = new Entity(baseUri + "I1");
    private static Entity Input2 = new Entity(baseUri + "I2");
    private static Entity theSystem = new Entity(baseUri + "This");
    private static string hasInput = baseUri + "hasInput";
    private static string isInState = baseUri + "isInState";

    private static void Main(string[] args)
    {
        InitialiseStore();
        DisplayCurrentStates();
        SetNewInput(Input2);
        DisplayCurrentStates();
    }

    private static void DisplayCurrentStates()
    {
        SelectResult ra = store.Select(new Statement(Machine1, new Entity(isInState), null));
        Debug.Write("Current states: ");
        foreach (Statement resource in ra.ToArray())
        {
            Debug.Write(resource.Object.Uri);
        }
        Debug.WriteLine("");
    }

    private static void InitialiseStore()
    {
        string statesLocation = Path.Combine(ontologyLocation, "states.n3");
        string rulesLocation = Path.Combine(ontologyLocation, "rules.n3");
        Euler engine = new Euler(new N3Reader(File.OpenText(rulesLocation)));
        store.Import(new N3Reader(File.OpenText(statesLocation)));
        store.AddReasoner(engine);
    }

    private static void SetNewInput(Entity newInput)
    {
        // Replace the current input fact with the new one...
        Resource[] currentInput = store.SelectObjects(theSystem, hasInput);
        store.Remove(new Statement(theSystem, hasInput, currentInput[0]));
        store.Add(new Statement(theSystem, hasInput, newInput));
        // ...then ask what state the machine is in (the Euler engine applies the
        // transition rules) and materialise that state back into the store.
        Resource[] subsequentState = store.SelectObjects(Machine1, isInState);
        Statement newState = new Statement(Machine1, isInState, subsequentState[0]);
        store.Replace(new Statement(Machine1, isInState, null), newState);
    }
}

The task was simple – I wanted to set the state machine up in state :S1 with input :INil, then put input :I2 in, and see the state change from :S1 to :S2. In doing this I am trying to do something that is a little at odds with the expressed intention of ontologies. They are static declarations of a body of knowledge rather than a specification for a dynamically changing body of facts. What that means is that they are additive – the frameworks and reasoners allow you to add to a body of knowledge. That is what makes reuse and trust possible on the semantic web [1, 2]. If I could take your ontology and change it to mean something other than what you intended, then no guarantees could be made about the result. The ontology should stand alone – if you want to base some data on it, that’s up to you, and you will have to manage it. In practical terms, that means you have to manually change the entries for the input and the states as they change. What the ontology adds is a framework for representing the data, and a set of rules for working out what the next state should be. That’s still powerful, but I wonder how well it would scale.

What to notice

There are a few things in here that you should pay very close attention to if you want to write a semantic web application of your own using SemWeb. Firstly, the default namespace definition in the ontology and rules definition files. Generally, the examples of N3 files on the W3C site use the following format to specify the default namespace of a file:

@prefix : <#>

Unfortunately that leaves a little too much room for manoeuvring within SemWeb, and the actual URIs that it will use are a little unpredictable. Generally they are based upon the location that SemWeb got the files from. Instead, choose a URL like so:

@prefix : <http://aabs.purl.org/ontologies/2007/04/states/states.rdf#>.

This was not the location of the file – it was just a URL I was using prior to switching to N3 format. The point is that you need to give an unambiguous URL so that SemWeb and its reasoner can distinguish resources properly when you ask questions. I used the same URL in the rules.n3 file, since most of what I was referring to was defined in the namespace above. I could just as easily have defined a new prefix for states.n3 and prepended all the elements in the rules with that prefix. The point is to have a non-default URL so that SemWeb is in no doubt about the URIs of the resources you are referring to.

Next, remember that you will have to dip into the instances in the store to make manual changes to their state – this is no different from any relational database application. I was disconcerted at first, because I had hoped that the reasoner would make any changes I needed for me. Alas, that is not in the spirit of the semantic web, apparently, so be prepared to manage the system closely.

I think that the use of OWL ontologies would be very easy for question-answering applications, but there may be a little more work required to place a reasoner at the core of your system. Of course, I could be wrong about that, and it could be my incorrigible procedural mindset that is to blame – I will keep you posted. I’ll be posting more on this over the next few weeks, so if there are things you want me to write about, or questions you’d like answered, please drop me a line or comment here, and I’ll try to come up with answers.