Big Business and the Semantic Web

It’s clear that the unofficial policy of both Microsoft and Google is “we don’t believe in the semantic web”. It may not be clear why. The answer is unsurprising when you give it some thought, though: Big Business. Semantic search holds out the hope of users being able to compose meaningful queries and get relevant results. The price is that someone somewhere has to write down the answers in a meaningful way.

Meaning is a tricky word to play with, but here I mean complex structured data designed to adequately describe some domain. In other words – someone has to write and populate an ontology for each domain that the users want to ask questions about. It’s painstaking, specialized work that not just anyone can do. Not even a computer scientist – whilst they may have the required analysis and design skills, they don’t have the domain knowledge or the data. Hence the pace of progress has been slow, as those with the knowledge are unaware of the value of an ontology or of the methods to produce and exploit it.

Compare this to the modus operandi of the big search companies. Without fail they all use some variant on full-text indexing. It should be fairly clear why as well – they require no understanding of the domain of a document, nor do their users get any guarantees of relevance in the result sets. Users aren’t even particularly bothered when they get spurious results. It just goes with the territory.
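
To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the two-document corpus, the triples, and the names build_index and subjects_of_type. It does not describe how any real search engine or ontology toolkit works. The first half builds a toy full-text (inverted) index, which needs no knowledge of the domain. The second half queries a few hand-written facts of the kind a domain expert would have to supply.

```python
from collections import defaultdict

# Hypothetical two-document corpus, invented for this example.
docs = {
    1: "Jaguar unveils a new sports car",
    2: "The jaguar is a big cat native to the Americas",
}


def build_index(documents):
    """Map each lower-cased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


index = build_index(docs)

# A keyword lookup matches both documents: the index knows nothing about
# the difference between the car maker and the animal.
keyword_hits = sorted(index["jaguar"])

# Hand-written (subject, predicate, object) facts, standing in for a tiny
# ontology. The identifiers are illustrative, not from a real vocabulary.
triples = {
    ("Jaguar_Cars", "is_a", "CarManufacturer"),
    ("Panthera_onca", "is_a", "BigCat"),
    ("Panthera_onca", "common_name", "jaguar"),
    ("Panthera_onca", "native_to", "Americas"),
}


def subjects_of_type(facts, type_name):
    """Return every subject that is declared to be of the given type."""
    return sorted(s for (s, p, o) in facts if p == "is_a" and o == type_name)


big_cats = subjects_of_type(triples, "BigCat")

print(keyword_hits)
print(big_cats)
```

The index is built mechanically from whatever text is available, which is where the breadth comes from. The triples only exist because someone who knew the domain wrote them down, which is where the cost comes from.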

Companies that hope or expect to maintain a monopoly in the search space have to use a mechanism that provides broad coverage across any domain, even if that breadth is at the expense of accuracy or meaningfulness. Clearly, the semantic web and monolithic search engines are incompatible. Not surprising then that for the likes of Microsoft and Google the semantic web is not on their radar. They can’t do it. They haven’t got the skills, time, money or incentive to do it.

If the semantic web is to get much of a toehold in the world of search engines it is going to have to be as a confederation of small search engines produced by specialized groups that are formed and run by domain experts. In a few short years Wikipedia has come to rival the likes of Encyclopedia Britannica. The value of its crowd-sourced content is obvious. This amazing resource came about through the distributed efforts of thousands across the web, with no thought of profit. Likewise, it will be a democratized, decentralized, grass-roots movement that will yield up the meaningful information we all need to get a better web experience.

8 comments

  1. “If the semantic web is to get much of a toehold in the world of search engines it is going to have to be as a confederation of small search engines…”

    Those are prophetic words, and you are one of the few people who are looking into the future of Search in the right direction!

    At AltSearchEngines, we are applying that same logic to the alternative search engines that have been unable to mount a serious challenge to the “Big 5” major search engines *because* they have been acting individually and not corporately.

    I invite you and your readers to come visit http://www.AltSearchEngines.com; the brand new blog from Read/WriteWeb!

    Charles Knight, editor
    AltSearchEngines.com

  2. Hi there,

    What an intriguing post! I agree with much of it and I’ll challenge you on some of it.

    Surely if anyone has skills, time and money it’s Google and Microsoft. Could it be that incentive is what’s missing?

    Right now there’s no semantic option to challenge what they’re doing, and if the entire web becomes semanticized (just made that word up), then the big boys will be in just as good a position as anyone to adapt to the new framework.

    I love Wikipedia and I’m a huge fan, but just this morning I put up a post on some of the challenges of crowd-sourced content.

    Your idea of the confederation of small search engines is a great one. Charles Knight at AltSearchEngines is proposing something similar.

    I think what will make the difference is for the search engines themselves to become more semantically aware (http://blog.vortexdna.com/solve-for-semantics-at-the-search-engine-level/). Perhaps a universally implemented semantic framework is a possibility in the future, but the challenges associated with implementing it consistently and getting everyone to agree to use it are not inconsiderable.

    Hmm… not sure if any of that made sense! But I’ll be keen to hear your thoughts.

  3. Interesting thoughts! I am of the personal belief that Microsoft may not play a big part in the Semantic Web. It just doesn’t seem like their style. Google I have hope for. They are working on some neat things right now, specifically their Programmable Search Engine technology and the new knowledge base they are working on. Truth be told I don’t think either company is going to be the leader in Semantic Web technology unless they start hiring up the people close to the bleeding edge of its development AND become serious about it.

  4. Hi Charles,

    prophetic? I certainly hope so. There is much room for improvement in the search space, and any advances there will either boost or be founded on general improvements in semantic content on the web. I shall certainly be taking a keen interest in what you have to say, and in hearing more about advances in this space.

    Andrew

  5. Hi Kaila,

    Although I can’t be specific, the ‘we aren’t interested and neither is Google’ line came from a highly placed individual within MS. So it’s not just based on their past lack of interest!

    Globally coordinated and unanimously agreed upon ontologies are a pipe dream that will founder in just the same way as the early efforts at common business EDI and B2B messaging formats that came about after the advent of XML. No consensus is likely. That is why I thought of a crowd-sourced model. I also think that some easily comprehensible foundation of an upper ontology would be needed. I suppose for that to be adhered to, there will definitely need to be more sophisticated and dedicated tool support for the production of compatible domain models.

    Andrew

  6. Hi James,

    See my response to Kaila.

    It’s not wealth or skills that either company lacks – it’s expertise in the domains to be modeled. I’d have thought that OpenCyc would be a better bet for general purpose ontologies than Google, but if you’ve looked at that you’d see that it is a bird’s nest of concepts that would never become popular for simple data modeling…

    Andrew

  7. Perhaps the search engines should start allowing people to rate links based on how accurately they match what was typed in. Then again, you would have people paying people off to go around and rate sites. If only people were honest. Decent article, thanks.

Comments are closed.