Can you imagine a world where computers could communicate flawlessly?
That is the dream of many, including myself. I dream of a future in which applications will be able to understand information in context. That future is quite far off, but we are making some progress towards it.
XHTML is one language that developers and content publishers are slowly but surely starting to grasp. A far stricter and more content-oriented version of HTML, XHTML allows users to mark up documents so that whoever or whatever is reading them understands the context of the information, e.g. <strong> to denote a piece of text with strong emphasis.
XHTML, when used correctly, can be a powerful tool for building a knowledge base. However, it cannot be used exclusively, because other problems remain. Consider this example:
I want to search for a blue car on the internet. Several companies have web sites dedicated to displaying their range of cars. If the pages are correctly marked up, it is likely that a search engine can find related pages, but that system isn’t foolproof. What we need is a way of marking up car details.
Now this is where XML comes in. XML is a way of creating your own markup for specifying anything from car details to TV listings. The main issue here is that the motor industry would have to come up with its own specification, and we all know how long it takes for something to get done in corporate partnerships.
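To make that a bit more concrete, here is a rough sketch of what such markup might look like, along with a few lines of Python to read it. The element names (listing, car, make, model, colour, price) and the dealer are invented purely for illustration; no real industry standard is implied.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every element and attribute name here is made up
# for this example and does not come from any real specification.
LISTING_XML = """
<listing dealer="Example Motors">
  <car>
    <make>ExampleMake</make>
    <model>Alpha</model>
    <colour>blue</colour>
    <price currency="GBP">4500</price>
  </car>
  <car>
    <make>ExampleMake</make>
    <model>Beta</model>
    <colour>cyan</colour>
    <price currency="GBP">3200</price>
  </car>
</listing>
"""


def parse_listing(xml_text):
    """Return a list of dicts, one per <car> element in the listing."""
    root = ET.fromstring(xml_text.strip())
    cars = []
    for car in root.findall("car"):
        cars.append({
            "make": car.findtext("make"),
            "model": car.findtext("model"),
            "colour": car.findtext("colour"),
            "price": int(car.findtext("price")),
        })
    return cars


cars = parse_listing(LISTING_XML)
for car in cars:
    print(car["model"], car["colour"], car["price"])
```

Once every dealer publishes something in this shape, an application no longer has to guess which bit of a page is the colour and which is the price.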
OK, so let's say that by some miracle there is an industry standard markup language, and that all dealers subscribe to it and publish listings on their web sites. Now a search application could index these listings and then I would be able to find my blue car, right?
Well, not really. We now have another issue: semantics. Blue is a colour, and so are turquoise and cyan. When I say blue, I sometimes mean cyan and I sometimes mean navy. There are so many different names to describe the subtle and not so subtle variations in colour, and an application would have to take this into account. But how do you explain to an application that blue can mean cyan or turquoise? Do we just stop here and conclude that humans have to become slightly better at describing things?
There are a few solutions. One is building a relationship database, where colours and other similar properties can be put into related groups and can thus be referenced. However, this again requires a central body to create and maintain the database. Another solution would be to leave it up to the applications themselves and let market forces decide which ones are better for the job, a far more decentralised approach. A third would be to have a publicly maintained database. Wikipedia, anyone?
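The first option, grouping related colour names, can be sketched in a few lines of Python. The groups below are my own guesses rather than any agreed standard, and the function names and sample cars are hypothetical; a real database would need someone to maintain it.

```python
# Hypothetical colour groups: which names belong together is an assumption
# made for this example, not an agreed standard.
COLOUR_GROUPS = {
    "blue": {"blue", "cyan", "turquoise", "navy"},
    "red": {"red", "crimson", "scarlet"},
}


def related_colours(name):
    """Return every colour name that shares a group with `name`."""
    name = name.lower()
    related = set()
    for members in COLOUR_GROUPS.values():
        if name in members:
            related |= members
    # Fall back to the name itself if no group knows it.
    return related or {name}


def search_by_colour(cars, wanted):
    """Return the cars whose colour is in the same group as `wanted`."""
    accepted = related_colours(wanted)
    return [car for car in cars if car["colour"].lower() in accepted]


# Made-up sample data for illustration.
cars = [
    {"model": "A", "colour": "Cyan"},
    {"model": "B", "colour": "red"},
    {"model": "C", "colour": "navy"},
]
matches = search_by_colour(cars, "blue")
print([car["model"] for car in matches])
```

Note that the lookup treats every name in a group as interchangeable, so a search for cyan would also return navy cars. Whether that is what the searcher wants is exactly the kind of judgement someone has to build into the database.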
And there you have it. If all the above elements were in place, searching would become a hell of a lot easier. There is only one problem...
How do you trust the people producing the page to be honest about their content and products?