Contents: The problem | Hypertext, SGML and search engines | Mixing tagging and linking | Navigating islands of information | Problem summary | The semantic web: what is it? | what will it do? | what do they say about it? | will it work? | Conclusions

The problem

Where are we with the web today? There are several dimensions to the web. First, there is the web as a network. It sits on top of the Internet, which physically connects thousands of independent networks of computers and provides a means of addressing individual computers so that messages can be sent from one to another. Second, there is the web as a specific type of message environment. The Internet enables e-mail, file transfer and a long list of other types of message to be interchanged, each with its own set of rules (its own protocol). The web is a set of rules for message pairs, each consisting of a request and a reply, that relate to pages of information with a specific presentation syntax and semantics (HTML). The key feature is that links (URLs) to other pages can be embedded in the content. Third, the web is the totality of the information that exists on those pages.
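To make the "key feature" concrete, here is a minimal sketch in Python of how the links embedded in a page's content can be pulled out once the reply to a request has arrived. The page text, the addresses and the names LinkCollector and extract_links are assumptions made up for illustration. Only the standard library modules html.parser and urllib.parse are used, and no network request is made.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> found in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html_text):
    """Return the absolute URLs linked from html_text (hypothetical helper)."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.links


# A made-up reply body, standing in for what a request for a page returns.
SAMPLE_PAGE = """
<html><body>
<p>See the <a href="/semantic.html">semantic web</a> notes and
<a href="http://example.org/sgml">an SGML primer</a>.</p>
</body></html>
"""

links = extract_links("http://example.com/index.html", SAMPLE_PAGE)
```

Each address collected this way can itself be requested, which is what turns a collection of separate pages into a web.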

One can say with some certainty that the original designers did not anticipate that after a few years there would be over a billion pages on the web. To understand the significance of the web it is necessary to look at how computing has developed. Initially computers were used for complicated calculations that could not be carried out at all using pencil and paper or mechanical calculators. They were used next for repetitive tasks where a large number of simple calculations had to be carried out frequently, mainly accounting and other forms of 'book-keeping'. Success in these areas led to an expansion in the range of such tasks that were carried out. For example, statistics, which before the advent of computers had been almost entirely a theoretical branch of mathematics, became a practical tool in everyday use by business and government. Accounts that had previously been dumped into a cellar at the end of each year suddenly became a source of information that could be used to improve decision making. New methods of analysis were invented to use the information and computing power that were rapidly becoming available.

Up to the 1980s and the arrival of desktop computers, almost all applications of computers dealt with numeric data. Computers printed out their results on thin fan-fold paper in a difficult-to-read typeface, and secretaries typed up reports on the results, dictated or hand-written by middle managers. Output to screens was largely confined to transaction-based systems in real-time, time-critical areas such as airline reservations, process control and order entry. Networked computers talked only to each other.

As soon as the PC killer application - word processing - emerged, everything changed. The computer now became a tool for everyone who had something to write, whether it was a personal letter or a 500-page report. There was also a growth in research on word processing and then language processing, and a related but slower growth in applications such as spell checkers. (There had been earlier language processing research in the 60s and 70s, but without the impetus offered by desktop computing applications.)

Since the early 90s one could say that the primary use of computers has been in aiding human communication, first in text, then in images and lately in audio-visual forms. The result has been an enormous increase in the amount of information created for human consumption, more in its variety than in its sheer volume. The growth in the number of newsletters available via e-mail and the web shows the impact Internet economics has had on the production and distribution of information (as opposed to its intellectual creation).

The flip side of all this is the staggering amount of archived information available that was not written in the last few minutes. Surely amongst all this communication there must be the very sentence that will answer the question I am pondering at the moment. But how to find it? (And in the next 20 seconds.)

