El.pub Analytic Issue Number 6
Contents: The problem | Hypertext, SGML and search engines | Mixing tagging and linking | Navigating islands of information | Problem summary | The semantic web: what is it? | what will it do? | what do they say about it? | will it work? | Conclusions
At about the same time as desktop aids like mice and windows arrived, a new idea in text processing arose: hypertext. Again, the idea had been thought of earlier, but it took a long time before a computing environment was available to implement it. The ability to click on a link and jump elsewhere in a document (and later to other documents) appeared to be an ideal solution to the growing difficulty of managing text information on the screen.
Mixing the two ideas, tagging and linking, resulted in HTML and the web. It is worth pointing out, however, that they only reached that level when the Internet was opened to public use; hypertext was initially confined to information on CD-ROM.
The enormous size of the web, in terms of the information available, has shown the limitations of hypertext. The first limitation is that simply following links won't find what you want quickly in an information network of the current dimensions. The second is that not all information can be guaranteed to be reachable from a specific start link; there are islands in the overall link map.
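The second limitation can be illustrated with a small sketch. It assumes a hypothetical link map (the page names below are invented for illustration) and follows links breadth-first from one start page; any page that is never reached belongs to an island.

```python
from collections import deque

def reachable_pages(links, start):
    """Return the set of pages reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical link map: each page maps to the pages it links to.
links = {
    "home": ["news", "about"],
    "news": ["home"],
    "about": [],
    "island-a": ["island-b"],
    "island-b": ["island-a"],
}

found = reachable_pages(links, "home")
islands = set(links) - found  # pages no chain of links from "home" can reach
print(sorted(found))
print(sorted(islands))
```

The two island pages link to each other, but nothing reachable from the start page links to them, so no amount of link-following from "home" will find them.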
The initial attempt to solve these problems was the search engine. A computer somewhere reads all the pages on the web, creates an index and makes it available to other computers on the web. There are four problems. First, as we have seen, there are islands, so finding all the pages on the web isn't easy, particularly since some computers don't want other computers indexing them. Second, as the web has evolved, pages are no longer static, and keeping the index up to date is very difficult. Third, many pages are now created on the fly from databases and depend on the query received, so they are inherently not capable of being indexed. Fourth, and perhaps most importantly, indexing and querying the index are language-processing applications for which nothing like an effective solution exists at this stage.
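As a rough illustration of the indexing step, the sketch below builds a simple inverted index (word to pages) over a few hypothetical pages and answers queries by exact word match. The page names, texts and function names are assumptions made for this example; real search engines are far more elaborate.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words; no stemming or language analysis."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages standing in for documents a crawler has fetched.
pages = {
    "a.html": "Hypertext links join documents",
    "b.html": "Search engines index documents",
    "c.html": "Tagging and linking produced HTML",
}

index = build_index(pages)
print(sorted(search(index, "documents")))
print(sorted(search(index, "index documents")))
print(sorted(search(index, "link")))
```

The last query finds nothing, even though two pages mention "links" and "linking". Exact word matching has no notion of language, which hints at why the fourth problem is the hardest.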
The more successful recent attempts to improve search engine performance have used human-added information rather than automated methods. For example, Google uses the extent of linking between pages to improve accuracy. This linking is essentially a choice made by the authors of the pages at creation time, not the result of some automated process. Linking to another page is generally the result of a value judgement by the author, not simply a link made because of some word similarity.
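To show how author-made links can act as a ranking signal, here is a deliberately crude sketch that orders search results by the number of other pages linking to them. This is not Google's actual method, which is considerably more sophisticated; the link map and function names are hypothetical.

```python
from collections import Counter

def inlink_counts(links):
    """Count, for each page, how many distinct other pages link to it."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts

def rank_results(results, counts):
    """Order results by descending in-link count, then by name."""
    return sorted(results, key=lambda page: (-counts[page], page))

# Hypothetical link map: each page maps to the pages its author chose to link to.
links = {
    "a": ["c"],
    "b": ["c", "a"],
    "c": ["a"],
    "d": ["c"],
}

counts = inlink_counts(links)
print(rank_results({"a", "b", "c"}, counts))
```

Each link here is treated as one author's vote for the target page, so the ranking reflects human judgement rather than word similarity.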
Comments on the content, style and analysis are welcome and may be published; send them to: mailto:ketlux@compuserve.com
URL: download the WinWord document, from: http://www.elpub.org/analytic/analytic06.doc
El.pub - Interactive
Electronic Publishing R & D News and Resources
We welcome feedback and contributions to the information service, and proposals for subjects for the news service (mail to: webmasters@elpub.org)
Edited by: Logical Events Limited - electronic marketing, search engine marketing, pay per click advertising, search engine optimisation, website optimisation consultants in London, UK. Visit our website at: www.logicalevents.org
Last up-dated: 16 February 2024
© 2024 Copyright and disclaimer El.pub and www.elpub.org are brand names owned by Logical Events Limited - no unauthorised use of them or the contents of this website is permitted without prior permission.