An Eighteenth Century approach to information processing

Nobody knows how many documents there are on the Web. Google claims to have indexed over three billion of them, and even this search engine covers only a small fraction of the total. The total number is not the problem, though. The real problem is that the contents of these documents are continuously changing and millions of new documents are being added every day. It is not surprising, therefore, that so much time, effort and money are being put into finding ways of sorting and selecting information from this gargantuan morass.

Is there a solution?

Knowledge management (KM) is now a multi-billion dollar industry, yet nobody has come up with a truly efficient means of searching and sorting the information available on the Web. Search engines do a pretty good job, but they are far from perfect. Artificial intelligence techniques and agent technology have been used extensively, but they haven't yet made any major impact.

It does seem as if technology is struggling to come up with optimum solutions. But there is a solution that surpasses current technology. It dates from 1769, when a thirty-five-year-old Hungarian civil servant, Wolfgang von Kempelen, invented the "Mechanical Turk", a chess-playing machine that was the wonder of its day and could far outperform any technology of its time.

The Mechanical Turk

Tim O'Reilly recognized the potential of von Kempelen's machine when he wrote an article entitled "How Users Participate in Building Google":

"Dynamic languages like PHP, Perl, and Python are especially good at handling text and related forms of content, and when that content is changing all the time, you need a lightweight programming methodology to keep up. I sometimes compare these apps to von Kempelen's Mechanical Turk, an eighteenth century hoax that purported to be a chess playing machine, but actually had a man hidden inside. These applications aren't a hoax, but they do have a programmer inside the application all the time. Take the programmer out, and before long the application stops working."

Now, think about applying the idea of von Kempelen's Mechanical Turk to the sorting of information. Instead of using a sophisticated computer program to search for information, you could put a human with a computer (linked to the Internet) into a box. You could then ask this box a question, and it could provide you with a far better answer than you'd get from an algorithmic computer program.

Of course, it would work even better if the person you put in the box were an expert in the subject area you were interested in. Better still, why not make the box bigger and put fifty experts in it? Then, when you asked the box a question, you'd get the combined knowledge of fifty people.

Figure 1 - The Kempelen Box metaphor

Just as von Kempelen's Mechanical Turk could surpass the technology of the eighteenth century at playing chess, so a modern-day "Kempelen box" would be able to surpass the limited ability of today's computer programs to sort through information efficiently.

Of course, we cannot put real people in a box. But we can put software agents representing real people in a box. These agents could be programmed to give the answers their owners would give if asked certain questions.
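To make the metaphor a little more concrete, here is a minimal sketch in Python of how such a box might be wired together. Everything in it is an assumption made for illustration: the names ExpertAgent and KempelenBox, the idea of storing each owner's answers in a simple lookup table, and the choice to combine answers by counting how many agents give each one. The article itself does not specify any of these details.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ExpertAgent:
    """Hypothetical stand-in for one person: holds the answers its owner would give."""

    owner: str
    answers: dict[str, str] = field(default_factory=dict)

    def ask(self, question: str) -> str | None:
        # Look up the owner's stored answer; None means the owner has no view.
        return self.answers.get(question.strip().lower())


class KempelenBox:
    """Hypothetical box that polls every agent inside it and pools the replies."""

    def __init__(self, agents: list[ExpertAgent]) -> None:
        self.agents = agents

    def ask(self, question: str) -> list[tuple[str, int]]:
        # Ask each agent, drop the ones with no answer, and tally the rest.
        replies = (agent.ask(question) for agent in self.agents)
        votes = Counter(reply for reply in replies if reply is not None)
        # Answers given by the most agents come first.
        return votes.most_common()


# Three illustrative agents; the owners and their answers are made up.
box = KempelenBox([
    ExpertAgent("alice", {"best language for text processing?": "Perl"}),
    ExpertAgent("bob", {"best language for text processing?": "Python"}),
    ExpertAgent("carol", {"best language for text processing?": "Python"}),
])

print(box.ask("Best language for text processing?"))
# [('Python', 2), ('Perl', 1)]
```

Counting votes is only one way to pool the agents' replies. A box could just as easily return every distinct answer along with the names of the owners behind it, which might suit questions where a minority expert view matters.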