In search of the deep Web (09 Mar 2004)
Soon-to-launch Dipsie is pursuing an alternative approach to unlocking the dynamic Web, by deploying a kind of souped-up spider that penetrates barriers like forms, drop-down lists, dynamically generated URLs and session cookies. Dipsie's spider works by emulating a "well-formed user" that, from the Web site's point of view, behaves just like a real flesh-and-mouse user, enabling the spider to cache the kind of data typically visible only to a human user.
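Dipsie has not published how its spider works, so the following is only a rough sketch of one piece of the general idea: reading a page's form and drop-down lists and working out every URL a human could reach by picking options and pressing submit. The names `FormOptionParser`, `enumerate_form_urls` and `SAMPLE_PAGE`, and the example.com address, are hypothetical and not taken from Dipsie. The sketch assumes a simple GET form and leaves out session cookies, which a real crawler would also have to carry between requests.

```python
import itertools
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormOptionParser(HTMLParser):
    """Records the first form's action and the option values of each <select>."""

    def __init__(self):
        super().__init__()
        self.action = None     # action attribute of the first <form> seen
        self.selects = {}      # select name -> list of option values
        self._current = None   # name of the <select> currently being parsed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action") or ""
        elif tag == "select":
            self._current = attrs.get("name")
            if self._current:
                self.selects[self._current] = []
        elif tag == "option" and self._current:
            value = attrs.get("value")
            if value:
                self.selects[self._current].append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self._current = None


def enumerate_form_urls(page_url, html):
    """Return one GET URL for every combination of drop-down choices."""
    parser = FormOptionParser()
    parser.feed(html)
    base = urljoin(page_url, parser.action or "")
    names = sorted(parser.selects)
    combos = itertools.product(*(parser.selects[name] for name in names))
    return [base + "?" + urlencode(list(zip(names, combo))) for combo in combos]


# Hypothetical page with two drop-down lists, used only for illustration.
SAMPLE_PAGE = """
<form action="/search" method="get">
  <select name="region">
    <option value="us">US</option>
    <option value="eu">EU</option>
  </select>
  <select name="year">
    <option value="2003">2003</option>
    <option value="2004">2004</option>
  </select>
</form>
"""

if __name__ == "__main__":
    for url in enumerate_form_urls("http://example.com/catalog/", SAMPLE_PAGE):
        print(url)
```

For the sample page this yields four URLs, one per region and year pair, none of which a link-following crawler would find because no hyperlink points at them. Enumerating every combination grows quickly with the number of drop-downs, so a production spider would presumably need limits on how many it tries.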
Other search developers, including IBM, Google and Intelliseek, are exploring their own approaches to mining the deep Web. But in the wake of this week's announcement, Yahoo is now the elephant in the living room.
Yahoo won't discuss the specifics of how its search algorithms work. But the company does acknowledge that its Content Aggregation Program (CAP) will give paying customers a more direct pipeline into its search database. Yahoo Search vice president Tim Cadogan says, "Ultimately we want to search the whole Web for free," but he nonetheless sees CAP as a way of enabling "direct, structured relationships with content providers" to "deliver a higher-quality search experience for users."
Article URL: http://www.salon.com/tech/feature/2004/03/09/deep_web/index_np.html