Saturday, 18 April 2015

Is DARPA’s new search engine, Memex, a Google-killer?

The history of computing features a string of organizations that, for a time, looked so deeply entrenched in our lives that we would never be able to do without them.

For example, IBM and Microsoft looked like that. In recent times, it has been Google and Facebook.

Sometimes they look unassailable because of the narrow territory they occupy. When they fall, it is because the situation has changed drastically, not because someone has captured their territory.

For several years Linux enthusiasts proclaimed “this will be the year that Linux finally competes with Windows on the desktop!” Every year, it did not happen.

Eventually Linux smoked Microsoft under the brand name Android, when ‘desktop’ gave way to ‘mobile’.

Google has been the heavyweight king of web search since the late 1990s. All efforts to throw Google out of the market have failed. Not only does it have a strong hold on market share, but it has also kept all challengers at bay, from awkward tech colossi to smart, disruptive startups.

Google will not surrender its territory to a Google duplicate, but it may one day find that its territory is no longer what it used to be.

The web is getting broader and darker, and Google, Bing and Yahoo are not able to search most of it.

They don’t search sites that have asked to be ignored or that cannot be found by following links from other websites (the vast, virtual wasteland known as the Deep Web). Nor do they search sites on anonymous, encrypted networks like Tor and I2P (the so-called Dark Web).
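To make that concrete, here is a minimal sketch of a link-following crawler, assuming a made-up five-page site and a made-up robots.txt (the page names, the `example.test` host and the `toy-crawler` user agent are all hypothetical). It is not how any real search engine is built; it only shows why a page nobody links to, and a page the site has asked crawlers to skip, both end up outside the index.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical toy site: each page maps to the pages it links to.
LINKS = {
    "/": ["/about", "/news"],
    "/about": ["/"],
    "/news": ["/private/report"],
    "/private/report": [],
    "/orphan": ["/"],  # links out, but no page links to it
}

# Hypothetical robots.txt asking every crawler to stay out of /private/.
ROBOTS_TXT = ["User-agent: *", "Disallow: /private/"]
BASE = "https://example.test"


def crawl(seed="/"):
    """Breadth-first crawl from the seed, honouring robots.txt."""
    robots = RobotFileParser()
    robots.parse(ROBOTS_TXT)  # parse in-memory rules, no network access
    indexed, queue = set(), deque([seed])
    while queue:
        page = queue.popleft()
        if page in indexed:
            continue
        if not robots.can_fetch("toy-crawler", BASE + page):
            continue  # the site asked to be ignored here
        indexed.add(page)
        queue.extend(LINKS.get(page, []))
    return indexed


indexed = crawl()
missed = sorted(set(LINKS) - indexed)
print("indexed:", sorted(indexed))
print("missed: ", missed)
```

In this toy setup the crawler never sees `/orphan` (nothing links to it) and skips `/private/report` (robots.txt disallows it), which is the Deep Web in miniature.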

The big search engines do not ignore the Deep Web because some impassable technical limit prevents them from indexing it. They ignore it because they are commercial entities, and the costs and benefits of searching beyond their current boundary don’t add up.

Most of the time that is fine for us. However, it means that many sites go un-indexed and there are lots of searches that the current batch of engines is very bad at.

That is the reason the US Defense Advanced Research Projects Agency (DARPA) has invented a search engine for the Deep Web called Memex.

Memex is designed to go a step beyond Google’s one-size-fits-all approach and deliver domain-specific searches that are the very best solution for narrow interests.

Memex, which is in its first year, has been tackling the problems of human trafficking and slavery, something that has a significant presence beyond the gaze of commercial search engines.

In February, when Memex was first reported on, there were signs that it had more potential than expected. What was not known then was that parts of it would become more widely available, to everyone.

A lot of the project is still somewhat fuzzy and most of the 17 technology partners involved are still unnamed; however, the plan seems to be to lift the curtain, at least partially, over the next two years, starting this Friday.

That’s when an initial tranche of Memex components, including software from a team called Hyperion Gray, will be listed on DARPA’s Open Catalog.

The Hyperion Gray team described their work to Forbes as:

Advanced web crawling and scraping technologies, with a dose of Artificial Intelligence and machine learning, with the goal of being able to retrieve virtually any content on the internet in an automated way.

Eventually our system will be like an army of robot interns that can find stuff for you on the web, while you do important things like watch cat videos.

More components are expected to follow in December. A “general purpose technology” is expected to be available by the time the project ends.

Memex and Google don’t overlap much: they solve different problems, serve different needs and are financed in very different ways. But the same was true of Linux and Microsoft.

The tools that DARPA releases when the project wraps up probably won’t be a direct challenger to Google, but they are expected to be sensible and better suited to certain government and business applications than Google is.

That might not bother Google much, but there are three reasons why Memex might catch its attention.

The first reason is that the web is changing and so is the way the Internet is used.

When Google was launched there was no Snapchat, Bitcoin or Facebook. Nobody was bothered about the Dark Web (remember FreeNet?) since nobody knew what it was for. Nobody bothered about the Deep Web either, as it was difficult enough to find the things you actually wanted.

The second is this statement made by Christopher White, the man heading the Memex team at DARPA, who’s clearly thinking big:

The problem we’re trying to address is that currently access to web content is mediated by a few very large commercial search engines – Google, Microsoft Bing, Yahoo – and essentially it’s a one-size fits all interface…

We’ve started with one domain, the human trafficking domain … In the end we want it to be useful for any domain of interest.

That’s our ambitious goal: to enable a new kind of search engine, a new way to access public web content

And the third reason is that Memex is not just for spooks and G-Men; it is for people like you and me to use and, very importantly, to play with.

To use software is one thing; to be able to change it is another. The best thing about open source software is that it gives people the freedom to take it in new directions, the same way Google did when it turned Linux into Android.
