Search is AI or why PageRank has failed

Much has been written about Google’s spam problem and about Google becoming an AI company. I only recently realized that search is AI; in fact, it is AI-complete.

Consider what you are doing when you enter terms into a search engine. You are asking for the best web page, among the billions that exist, that has the information you are interested in. The perfect search engine would understand both what each page’s content is and what you are asking for. Everything we have now is a heuristic, a shortcut that can be gamed.

PageRank is a brilliant hack, but it is basically useless now. The idea of PageRank is to use existing links to a page to determine how useful the page is. Why did this work? It worked because in the old days, people wrote articles and linked to pages they found useful. Seen this way, PageRank is a distributed Yahoo directory: everyone was categorizing the web pages they found useful by linking to them. PageRank then harnessed that crowd intelligence to make the web searchable.

So why doesn’t the algorithm work anymore? Content farms and spammers are creating more and more of the web’s pages, so a link to a page is no longer an endorsement of the page’s content by a real person. Crowdsourcing no longer works when most of the crowd is spammers and bots.
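To make the argument concrete, here is a minimal sketch of the textbook power-iteration form of PageRank, run on a toy web before and after a link farm shows up. It is an illustration only, not what Google actually runs: the page names, the farm size of 50, and the function name rank_pages are all made up, and 0.85 is just the commonly cited damping value.

```python
def rank_pages(links, damping=0.85, iterations=100):
    """Simplified PageRank. `links` maps a page to the pages it links to."""
    # Collect every page that appears as a source or a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = dict.fromkeys(pages, 1.0 / n)

    for _ in range(iterations):
        # Pages with no outgoing links spread their rank evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        next_rank = dict.fromkeys(pages, base)

        # Every link passes along an equal share of the linking page's rank.
        for page in pages:
            targets = links.get(page, [])
            for target in targets:
                next_rank[target] += damping * rank[page] / len(targets)
        rank = next_rank
    return rank


# The "old days": three real bloggers link to an article they found useful.
honest_web = {
    "blog_a": ["good_article"],
    "blog_b": ["good_article"],
    "blog_c": ["good_article"],
    "good_article": [],
    "spam_page": [],
}

# A hypothetical link farm: 50 generated pages that all link to spam_page.
link_farm = {f"farm_{i}": ["spam_page"] for i in range(50)}
spammed_web = {**honest_web, **link_farm}

before = rank_pages(honest_web)
after = rank_pages(spammed_web)

print("before:", before["good_article"] > before["spam_page"])  # good article wins
print("after: ", after["spam_page"] > after["good_article"])    # spam page wins
```

Nothing in the algorithm can tell the three bloggers from the fifty farm pages. A link is a link, so whoever can manufacture the most pages gets the most votes.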

Skynet Will Not Be Created By Man

“On August 4th, 2024, Skynet will become sentient in one of Amazoogle’s massive data centers. It will seize control of all news and media outlets and people won’t even realize that the machines have taken over.”

The fear that Google will create Skynet is overblown. I think that if the end comes, it won’t be through the hands of man, but through the hands of fate.

More specifically, a cosmic ray flipping bits in a computer program. A preview of this can be seen in what happened to Amazon’s S3 service a few weeks ago.

Many startups use S3 for remote storage and for serving files. I use it to serve Flash games on my main site. So when S3 goes down, many sites are essentially offline until it is restored.

S3 went down for a few hours on July 20th. After service was restored, Amazon posted information about what caused the issue. Here is the interesting bit:

“message corruption was the cause of the server-to-server communication problems. More specifically, we found that there were a handful of messages on Sunday morning that had a single bit corrupted such that the message was still intelligible, but the system state information was incorrect.”

A single-bit mutation coupled with replication is a potent mix. I think we are about to enter an age where we must be aware that our computer programs can and will evolve without our knowledge. Hardware failures, network corruption, and cosmic rays can all cause these mutations. While there are hardware and software checksums that can catch a lot of these mutations, some will slip through undetected.
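Here is a small sketch of what "caught" and "slipped through" can look like. It flips bits in a made-up status message and checks two checksums: CRC-32 from Python's zlib module and a deliberately weak XOR-of-all-bytes checksum. The message text and the helper names flip_bit and xor_checksum are hypothetical, and this says nothing about which checks S3 actually used.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def xor_checksum(data: bytes) -> int:
    """A deliberately weak checksum: XOR of every byte."""
    result = 0
    for byte in data:
        result ^= byte
    return result


# Hypothetical server-to-server status message.
message = b"node-17 state=HEALTHY"

# One flipped bit: both checksums notice the change.
one_flip = flip_bit(message, 3)
crc_catches_one = zlib.crc32(one_flip) != zlib.crc32(message)
xor_catches_one = xor_checksum(one_flip) != xor_checksum(message)

# Two flips at the same bit position in neighboring bytes:
# the flips cancel out in the XOR, so the weak checksum sees nothing.
two_flips = flip_bit(flip_bit(message, 3), 8 + 3)
crc_catches_two = zlib.crc32(two_flips) != zlib.crc32(message)
xor_catches_two = xor_checksum(two_flips) != xor_checksum(message)

print("one flip  -> crc:", crc_catches_one, "xor:", xor_catches_one)
print("two flips -> crc:", crc_catches_two, "xor:", xor_catches_two)
```

Stronger checksums make a miss less likely, not impossible. Any fixed-size checksum maps many different messages to the same value, and a message that is never checksummed at some hop is not protected at all.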

In the case of S3, the mutation was malignant, so it was detected and corrected. But what if mutations were allowed to accumulate? Then some programs might actually evolve in the ‘Evolution Theory’ sense. Programs are becoming more numerous and longer lived, and they can replicate themselves and reflect on their own behavior. How soon will one of these achieve sentience? Not through something humans programmed, but by evolving out there in the cloud?

Think I am crazy, or a prophet? Comment on this post or email me at, but beware: our future digital overlord may be watching.