Technology

Search is AI or why PageRank has failed

Much has been written about Google’s spam problem and about Google becoming an AI company. I recently realized that search is AI; in fact, it is AI-complete.

Consider what you are doing when you enter terms into a search engine. You are asking for the best web page, among the billions that exist, that has the information you are interested in. The perfect search engine would understand what the page’s content is and what you are asking for. Everything we have now is a heuristic, a shortcut, that can be gamed.

PageRank is a brilliant hack, but it is basically useless now. The idea of PageRank is to use existing links to a page to determine how useful the page is. Why did this work? This worked because in the old days, people wrote articles and put links in articles that they found useful. When looked at this way, PageRank is a distributed Yahoo directory. Everyone was categorizing web pages they found useful by linking to them. PageRank then harnessed the crowd intelligence to make it searchable.
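To make the idea concrete, here is a minimal sketch of the textbook power-iteration form of PageRank in Ruby. It is not Google’s production ranking; the link graph, the page names and the 0.85 damping factor (a commonly quoted value) are all assumptions for illustration.

```ruby
# Textbook PageRank by power iteration over a tiny hand-made link graph.
# `links` maps each page to the pages it links to (hypothetical data).
def pagerank(links, damping: 0.85, iterations: 50)
  pages = links.keys
  n = pages.size.to_f
  ranks = pages.to_h { |page| [page, 1.0 / n] }

  iterations.times do
    # Every page starts each round with the "random jump" share.
    next_ranks = pages.to_h { |page| [page, (1.0 - damping) / n] }

    links.each do |page, outgoing|
      # A page with no outgoing links spreads its rank over all pages.
      targets = outgoing.empty? ? pages : outgoing
      share = damping * ranks[page] / targets.size
      targets.each { |target| next_ranks[target] += share }
    end

    ranks = next_ranks
  end

  ranks
end

links = {
  'home'  => ['about', 'blog'],
  'about' => ['home'],
  'blog'  => ['home', 'about'],
  'spam'  => ['home'] # links out, but nobody links to it
}

pagerank(links).sort_by { |_, rank| -rank }.each do |page, rank|
  puts format('%-6s %.3f', page, rank)
end
```

In this toy graph the “spam” page ranks last only because nobody links to it. The score is driven entirely by who links to whom.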

So why doesn’t the algorithm work anymore? Content farms and spammers are creating more and more of the web’s pages, so a link to a page is no longer an endorsement of the page’s content by a real person. Crowdsourcing no longer works when most of the crowd are spammers and bots.

Creating a “Word Lens” Like Engine in N Steps

Augmented reality is a hot topic right now, especially with the iPhone’s hardware capabilities finally catching up to people’s imagination. I recently saw the “Word Lens” app in the App Store, and its demo video really got me excited about this space.

In the next series of posts, I’ll document my attempt to create a framework that can do what Word Lens is doing. I’ve been meaning to add video and audio processing capabilities to Cloud Browse for a while now, so this will be a good excuse for me to dig into how to access the camera in iOS.

Overall Plan

Word Lens uses OCR to detect letters in video frames, then it removes them and replaces the letters with translated text while matching the location and size of the replaced text.

Abstracting away the details, you have these basic steps:

  1. Capture video frame data
  2. Detect features
  3. Remove unwanted features
  4. Render augmentation

What is impressive about Word Lens is that they do all of that on the iPhone itself. That is convenient for the user, but right now we just care about duplicating the functionality, so we’ll be doing most of our work on the desktop. Apparently they spent 2 1/2 years creating their technology; we will cobble together off-the-shelf open source technologies to get the desired effect quickly.

Here is my plan:

  1. Capture video on iPhone and send to server
  2. Detect letters using Tesseract-ocr
  3. Remove letters using techniques found in Patch Match or Resynthesizer
  4. Translate using Google Translate
  5. Render translated word using ImageMagick
  6. Send image back to be displayed on iPhone
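Before wiring up any real tools, the server side of the plan can be sketched as a chain of stages. This is only a skeleton under my own assumptions: every stage body below is a placeholder, the frame is a plain Hash, and names like STAGES, DICTIONARY and process_frame are made up for this sketch. The real stages would call Tesseract, an inpainting routine, a translation service and ImageMagick.

```ruby
# Placeholder dictionary standing in for a translation service.
DICTIONARY = { 'hola' => 'hello' }.freeze

# Each stage takes a frame (a Hash here) and returns an updated copy.
# All stage bodies are stand-ins for the real tools named in the plan.
STAGES = {
  detect_letters: ->(frame) { frame.merge(words: [{ text: 'hola', box: [10, 20, 80, 30] }]) },
  remove_letters: ->(frame) { frame.merge(letters_removed: true) },
  translate: ->(frame) {
    frame.merge(words: frame[:words].map { |w| w.merge(translated: DICTIONARY.fetch(w[:text], w[:text])) })
  },
  render: ->(frame) {
    # Pair each translated word with the box of the text it replaces.
    frame.merge(rendered: frame[:words].map { |w| [w[:translated], w[:box]] })
  }
}.freeze

# Run a captured frame through every stage in order.
def process_frame(frame, stages = STAGES)
  stages.values.reduce(frame) { |current, stage| stage.call(current) }
end

p process_frame({ id: 1 })
```

Keeping each stage as a separate callable means any one of them can be swapped out later, which matters because the letter-removal step is the one most likely to change.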

The hardest step seems to be removing the letters, since there isn’t a ready-made library for doing this. We’ll tackle that step last. The first step will be to get an image to the server and run OCR on it. That will be the subject of the next post.

Variable Variables for Ruby

Envious of PHP’s variable variables feature? Now you too can use it in your Ruby code. Try pasting this into irb.

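class VariableVariable

  # Strip the inherited methods (keeping __send__ and friends) so that
  # nearly every call falls through to method_missing below.
  instance_methods.each { |m| undef_method m unless m =~ /^__/ }

  # name is the variable name as a String, bind is the Binding to look it up in.
  def initialize(name, bind)
    @variable = name
    @bind = bind
  end

  # Unary ~ dereferences one level: evaluate the stored name in the binding.
  # If the result is itself a String, wrap it so it can be dereferenced again.
  def ~
    v = eval(@variable, @bind)
    if v.instance_of? String
      VariableVariable.new(v, @bind)
    else
      v
    end
  end

  # Everything else is forwarded to the stored name, so inspecting the
  # object in irb shows the name it currently holds.
  def method_missing(symbol, *args, &block)
    @variable.__send__(symbol, *args, &block)
  end

end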

a = VariableVariable.new('b', binding)
c = 'a'
b = 'c'
a
~a
~~a
~~~a
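If you are on Ruby 2.1 or later, a less magical way to get a similar effect is Binding#local_variable_get. This is a small sketch rather than a replacement for the class above; dereference and demo are names I made up for it.

```ruby
# Follow a chain of variable names `depth` levels deep inside a binding.
def dereference(name, bind, depth = 1)
  depth.times { name = bind.local_variable_get(name.to_sym) }
  name
end

def demo
  a = 'b'
  b = 'c'
  c = 42
  # One, two and three levels of indirection starting from the name 'a'.
  [dereference('a', binding), dereference('a', binding, 2), dereference('a', binding, 3)]
end

p demo
```

It has no operator trickery, but it also does not undefine anything, so it is easier to reason about.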

Section 3.3.1 is Good News to Native iPhone Developers

There have been many reactions to the recent iPhone Developer Program License Agreement change, from apologetic, to mocking, to stunned silence.

I am not going to fight for any ideological stance, or try to figure out why Apple made the change. It is done and won’t change anytime soon, so what remains is for smart people to adapt to the situation and seek opportunity from change.

One thing is for sure: Objective-C and Cocoa skills are more valuable now than before the change. Adobe Flash CS5 was poised to bring hundreds of thousands of Flash developers pouring into the App Store. Whether they would have brought tons of crappy ports with them is immaterial. Their arrival would have changed the landscape of the App Store, especially in the games area, and that would have been bad news for all iPhone game developers.

Of all the fury and bluster on the web right now, how much comes from people who have developed iPhone applications? How many of them use native development, i.e. C/C++/ObjC? The native iPhone developers’ situation has only improved, not diminished, with this change. So the people yelling right now are yelling at Apple for rewarding those who helped build the iPhone platform, and for not letting people who sat on the sidelines looking to make a quick buck into the party.

Another effect of Section 3.3.1 is that the iPhone version of an app will become the primary target of any cross-platform app. Here is an idea for the enterprising people out there: make a Cocoa to Flash compiler, or a Cocoa to Android compiler. Apple just created this market for you to fill. You can rail against the sky and cry why, or you can make the best of it and create a product that people will want. The choice is yours.

Empathy for the Daemon, or are you a good simulator?

Game designer Sid Meier, creator of the “Civilization” series of video games, recently gave a talk called “The Psychology of Game Design (Everything You Know Is Wrong).” His main thesis was that game designers have to understand the player’s psychology when designing gameplay elements, and he gave some examples from his games that violated player expectations.

Now, speaking from a programmer’s perspective, one of the prerequisites of being a good programmer is the ability to read code and understand the effects of its execution. Without this basic skill, coding becomes a matter of trial and error. Bad code will confound even a good programmer’s ability to understand it, and that is one of the code smells that I look for. Quite simply, programmers must be able to simulate a computer in their heads. Some programmers can even simulate a computer down to the hardware level, to optimize registers and cache lines.

So how does this relate to Meier’s talk? Well, what he is saying is basically that game designers need to simulate the player in their heads. Each gameplay mechanic and every decision about the game that designers make must take into account how it will affect the player. But to be able to simulate the player, you must first have a model of how a player behaves. That’s why understanding player psychology is so important.

Generalizing even further, this ability to simulate is fundamental to being human. It is also called empathy. Empathy is the ability to share another person’s feelings and emotions. This can happen because we place ourselves in the shoes of another person. Some studies have shown that our neuron firing actually mirrors another person’s from just observing their emotions.

So work on improving your empathy, whether it’s for the players of your game, the users of your website, or the executors of your code.

Cloud Computing is Just Rebranding. And That’s a Good Thing.

Larry Ellison recently ranted about cloud computing at an Oracle stockholders’ meeting.

[youtube]http://www.youtube.com/watch?v=8UYa6gQC14o[/youtube]

Someone asked if Oracle will be in trouble since there is a big push towards cloud computing. His answer is that cloud computing is what computers and the internet have always been. The technology is the same, just called different things: SaaS, on-demand software and now cloud computing.

He is right, of course. Cloud computing still uses databases, CPUs, memory, hard drives and routers. It isn’t some new technology. But that’s also like saying Ajax wasn’t new because it is just XMLHttpRequest.

Cloud computing is important because it names a user experience, much like Ajax. Technology is something computer programmers and technology geeks care about, but users just care about “what does this mean to me?” What cloud computing means to consumers is “let someone else take care of the technical details, you just use it to get your job done.”

I would like to think that the fact that cloud computing is catching on is indicative of a trend for technical people to think more in terms of the ends rather than the means.

Here is an example of someone thinking about the technology rather than the experience: “No wireless. Less space than a nomad. Lame.” Of course, the iPod went on to dominate the MP3 player market. Don’t be that guy.

How iPad Multitasking Might Work

There has been much complaining on the web about how the iPad doesn’t support multitasking, and little analysis of why Apple doesn’t allow it. Adam C. Engst wrote up a great analysis of the types of benefits that multitasking can provide and what technology is required to achieve those benefits.

I’d like to approach the question of multitasking from the user’s perspective and not from a technology standpoint. How would allowing third-party background processes impact the user experience?

Games Don’t Want Multitasking

Gaming is the largest category of applications for the iPhone. I come from a video game background, so let’s compare how game consoles approach multitasking with how the iPhone does.

Game consoles don’t allow third-party background processes, period. Even first-party background processes are limited. The Xbox 360’s dashboard and the PS3’s XMB are the interfaces for accessing system functionality while playing a game. Both systems also reserve a small amount of memory and CPU to run system functionality.

PS3 XrossMediaBar

The reason consoles go to such extremes to limit multitasking is to give games consistent hardware performance. Games typically push the performance of the system to the limit. When a game updates at 60 frames per second, a small decrease in performance will drop the frame rate to 30, which is immediately noticeable by the player.
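The jump from 60 straight to 30, rather than to 59, comes from vertical sync: a frame that misses one display refresh has to wait for the next one. Here is a rough sketch of that arithmetic, assuming a 60 Hz display with double-buffered vsync (my assumption, not something every game does); effective_fps is a made-up helper name.

```ruby
# Effective frame rate when every frame must wait for the next display
# refresh. Assumption: double-buffered vsync on a 60 Hz display.
def effective_fps(frame_time_ms, refresh_hz: 60)
  refresh_interval_ms = 1000.0 / refresh_hz
  # A frame that misses a refresh is held until the following one.
  intervals_needed = [(frame_time_ms / refresh_interval_ms).ceil, 1].max
  refresh_hz.to_f / intervals_needed
end

[15.0, 16.0, 17.0, 25.0, 34.0].each do |ms|
  puts format('%.1f ms per frame -> %.0f fps', ms, effective_fps(ms))
end
```

At 60 Hz the budget is about 16.7 ms per frame, so going from 16 ms to 17 ms of work is enough to halve the displayed frame rate.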

Now imagine a game developer writing games for the iPad. It would be impossible to squeeze every ounce of performance out of it if at any point a random third-party application could wreck your performance and turn your game from running smoothly to slowing to a crawl.

For this reason alone, I don’t think any iPad application should expect to be running in the background at all times.

Multitasking on iPad

Believe it or not, there exists an API that manages multitasking on the iPhone. It is called Audio Sessions. An application can claim exclusive access to hardware audio codecs and stop the iPod application from running in the background by setting the appropriate audio session category.

I think this could be a glimpse into how Apple will implement multitasking on the iPad. Apple can define a set of categories for multitasking, and each application will set its category to the most appropriate one.

Here are some categories which I think might be useful:

  • Multitask Session Default – allow other applications to run in the background with the current application. Quit the current application when the home screen is brought up.
  • Multitask Session Background – allow other applications to run in the background, and keep running when another application starts; maybe quit if an exclusive application starts.
  • Multitask Session Background Resume – Same as background, except restart the application in the background when possible, i.e. after an exclusive application stops.
  • Multitask Session Exclusive – Do not allow other applications to run in the background. Quit the current application when home screen is brought up.
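To check that these four categories hang together, here is a toy model of them in Ruby. Everything in it is hypothetical: App, MultitaskModel and the category symbols are my own names for the proposal above and do not exist in any Apple API. I also assume the “maybe quit” in the Background category means the application does quit when an exclusive application starts.

```ruby
# Toy model of the proposed categories; none of these names are real APIs.
App = Struct.new(:name, :category)

class MultitaskModel
  attr_reader :foreground, :background, :waiting

  def initialize
    @foreground = nil
    @background = [] # apps currently running in the background
    @waiting = []    # "background resume" apps stopped by an exclusive app
  end

  def launch(app)
    leave_foreground
    @background.delete(app)
    @waiting.delete(app)
    @foreground = app
    return unless app.category == :exclusive

    # An exclusive app stops everything else; only resumable apps are remembered.
    @waiting += @background.select { |a| a.category == :background_resume }
    @background.clear
  end

  def home_pressed
    leave_foreground
    @foreground = nil
  end

  private

  def keeps_running?(app)
    [:background, :background_resume].include?(app.category)
  end

  # Move the current app to the background or quit it, depending on category.
  def leave_foreground
    return if @foreground.nil?

    was_exclusive = @foreground.category == :exclusive
    @background << @foreground if keeps_running?(@foreground)
    return unless was_exclusive

    # Once an exclusive app stops, resumable apps restart in the background.
    @background.concat(@waiting)
    @waiting.clear
  end
end

model = MultitaskModel.new
model.launch(App.new('Music', :background_resume))
model.launch(App.new('Chat', :background))
model.launch(App.new('Game', :exclusive))
model.home_pressed
p model.background.map(&:name)
```

In this run the chat application is gone for good after the game starts, while the music application comes back once the game quits.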

By defining the multitasking behavior through the API, Apple will still have complete control over the user experience, without exposing the complexity of managing processes to the users.

The Apple Way

I am just conjecturing how Apple might add multitasking to the iPad. I don’t have any sources inside Apple. However, I believe that Apple will provide the best possible user experience on the iPad. If they determine that multitasking will degrade the overall user experience, then they will not add it.

Like the Cut-and-Paste complaints of a few years ago, Apple will wait until it finds the best possible way of introducing a functionality before adding it. That is the Apple way.

Using WordPress to Build a Website – Part 2

Introduction

In part 1, I wrote about how to create a static home page for the top level of your website, with all of your blog-related pages under the “/blog” path. This article is about building out the rest of your site using static pages and adding a navigation menu by selecting the right theme.

Before moving on, you should add a few more pages to your site, something like this.

  • About
  • Contact
  • Sitemap
  • Products
    • Subproduct 1
    • Subproduct 2

Of course you should change the names of the pages to suit your needs. Make sure Products is a parent page to Subproduct 1 and Subproduct 2 by setting the Parent attribute of the sub pages.
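If it helps to picture what the Parent attribute does, here is a small Ruby sketch of how a flat list of pages with parents turns into a nested menu. It only illustrates the idea; it is not how WordPress stores or renders pages, and Page, PAGES and menu_lines are names invented for the example.

```ruby
# Each page records only its title and the title of its parent (nil = top level).
Page = Struct.new(:title, :parent)

PAGES = [
  Page.new('About', nil),
  Page.new('Contact', nil),
  Page.new('Sitemap', nil),
  Page.new('Products', nil),
  Page.new('Subproduct 1', 'Products'),
  Page.new('Subproduct 2', 'Products')
].freeze

# Walk the hierarchy depth-first, indenting children under their parent.
def menu_lines(pages, parent = nil, depth = 0)
  pages.select { |page| page.parent == parent }.flat_map do |page|
    ["#{'  ' * depth}#{page.title}"] + menu_lines(pages, page.title, depth + 1)
  end
end

puts menu_lines(PAGES)
```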

Finding a Theme

Now that you have some content on your site, it is time to find a theme that you like and that displays the site navigation the way you require.

The important part of selecting a theme is to make sure it uses the page hierarchy as navigation items instead of the post categories. So when previewing the theme, make sure the navigation looks like this:

Page Hierarchy as Navigation

Instead of this:

Post Category as Navigation

You want to provide easy navigation to your pages first, and your blog second. Another thing to look for is a drop down menu that displays the hierarchy of pages.

CSS Dropdown Menu Items

It is much easier to ignore a theme’s drop down menu feature than to add one later by hacking CSS into the theme you selected.

Managing Your Navigation Menu

Now that your navigation menu is tied to your page hierarchy, you should use a plugin to manage it easily. I use a plugin called pageMash. It allows you to hide pages you don’t want to display in the navigation menu, as well as drag and drop to modify the page hierarchy.

Managing the Navigation Menu

In addition to managing the page hierarchy, you also need to manage the label of the page in the navigation menu vs the title of the page that’s displayed when you view the page. My about page’s actual title is “About AlwaysOn Technologies” but the navigation label is just “About”.

This ability is added by another plugin, All in One SEO Pack. This plugin has more features than I can get to in this article; you can dig deeper into it in tutorials. All we are going to do for now is set the Title, which will be the label displayed on the menu.

Setting the Menu Label

Adding a Contact Page

To make contacting you easier, you need a contact page where users can send a message directly without having to fire up an email client. Thankfully, there are plenty of plugins that will create a contact form for you. The one I used is Contact Form 7, which uses markup to create the form, so it is very flexible.

By now, you should have a functional website that users can actually navigate and use to contact you. The next article will be about sending your site information to search engines and leveraging social media.

Using WordPress to Build a Website – Part 1

Introduction

WordPress is a popular and well supported blogging platform and I’ve been using it for years for blogging. Now I am using it to build up a small website.

I am getting ready to launch an iPhone application called Cloud Browse and need a web presence to promote my company and the product. Since I am going to blog on this new site, using WordPress to do everything makes a lot of sense.

There are a lot of articles on the web talking about how to use WordPress as a Content Management System (CMS), but most of them are jumbled lists of tips and plugin suggestions.

I am going to write step by step how I got my site up and running, along with the plugins I used and why I chose them. My site is very simple, so hopefully it will help others get started before diving deeper with other articles.

The Basic Steps

Disclaimer: I am using WordPress 2.9.1, so if you are using an older version, the steps I am using may not work for you.

A plain WordPress install already includes all you need to get a simple site up and running. Assuming you want a home page at http://example.com, and the blog at http://example.com/blog, just follow the steps below. I’ll describe each step in detail afterward.

  1. Add two pages called “Home” and “Blog”

    Add Pages

  2. Go to Dashboard > Settings > Reading and set the Front Page Display to “A static page” with Front Page as “Home” and Post Page as “Blog”

    Setting Front Page to a Static Page

  3. Go to Dashboard > Settings > Permalinks, change Common settings to “Custom Structure” with the value “/blog/%year%/%monthnum%/%day%/%postname%/” and set Optional > Category Base to “blog/topics” and Optional > Tag Base to “blog/tags”

    Move Blog Permalinks to a Custom Path

If you’ve been following the steps, now you should have a basic web site displaying the “Home” page at the top level, and a regular blog under “/blog.” Read on to dive into what each of the steps actually did and how to customize each step.

The Gory Details

  1. The Home and Blog pages are required for step 2. The title of the home page isn’t important, but the content of the page is what gets displayed at the top level of the site. The title of the Blog page is important since it determines the path the blog shows up under. More specifically, it is the slug of the page that determines the path. So if you want your blog to show up under “/diary” then make sure you name your page “Diary.”
  2. Front page displays is the important setting that allows the blog section of your site to be separate from the rest of the pages. WordPress will show the Front page as the top level content and the list of blog postings when viewing the Posts page. WordPress won’t even display the contents of the Posts page, so don’t bother putting any content in it.
  3. Using a custom permalink structure is important for two reasons. One is so users can see the title of the post within the url, which is also important for search engine ranking, and the second is so all blog posts show up under “/blog”. You can customize the structure, but be sure to start with “/blog.” Setting the Category base and Tag base is again to keep all blog related urls under “/blog.” If you don’t set this, the default url would look like “http://example.com/categories/a-category.”
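To see what the custom structure from step 3 produces, here is a rough Ruby sketch that expands the tags for one post. It is an illustration only: slugify is my own approximation of how a title becomes a slug (the real WordPress rules are more involved), and the post date is made up.

```ruby
require 'date'

# Rough stand-in for turning a title into a slug; an approximation only.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
end

# Expand the structure tags used in step 3 for a single post.
def permalink(structure, title:, date:)
  structure
    .gsub('%year%', date.strftime('%Y'))
    .gsub('%monthnum%', date.strftime('%m'))
    .gsub('%day%', date.strftime('%d'))
    .gsub('%postname%', slugify(title))
end

structure = '/blog/%year%/%monthnum%/%day%/%postname%/'
puts permalink(structure, title: 'Using WordPress to Build a Website', date: Date.new(2010, 2, 3))
puts "/#{slugify('Diary')}"
```

The second line printed shows the same slug idea applied to a page title, which is why naming the page “Diary” moves the blog to “/diary”.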

Now you’ve got enough knowledge to start a basic site using WordPress. The next article will focus on creating a site structure by adding more pages and setting up the navigation menu.

Learning from Alan Kay, 11 years later

I first saw this video on reddit about a year ago, a famous talk by Alan Kay given about 11 years ago. I’ve watched it many times and understood it a bit more each time.
[googlevideo]http://video.google.com/videoplay?docid=-2950949730059754521[/googlevideo]

The first time I watched it, I found his proposal that every object on the internet should have a URI and/or an IP address interesting (43:00). This was totally opposite of what I was doing at the time. We were near the end of the development cycle on BioShock and memory optimization was in full swing. Objects in the Unreal engine are pretty heavyweight, so we spent a lot of time stripping out unnecessary variables and functionality from the base UObject class.

Since last year, I’ve learned a bit more about web technologies and now see that what Alan Kay was talking about is realized by REST-style web services. A little later in the talk (52:30), Alan expounded a bit more on interoperability on the web and said “[object capability discovery] is going to be the critical thing to automate in the next ten years,” which “allows an interchange of deep information about what objects think they can do.”

This has obviously not happened. Anyone who has to write against web services like Amazon, Facebook or Flickr knows a lot of trial and error is needed to make anything work; at least that’s been my experience trying to use their REST APIs. Alan proposed creating a “universal interface language” to exchange this information, which sounds a little like WSDL, but what about automatic discovery of services? I haven’t found a protocol for REST introspection like there is for XML-RPC.

If the strength of REST is its simplicity and low barrier to entry, then perhaps introspection and automatic discovery would add too much complexity. Since most web services provide client libraries for most popular languages (PHP/Java/Python/Ruby), all this REST vs XML-RPC vs SOAP can be hidden from the programmer.

Yet I can’t help but think about the last advice Alan gave in the talk, “play your system more grand than it is right now.” Wouldn’t it be grand if a client lib for an object could be automatically created given its URL? Wouldn’t it be grand if an application that uploads to Flickr could upload to Smugmug just by changing the URL?
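Here is a toy sketch of what that could look like in Ruby. It is entirely hypothetical: the description format, build_client, the example.com-style hosts and the fake transport are all invented for illustration, and no real service publishes a description like this as far as I know. In a real version the description would be fetched from the service’s URL instead of being written by hand.

```ruby
# Build a client object from a service description. The description format
# and the transport are hypothetical stand-ins for real discovery and HTTP.
def build_client(base_url, description, transport)
  client = Object.new
  description.each do |method_name, spec|
    client.define_singleton_method(method_name) do |**params|
      missing = spec[:required] - params.keys
      raise ArgumentError, "missing #{missing.join(', ')}" unless missing.empty?

      transport.call(spec[:verb], "#{base_url}#{spec[:path]}", params)
    end
  end
  client
end

# What a photo service might say about itself (made up for this sketch).
photo_service = {
  upload: { verb: :post, path: '/photos', required: [:title, :data] },
  list:   { verb: :get,  path: '/photos', required: [] }
}

# Stand-in transport that reports the request instead of sending it.
fake_transport = ->(verb, url, params) { [verb, url, params.keys] }

service_a = build_client('http://photos-a.example', photo_service, fake_transport)
service_b = build_client('http://photos-b.example', photo_service, fake_transport)

p service_a.upload(title: 'Sunset', data: 'raw bytes here')
p service_b.upload(title: 'Sunset', data: 'raw bytes here')
```

The calling code is identical for both services; only the URL passed to build_client changes.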