Fixing the space bar and future bugs

Since the space bar problem has been affecting users for a couple of weeks, I’d like to address what the original bug is, why we encountered it, and what we are doing to prevent problems in the future.

The origin of the bug is actually within iOS. If a text area is too small, spaces can’t be entered into it. The OS thinks that a space has been entered even though the text area doesn’t reflect it. That’s why, in version 2.3.2 of the app, tapping space twice to activate the period shortcut would delete the previous character. The app uses a hidden text area to receive input from the keyboard and send it to the browser on our servers. The size of that text area was changed to 1×1 in 2.3.2, which triggered the bug.

Version 2.3.3 tried to fix the problem by adding a space to the text area in code. This worked for iPhone 4.1 and iPad 4.3, which are our normal test devices. Unfortunately, the fix doesn’t work on iPhone 4.2 or 4.3.

These two releases highlighted some holes in our testing procedure, which we are going to fix by instituting these steps:

  1. Testing all keyboard input. Previously, we only tested some letters, special symbols and Unicode input.
  2. Testing every iOS version on all test devices. Previously, we only tested each iOS version on one test device.
  3. Adding an iPod Touch test device.
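
The steps above amount to covering a full cross product of devices, iOS versions and kinds of keyboard input. Here is a minimal sketch of how such a test matrix could be enumerated. The device names, version numbers and key groups are illustrative assumptions, not our actual test plan, and in practice any device/version pairs that don’t exist would be filtered out.

```ruby
# Hypothetical test matrix; every list below is an illustrative assumption.
DEVICES      = ["iPhone", "iPad", "iPod Touch"].freeze
IOS_VERSIONS = ["4.1", "4.2", "4.3"].freeze
KEY_GROUPS   = [:letters, :digits, :symbols, :unicode, :space, :return, :delete].freeze

# Pair every device with every iOS version and every group of keys.
def test_matrix(devices: DEVICES, versions: IOS_VERSIONS, key_groups: KEY_GROUPS)
  devices.product(versions, key_groups).map do |device, version, keys|
    { device: device, ios: version, keys: keys }
  end
end

matrix = test_matrix
puts "#{matrix.size} combinations to cover"
matrix.first(3).each do |row|
  puts "#{row[:device]} / iOS #{row[:ios]} / #{row[:keys]}"
end
```

Listing the combinations explicitly makes it obvious when one of them, such as the space bar on a particular iOS version, has never been tried.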

We hope these steps will prevent situations like the space bar bug from happening in the future, and we apologize to the users who were affected. Thank you for being our customers and for being patient while we put out a new version.

Search is AI or why PageRank has failed

Much has been written about Google’s spam problem and about Google becoming an AI company. I recently realized that search is AI; in fact, it is AI-complete.

Consider what you are doing when you enter terms into a search engine. You are asking for the best web page, among the billions that exist, that has the information you are interested in. The perfect search engine would understand what the page’s content is and what you are asking for. Everything we have now is a heuristic, a shortcut, that can be gamed.

PageRank is a brilliant hack, but it is basically useless now. The idea of PageRank is to use existing links to a page to determine how useful the page is. Why did this work? This worked because in the old days, people wrote articles and put links in articles that they found useful. When looked at this way, PageRank is a distributed Yahoo directory. Everyone was categorizing web pages they found useful by linking to them. PageRank then harnessed the crowd intelligence to make it searchable.
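
To make the idea concrete, here is a minimal sketch of the simplified, textbook form of PageRank computed by power iteration. It is not Google’s production algorithm. The tiny example web, the damping factor of 0.85 (a commonly used value) and the fixed iteration count are all assumptions made for illustration.

```ruby
# Simplified PageRank by power iteration.
# links: Hash mapping each page to the Array of pages it links to.
def pagerank(links, damping: 0.85, iterations: 50)
  pages = (links.keys + links.values.flatten).uniq
  n = pages.size.to_f
  rank = pages.to_h { |page| [page, 1.0 / n] }

  iterations.times do
    # Every page starts each round with the "random jump" share.
    incoming = pages.to_h { |page| [page, (1.0 - damping) / n] }
    pages.each do |page|
      targets = links.fetch(page, [])
      # A page with no outgoing links spreads its rank evenly over all pages.
      targets = pages if targets.empty?
      share = damping * rank[page] / targets.size
      targets.each { |target| incoming[target] += share }
    end
    rank = incoming
  end
  rank
end

# Hypothetical mini web: three authors link to a tutorial they found useful.
web = {
  "blog_a"   => ["tutorial"],
  "blog_b"   => ["tutorial", "blog_a"],
  "forum"    => ["tutorial"],
  "tutorial" => ["blog_a"]
}

pagerank(web).sort_by { |_, score| -score }.each do |page, score|
  puts format("%-10s %.3f", page, score)
end
```

The page that the most authors chose to link to ends up with the highest score, which is the “distributed directory” effect described above.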

So why doesn’t the algorithm work anymore? Content farms and spammers are creating more and more of the web’s pages, so a link to a page is no longer an endorsement of the page’s content by a real person. Crowdsourcing no longer works when most of the crowd is spammers and bots.
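
A small sketch of the problem, using a plain inbound-link count as a crude stand-in for link-based ranking (it is not PageRank itself). The page names and the size of the farm are made up for illustration.

```ruby
# Count inbound links per page: a crude stand-in for link-based ranking.
def inbound_counts(links)
  counts = Hash.new(0)
  links.each_value do |targets|
    # Repeated links from the same page count only once.
    targets.uniq.each { |target| counts[target] += 1 }
  end
  counts
end

# Three human authors endorse a page they found useful.
honest_web = {
  "blog_a" => ["tutorial"],
  "blog_b" => ["tutorial"],
  "forum"  => ["tutorial"]
}

# A hypothetical content farm: 50 generated pages that all link to one target.
farm = (1..50).to_h { |i| ["farm_page_#{i}", ["spam_landing"]] }

counts = inbound_counts(honest_web.merge(farm))
puts "tutorial:     #{counts['tutorial']} inbound links"
puts "spam_landing: #{counts['spam_landing']} inbound links"
```

Generated pages cost almost nothing to produce, so the machine-made “endorsements” easily outnumber the human ones.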

Creating a “Word Lens” Like Engine in N Steps

Augmented reality is a hot topic right now, especially with the iPhone’s hardware capabilities finally catching up to people’s imagination. I recently saw the “Word Lens” app in the App Store, and its demo video really got me excited about this space.

In the next series of posts, I’ll document my attempt to create a framework that can do what Word Lens is doing. I’ve been meaning to add video and audio processing capabilities to Cloud Browse for a while now, so this will be a good excuse for me to dig into how to access the camera in iOS.

Overall Plan

Word Lens uses OCR to detect letters in video frames, then removes them and replaces them with translated text that matches the location and size of the original text.

Abstracting away the details, you have these basic steps:

  1. Capture video frame data
  2. Detect features
  3. Remove unwanted features
  4. Render augmentation

What is impressive about Word Lens is that they do all of that on the iPhone itself. That is convenient for the user, but we just care about duplicating the functionality right now, so we’ll be doing most of our work on the desktop. Apparently they spent 2 1/2 years creating their technology; we will use off-the-shelf open source technologies and cobble them together quickly to get the desired effect.

Here is my plan:

  1. Capture video on iPhone and send to server
  2. Detect letters using Tesseract-ocr
  3. Remove letters using techniques found in Patch Match or Resynthesizer
  4. Translate using Google Translate
  5. Render translated word using ImageMagick
  6. Send image back to be displayed on iPhone
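
The server-side part of this plan (steps 2 through 5) can be sketched as a chain of stages that each take the state of a frame and return an updated copy. Everything below is a placeholder: the stage bodies are hypothetical stubs rather than real Tesseract, Google Translate or ImageMagick calls, and `DICTIONARY` is a made-up stand-in for the translation service.

```ruby
# Made-up stand-in for a translation service.
DICTIONARY = { "hola" => "hello" }.freeze

# Each stage takes the frame state (a Hash) and returns an updated copy.
# Every stage body is a hypothetical stub, not a real integration.
STAGES = [
  [:detect_text, lambda { |frame|
    frame.merge(words: [{ text: "hola", box: [10, 20, 80, 30] }])
  }],
  [:remove_text, lambda { |frame| frame.merge(cleaned: true) }],
  [:translate, lambda { |frame|
    words = frame[:words].map do |word|
      word.merge(translated: DICTIONARY.fetch(word[:text], word[:text]))
    end
    frame.merge(words: words)
  }],
  [:render, lambda { |frame|
    labels = frame[:words].map do |word|
      "#{word[:translated]}@#{word[:box].first(2).join(',')}"
    end
    frame.merge(rendered: labels)
  }]
].freeze

# Run the frame through every stage in order, recording which stages ran.
def process_frame(frame, stages = STAGES)
  stages.reduce(frame) do |state, (name, stage)|
    stage.call(state).merge(log: state.fetch(:log, []) + [name])
  end
end

result = process_frame({ image: "frame_0001.png" })
puts result[:rendered]
puts result[:log].join(" -> ")
```

Keeping each step behind the same small interface means the stubs can be swapped for real tools one at a time, starting with OCR and leaving letter removal for last.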

The hardest step seems to be removing the letters, since there isn’t a ready-made library for doing this. We’ll tackle that step last. The first step will be getting an image to the server and running OCR on it. That will be the subject of the next post.

Variable Variables for Ruby

Envious of PHP’s variable variables feature? Now you too can use it in your Ruby code. Try pasting this into irb.

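class VariableVariable

  # Strip the inherited public methods (except __send__ and friends) so that
  # almost every call falls through to method_missing.
  instance_methods.each { |m| undef_method m unless m =~ /^__/ }

  # name: a variable name as a String; bind: the Binding to resolve it in.
  def initialize(name, bind)
    @variable = name
    @bind = bind
  end

  # Unary ~ dereferences one level: evaluate the stored name in the binding.
  # A String result is wrapped again so that ~ can be chained.
  def ~
    v = eval(@variable, @bind)
    if v.instance_of? String
      VariableVariable.new(v, @bind)
    else
      v
    end
  end

  # Everything else is forwarded to the stored variable name.
  def method_missing(symbol, *args, &block)
    @variable.__send__(symbol, *args, &block)
  end

end

a = VariableVariable.new('b', binding)
c = 'a'
b = 'c'
a     # wraps the name 'b'
~a    # evaluates b
~~a   # evaluates b, then c
~~~a  # evaluates b, then c, then a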
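
For comparison, here is a minimal sketch of the same idea without `eval` or operator tricks, using Ruby’s built-in `Binding#local_variable_get` (available since Ruby 2.1). The helper name `follow` and the sample variables are assumptions made for illustration; this is an alternative, not a replacement for the class above.

```ruby
# Follow a chain of variable names: look up `name` in the binding and treat
# the value as the next name, up to `hops` times. Stops early as soon as a
# value is not a String, mirroring the String check in the class above.
def follow(name, bind, hops)
  value = name
  hops.times do
    break unless value.is_a?(String)
    value = bind.local_variable_get(value.to_sym)
  end
  value
end

b = 'c'
c = 'a'
a = 42

puts follow('b', binding, 1)
puts follow('b', binding, 2)
puts follow('b', binding, 3)
```

Each extra hop plays the role of one more `~` in the original version.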

Section 3.3.1 is Good News to Native iPhone Developers

There have been many reactions to the recent iPhone Developer Program License Agreement change, from apologetic to mocking to stunned silence.

I am not going to fight for any ideological stance or try to figure out why Apple made the change. It is done and won’t change anytime soon, so what remains is for smart people to adapt to the situation and seek opportunity in the change.

One thing is for sure: Objective-C and Cocoa skills are more valuable now than before the change. Adobe Flash CS5 was poised to bring hundreds of thousands of Flash developers pouring into the App Store. Whether they would have brought tons of crappy ports with them is immaterial; it would have changed the landscape of the App Store, especially in the games area. This would have been bad news for all iPhone game developers.

Of all the people behind the fury and bluster on the web right now, how many have developed iPhone applications? How many of them use native development, i.e. C/C++/Objective-C? The situation of native iPhone developers has only improved, not diminished, with this change. So the people yelling right now are yelling at Apple for rewarding the people who helped build the iPhone platform and for not letting the people who sat on the sidelines, looking to make a quick buck, into the party.

Another effect of Section 3.3.1 is that the iPhone version of an app will become the primary target of any cross-platform app. Here is an idea for enterprising people out there: make a Cocoa to Flash compiler, or a Cocoa to Android compiler. Apple just created this market for you to fill. You can rail against the sky and cry why, or you can make the best of it and create a product that people will want. The choice is yours.

Empathy for the Daemon, or are you a good simulator?

Game designer Sid Meier, creator of the “Civilization” series of video games, recently gave a talk called “The Psychology of Game Design (Everything You Know Is Wrong).” His main thesis was that game designers have to understand the player’s psychology when designing gameplay elements, and he gave some examples from his games that violated player expectations.

Now, speaking from a programmer’s perspective, one of the prerequisites of being a good programmer is the ability to read code and understand the effects of its execution. Without this basic skill, coding becomes a matter of trial and error. Bad code will confound even a good programmer’s ability to understand it, and that is one of the code smells that I look for. Quite simply, programmers must be able to simulate a computer in their heads. Some programmers can even simulate a computer down to the hardware level, to optimize registers and cache lines.

So how does this relate to Meier’s talk? Well, what he is saying is basically that game designers need to simulate the player in their heads. Each gameplay mechanic and every decision about the game that designers make must take into account how it will affect the player. But to be able to simulate the player, you must first have a model of how a player behaves. That’s why understanding player psychology is so important.

Generalizing even further, this ability to simulate is fundamental to being human; it is also called empathy. Empathy is the ability to share another person’s feelings and emotions. This can happen because we place ourselves in the shoes of another person. Some studies have shown that our neuron firing actually mirrors another person’s when we merely observe their emotions.

So work on improving your empathy, whether it’s for the players of your game, the users of your website, or the executor of your code.

I Didn’t Join the Video Games Industry to Make Glorified Random Number Generators

I said that last night while discussing the state of Flash/Facebook games with my friend David. Cracked.com has a great article explaining some of the techniques video games use to keep players engaged.

Notice that I said engaged, not entertained. They are two different things. Games used to want players to have fun so they would eventually buy another game. If the goal was for players to buy another game from the same studio, it was called building a brand. If the goal was for players to buy a sequel, it was called building a franchise.

But that is a very risky thing, isn’t it? To always build new games and hope they do well? A brand can be trashed by a few not-so-great games; see Rare. Sequels can be milked to death; see Tomb Raider. So the new way to make games is to engage the players so they don’t want to stop playing. The players don’t have to enjoy the experience; they just need enough rewards to keep them coming back. Now you can charge users more and more money for the same game without having to risk making something new. You can tweak your reward delivery channel, aka game, to ensure maximum engagement and of course maximum profit.

Not all game companies think like that, of course, just the more profitable ones. The problem is that this way of viewing players is spreading. When engagement games grow, they take time and attention away from enjoyment games. To compete for the players’ time and attention, all games will adopt the same kinds of techniques to keep players playing.

How do you see the future of gaming? Will it be drowned out by the sounds of Pavlovian mouse clicks?

Cloud Computing is Just Rebranding. And That’s a Good Thing.

Larry Ellison recently ranted about cloud computing at an Oracle stockholders’ meeting.

[youtube]http://www.youtube.com/watch?v=8UYa6gQC14o[/youtube]

Someone asked if Oracle will be in trouble since there is a big push towards cloud computing. His answer was that cloud computing is what computers and the internet have always been. The technology is the same, just called different things: SaaS, on-demand software and now cloud computing.

He is right, of course. Cloud computing still uses databases, CPUs, memory, hard drives and routers. It isn’t some new technology. But that’s also like saying Ajax wasn’t new because it is just XMLHttpRequest.

Cloud computing is important because it names a user experience, much like Ajax. Technology is something computer programmers and technology geeks care about, but users just care about “what does this mean to me?” What cloud computing means to consumers is “let someone else take care of the technical details; you just use it to get your job done.”

I would like to think that the fact that cloud computing is catching on is indicative of a trend for technical people to think more in terms of the ends rather than the means.

Here is an example of someone thinking about the technology rather than the experience: “No wireless. Less space than a nomad. Lame.” Of course, the iPod went on to dominate the MP3 player market. Don’t be that guy.

How iPad Multitasking Might Work

There has been much complaining on the web about how the iPad doesn’t support multitasking, and little analysis of why Apple doesn’t allow it. Adam C. Engst wrote up a great analysis of the types of benefits that multitasking can provide and what technology is required to achieve those benefits.

I’d like to approach the question of multitasking from the user’s perspective and not from a technology standpoint. How would allowing third-party background processes impact the user experience?

Games Don’t Want Multitasking

Gaming is the largest category of applications for the iPhone. I come from a video game background, so let’s compare how game consoles approach multitasking with how the iPhone does.

Game consoles don’t allow third-party background processes, period. Even first-party background processes are limited. The Xbox 360’s dashboard and the PS3’s XMB are the interfaces for accessing system functionality while playing a game. Both systems also reserve a small amount of memory and CPU to run system functionality.

PS3 XrossMediaBar

The reason consoles go to such extremes to limit multitasking is to give games consistent hardware performance. Games typically push the performance of the system to the limit. When a game updates at 60 frames per second, a small decrease in performance can drop the frame rate to 30, which is immediately noticeable to the player.
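
The jump from 60 straight down to 30 is easiest to see with a little arithmetic. This sketch assumes the game waits for the next display refresh before showing each frame (vsync with double buffering on a 60 Hz display). That setup is an assumption made for illustration, and the helper name `effective_fps` is made up.

```ruby
# Effective frame rate when every frame has to wait for the next display
# refresh. A frame that misses one refresh is held until the following one.
def effective_fps(frame_ms, refresh_hz: 60)
  refresh_ms = 1000.0 / refresh_hz
  refreshes_per_frame = [(frame_ms / refresh_ms).ceil, 1].max
  refresh_hz.to_f / refreshes_per_frame
end

[16.0, 17.0, 34.0].each do |ms|
  puts format("%.1f ms per frame -> %.0f fps", ms, effective_fps(ms))
end
```

At 60 Hz the budget is about 16.7 ms per frame, so a frame that takes 17 ms misses its refresh and has to wait for the next one. Under these assumptions, a slowdown of a few percent is enough to halve the frame rate.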

Now imagine a game developer writing games for the iPad. It would be impossible to squeeze every ounce of performance out of it if at any point a random third-party application could wreck your performance and turn your game from running smoothly to slowing to a crawl.

For this reason alone, I don’t think any iPad application should expect to be running in the background at all times.

Multitasking on iPad

Believe it or not, there exists an API that manages multitasking on the iPhone. It is called Audio Sessions. An application can claim exclusive access to hardware audio codecs and stop the iPod application from running in the background by setting the appropriate audio session category.

I think this could be a glimpse into how Apple will implement multitasking on the iPad. Apple can define a set of categories for multitasking, and each application will set its category to the most appropriate one.

Here are some categories which I think might be useful:

  • Multitask Session Default – Allow other applications to run in the background with the current application. Quit the current application when the home screen is brought up.
  • Multitask Session Background – Allow other applications to run in the background, and keep running when another application starts; maybe quit if an exclusive application starts.
  • Multitask Session Background Resume – Same as Background, except restart the application in the background when possible, i.e. after an exclusive application stops.
  • Multitask Session Exclusive – Do not allow other applications to run in the background. Quit the current application when the home screen is brought up.
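
To see how these categories could interact, here is a toy simulation. It is purely a sketch of the conjecture above, not an Apple API. The class and field names are made up, and it assumes that an application quits whenever it leaves the foreground unless its category asks to keep running.

```ruby
# Hypothetical policy table for the four proposed categories.
CATEGORIES = {
  default:           { allows_background_apps: true,  keeps_running: false, resumes: false },
  background:        { allows_background_apps: true,  keeps_running: true,  resumes: false },
  background_resume: { allows_background_apps: true,  keeps_running: true,  resumes: true  },
  exclusive:         { allows_background_apps: false, keeps_running: false, resumes: false }
}.freeze

App = Struct.new(:name, :category)

class Device
  attr_reader :foreground, :background, :suspended

  def initialize
    @foreground = nil
    @background = []
    @suspended = [] # apps waiting to be restarted after an exclusive app stops
  end

  def launch(app)
    leave_foreground
    @foreground = app
    return if CATEGORIES.fetch(app.category)[:allows_background_apps]

    # An exclusive app stops everything else; remember who asked to resume.
    @suspended.concat(@background.select { |a| CATEGORIES.fetch(a.category)[:resumes] })
    @background.clear
  end

  def home
    leave_foreground
  end

  private

  def leave_foreground
    app = @foreground
    @foreground = nil
    return if app.nil?

    rules = CATEGORIES.fetch(app.category)
    @background << app if rules[:keeps_running]
    return if rules[:allows_background_apps]

    # An exclusive app just stopped, so restart the apps that asked to resume.
    @background.concat(@suspended)
    @suspended.clear
  end
end

device = Device.new
device.launch(App.new("music", :background_resume))
device.launch(App.new("chat", :background))
device.launch(App.new("game", :exclusive))
puts "while gaming: #{device.background.map(&:name).join(', ')}"
device.home
puts "after gaming: #{device.background.map(&:name).join(', ')}"
```

In this run the game gets the device to itself, and only the music application comes back afterwards, because it was the only one that asked to resume.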

By defining the multitasking behavior through the API, Apple will still have complete control over the user experience, without exposing the complexity of managing processes to the users.

The Apple Way

I am just conjecturing how Apple might add multitasking to the iPad. I don’t have any sources inside Apple. However, I believe that Apple will provide the best possible user experience on the iPad. If they determine that multitasking will degrade the overall user experience, then they will not add it.

As with the cut-and-paste complaints of a few years ago, Apple will wait until it finds the best possible way of introducing a feature before adding it. That is the Apple way.

Using WordPress to Build a Website – Part 2

Introduction

In part 1, I wrote about how to create a static home page for the top level of your website, with all of your blog-related pages under the “/blog” path. This article is about building out the rest of your site using static pages and adding a navigation menu by selecting the right theme.

Before moving on, you should add a few more pages to your site, something like this:

  • About
  • Contact
  • Sitemap
  • Products
    • Subproduct 1
    • Subproduct 2

Of course you should change the names of the pages to suit your needs. Make sure Products is a parent page to Subproduct 1 and Subproduct 2 by setting the Parent attribute of the sub pages.
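
To illustrate what the Parent attribute gives you, here is a small sketch (in Ruby, not WordPress code) that turns a flat list of pages with parents into a nested menu. The `Page` struct and the `menu_lines` helper are made up for this example.

```ruby
# Each page records the title of its parent page, or nil for top level.
Page = Struct.new(:title, :parent)

PAGES = [
  Page.new("About", nil),
  Page.new("Contact", nil),
  Page.new("Sitemap", nil),
  Page.new("Products", nil),
  Page.new("Subproduct 1", "Products"),
  Page.new("Subproduct 2", "Products")
].freeze

# Returns the menu as indented lines, with children nested under their parent.
def menu_lines(pages, parent = nil, depth = 0)
  pages.select { |page| page.parent == parent }.flat_map do |page|
    ["#{'  ' * depth}#{page.title}"] + menu_lines(pages, page.title, depth + 1)
  end
end

puts menu_lines(PAGES)
```

A theme that builds its navigation from the page hierarchy is doing essentially this walk for you.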

Finding a Theme

Now that you have some content on your site, it is time to find a theme that you like and that displays the site navigation the way you require.

The important part of selecting a theme is to make sure it uses the page hierarchy as navigation items instead of the post categories. So when previewing the theme, make sure the navigation looks like this:

Page Hierarchy as Navigation

Instead of this:

Post Category as Navigation

You want to provide easy navigation to your pages first, and your blog second. Another thing to look for is a drop-down menu to display the hierarchy of pages.

CSS Dropdown Menu Items

It is much easier to leave the drop-down menu feature unused than to add it later by hacking the CSS of the theme you selected.

Managing Your Navigation Menu

Now that your navigation menu is tied to your page hierarchy, you should use a plugin to manage it easily. I use a plugin called pageMash. It allows you to hide pages you don’t want to display in the navigation menu, as well as drag and drop to modify the page hierarchy.

Managing the Navigation Menu

In addition to managing the page hierarchy, you also need to manage the label of the page in the navigation menu vs the title of the page that’s displayed when you view the page. My about page’s actual title is “About AlwaysOn Technologies” but the navigation label is just “About”.

This ability is added by another plugin, All in One SEO Pack. This plugin has more features than I can get to in this article; you can dig deeper into it in tutorials. All we are going to do for now is set the Title, which will be the label displayed on the menu.

Setting the Menu Label

Adding a Contact Page

To make contacting you easier, you need a contact page where users can send a message directly without having to fire up an email client. Thankfully, there are plenty of plugins that will create a contact form for you. The one I used is Contact Form 7, which uses markup to create the form, so it is very flexible.

By now, you should have a functional website that users can actually navigate and use to contact you. The next article will be about sending your site information to search engines and leveraging social media.