Take a photo, get an extra dimension FREE!
August 19, 2010
In 2009, I had a problem.
I was in charge of designing software for a robot that would have to evade other robots on a playing field for 15 seconds. The problem: I didn’t know what the other robots would look like.
For a robot, that was a serious issue. Computers are pretty good at finding a specific object in an image, but they have trouble with more abstract tasks like “identify and route around any obstacles shown in this image”. For humans (and most other predatory animals), that comes effortlessly thanks to our binocular vision, which lets the brain represent everything we see as a 3D model. You don’t need to know exactly what something is or how large it is to avoid walking into it, because your brain tells you there’s an obstacle in your path.
But a camera transmits only a flat image, with no 3D information. This means that for a computer to plot an object’s location in 3D space, it needs to know some characteristics of the object (things like a distinctive color or shape) as well as its size. Since an object’s apparent size shrinks the farther away it is, the system can be calibrated to get a sort of depth perception this way. In fact, we had used this exact method in the robot’s targeting system. But when the shape, size, and color of the object are unknown, this method fails.
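The size-based calibration boils down to the pinhole camera model: apparent size is inversely proportional to distance. Here’s a minimal sketch of the idea; the focal length and target width are made-up numbers for illustration, since real values would come from calibrating against objects at known distances:

```python
# Size-based ranging under the pinhole camera model:
# distance = (real size * focal length in pixels) / apparent size in pixels.
# Both the focal length and the target's real width are assumed known.

def estimate_distance(real_width_m, pixel_width, focal_length_px):
    """Distance to an object of known size, from its apparent size in pixels."""
    return real_width_m * focal_length_px / pixel_width

# A 0.5 m wide target spanning 100 px, with a 600 px focal length:
print(estimate_distance(0.5, 100, 600.0))  # 3.0 (meters)
```

The catch, as the next paragraph shows, is that `real_width_m` has to be known in advance, which is exactly what fails for an unfamiliar object.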
Ultimately, we got the system to work, sort of. It turned out that robots in the FIRST competition usually had features (banners, warning lights, etc.) with unusually high color saturation (hue, saturation, and brightness are the three values that together define any visible color). By looking for high-saturation objects, we could find the robots. But the depth problem remained: our robot couldn’t tell the difference between a robot five feet away with a small warning light and one 50 feet away with a huge orange banner.
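The saturation trick can be sketched in a few lines. This is not the robot’s actual code; it’s a hypothetical stand-in that works on a plain list of RGB tuples instead of camera frames, and the 0.8 threshold is an arbitrary choice:

```python
import colorsys

# Flag pixels whose HSV saturation exceeds a threshold -- the idea behind
# finding banners and warning lights against dull pavement and metal.

def high_saturation_mask(pixels, threshold=0.8):
    """Return True for each 0-255 RGB pixel whose saturation beats threshold."""
    mask = []
    for r, g, b in pixels:
        _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        mask.append(s > threshold)
    return mask

# A vivid orange banner pixel passes; a gray pavement pixel does not.
print(high_saturation_mask([(255, 120, 0), (128, 128, 128)]))  # [True, False]
```

Note that this finds the robots but says nothing about how far away they are, which is exactly the gap described above.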
After the competition, the problem got put on the back burner as we all refocused on building a new robot that really sucked balls. But when I started working on the hackerhat, the problem took on a new relevance. It’s hard to make an augmented reality system work if it can’t understand the 3D environment.
And, as it turns out, when a camera takes an image, it doesn’t lose all of its 3D information. Here’s an experiment: go to Google Maps and find a street in satellite view. Then look at the same street in Street View (just look at the pavement, not the stuff around it).
The street as shown in Street View is full of randomness: there are potholes, median lines, patches of differently colored pavement, even different textures. But in the satellite view, the street is a featureless, uniform gray stripe. The reason is that the Street View camera is closer to the street, so, by the laws of perspective, more pixels are available to capture what the street looks like. More pixels means more detail.
So, can you analyze this information to find the distance from the camera? I decided to try. One of the fundamental techniques in computer vision is “blob detection”: breaking an image into discrete “blobs” by finding sharp transitions in color and luminosity. I decided to use a blob detector to count the number of blobs in each part of an image. Since blob detection fragments an image wherever a perceptible difference exists, parts of the image with more detail (which are presumably closer to the camera) should contain more blobs.
In other words, more blobs = closer to the camera.
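A toy version of that idea: segment a tiny grayscale grid into blobs (4-connected regions whose neighboring values differ by less than a threshold) and compare blob counts between a detailed patch and a flat one. The grids and the threshold here are invented for illustration; the real project ran a full blob detector over webcam frames:

```python
# Count connected regions of similar brightness via flood fill.
# A "blob" here is a 4-connected region where each pixel differs from
# its neighbor by less than the threshold.

def count_blobs(grid, threshold=30):
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    blobs = 0
    for y in range(h):
        for x in range(w):
            if seen[y][x]:
                continue
            blobs += 1                      # found the seed of a new blob
            stack = [(y, x)]
            seen[y][x] = True
            while stack:                    # flood-fill the rest of it
                cy, cx = stack.pop()
                for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and abs(grid[ny][nx] - grid[cy][cx]) < threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return blobs

# Busy, detailed texture (presumably close) vs. flat gray (presumably far):
near = [[10, 200, 40, 220], [180, 30, 210, 60]]
far = [[128, 128, 128, 128], [128, 128, 128, 128]]
print(count_blobs(near), count_blobs(far))  # the detailed patch yields more blobs
```

Applied per region of an image, the blob counts become a crude distance map, which is essentially what the screenshot below shows.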
After a few hours of coding, I got this:
The image on the right is a photo taken from my laptop’s webcam. On the left is an automatically generated “distance map” of the image, with bright orange representing a close object, and blue a far-away one. The software correctly identified the range to my head, the couch, the keyboard stand, and the beam in the center of the roof, but miscalculated the distance to the stairs(on the far right of the image).