Navigating a city like Manhattan is relatively straightforward, given that all the streets run parallel or at right angles, but all bets are off when it comes to more organic cities like Cambridge, which tend to have more curves and constraints.
MIT’s maze-like Stata Center is a prime example of where the “Manhattan Assumption,” which assumes all planes are parallel or oriented in the same direction, can’t be applied, says Julian Straub, a second-year MIT graduate student. So to make it easier for robots navigate non-standard landscapes, Straub developed an algorithm to identify the major orientations in 3D scenes.
“I wanted to make robots smarter about perceiving their environment and help them organize the things they see,” explained Straub, a candidate for a PhD in computer science with a specialization in artificial intelligence.
Almost all of the newer and more advanced robots have sensors that the algorithm can use to find ‘Manhattan frames,’ he said.
To create his algorithm, Straub used Microsoft Kinect sensors. “The algorithm we came up with infers the number of Manhattan frames based on what the Kinect sensor sees—or what kind of data you get from it,” he said.
For example, he ran the algorithm on a point cloud (a dataset of points in a coordinate system) of Cambridge near Kendall Square, and it inferred three major directions: Boston, Harvard, and the Charles River shoreline.
“If I look at the point cloud I see two: Harvard and Boston,” Straub said. “It’s cool that the algorithm found the one leading to the Charles because even though there are buildings nearby, as a human observer, it was hard to pick them out from the point cloud.”
In June, Straub and his co-authors and advisors, John W. Fisher III, Guy Rosman, Oren Freifeld, and John J. Leonard, will present a paper they wrote on the algorithm at the IEEE Conference on Computer Vision and Pattern Recognition. After that he will work on building models on top of the Manhattan frames that will allow for a higher level of reasoning about the content of scenes beyond just directions.
“We have all these awesome robots out there, but they’re not doing anything in our environment because it is very complex and dynamic and there’s a lot of uncertainty,” he said. “Even if it’s just vacuuming or cleaning your house…better perception will help.”