Bossa Nova Robotics’ co-founder and CTO Sarjoun Skaff was so eager to get started working on a practical application for robots that he started his own company two years before getting his PhD from Carnegie Mellon in 2007.
Since then, engineers at Bossa Nova have been refining the performance of the company’s inventory-taking robots, which Skaff says has been an incredibly intellectually stimulating journey, noting that “Advanced robotics technology is a very complex and multi-faceted problem that you only get to solve by doing it in real life and at scale.”
Fierce Electronics interviewed Skaff about the technology behind the company’s autonomous robots, which earned the company a place on Forbes' 2019 list of most promising AI companies. Skaff also shared key lessons he and his team learned along the way.
FE: Your autonomous robots are designed to assist human workers with inventory management, including the identification of out-of-stock items and things like misplaced products. But isn’t that what bar codes are supposed to do?
Skaff: If you think about the logistics of getting a product on the shelf at a retail store, it goes through probably a dozen touch points, all of which require scanning that bar code to keep track of its whereabouts. You also have a flood of products coming in all at one time, and limited resources, so for retailers it becomes a race against the clock to get that inventory on the shelf in the allocated time.
And when people are rushing, process is the first thing they let go of. A worker might skip scanning an item, which introduces an error in the system. Individually that error may be small, but then multiply that small error by a missed scan happening over and over again, and you wind up with a significant inventory control problem. But it is interesting that you mention bar codes, which happen to be one of our secret weapons.
FE: How so?
Skaff: In short, we can use the bar code to infer the identity of a product and that gives us a huge advantage with data collection. You can imagine the size of the image library and the processing involved if we had to identify the product by its packaging.
That being said, we capture more than 20 Gb of data per aisle [we conventionally call one pass on one side of the aisle ‘an aisle’], which is an enormous amount of data! It would be unreasonable to be sending all of this data to the cloud without doing some kind of first pass on it locally.
So, we use full-resolution images only for reading the bar codes (or shelf labels), and we need that resolution because of the thin lines. Working in combination, we use AI for scene understanding and for detecting the presence or absence of the product on the shelf.
We use onboard processing to decode the barcode images at full resolution, then compress, stitch, and send the images to the cloud for AI processing. We are able to stitch in real time, so as the robot exits an aisle, the entire panorama is already stitched and being uploaded to the cloud for AI processing. Cloud AI processing includes product detection and recognition, and the outcome is a set of data such as which products are out of stock.
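The division of labor Skaff describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Bossa Nova's implementation: the `Frame` type, the function names, the downsampling factor, and the side-by-side stitch are all hypothetical stand-ins, and the bar code "decoder" simply returns labels attached to the test data. For simplicity the sketch stitches once at the end of the aisle rather than incrementally as the robot moves.

```python
from dataclasses import dataclass
from typing import List, Tuple

# All names below are hypothetical; none come from the interview.


@dataclass
class Frame:
    pixels: List[List[int]]  # full-resolution grayscale image, row-major
    barcodes: List[str]      # stand-in for what a real decoder would find


def decode_barcodes(frame: Frame) -> List[str]:
    """Placeholder for an onboard decoder that runs on the full-resolution image."""
    return list(frame.barcodes)


def downsample(pixels: List[List[int]], factor: int) -> List[List[int]]:
    """Crude compression stand-in: keep every `factor`-th pixel along each axis."""
    return [row[::factor] for row in pixels[::factor]]


def stitch(tiles: List[List[List[int]]]) -> List[List[int]]:
    """Naive stitch: place equally tall tiles side by side, with no overlap handling."""
    if not tiles:
        return []
    height = len(tiles[0])
    return [[px for tile in tiles for px in tile[r]] for r in range(height)]


def process_aisle(frames: List[Frame], factor: int = 4) -> Tuple[List[str], List[List[int]]]:
    """Decode locally at full resolution, then keep only reduced images for upload."""
    labels: List[str] = []
    tiles: List[List[List[int]]] = []
    for frame in frames:
        labels.extend(decode_barcodes(frame))           # thin lines need full resolution
        tiles.append(downsample(frame.pixels, factor))  # reduced copy for the panorama
    return labels, stitch(tiles)                        # the pair that would be uploaded
```

The point of the ordering is that the expensive full-resolution data never has to leave the robot: only the decoded labels and a much smaller panorama would go to the cloud for product detection.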
FE: What other technology is on board the robot?
Skaff: We have more than 15 cameras just looking at the shelf, and we have more cameras looking around the robot for safety and remote operation. We actually have 3 CPUs and GPUs onboard the robot—so much that we jokingly refer to the robot as a “roving data center.” We have a GPU that can do the AI compute as needed. But we prefer to do compute in the cloud as the cloud can be elastic and cost effective.
For navigation, we augment 2D images with 3D images so we can be more precise at detecting obstacles. Incidentally, we also use 3D images of the shelf, as they give us better detail for detecting the absence or presence of a product.
FE: Which is harder – detecting people or products?
Skaff: Two years ago we acquired a company called HawXeye, which is a spinoff of the CMU Biometrics Lab. Through that acquisition we acquired talent and IP that is used for computer vision and facial recognition. What we discovered when we were doing our diligence was that you basically detect products in much the same way you detect faces. You have many of the same challenges, say when a product is not facing forward on the shelf. The thing is that if you can solve for one you can solve for the other. We deployed the technology for product detection and it is working superbly: we have been able to reach parity with the accuracy with which a human can recognize an image.
FE: How challenging is navigation when a robot is sharing the same space with people?
Skaff: There is more to navigation than just avoiding obstacles; you also have to consider the machine’s interaction with people. So, we pepper our controllers with policies that are human-centric. In other words, the robot behaves in a way that signals it understands the scene. So, for example, if you jump in front of the robot, not only will it stop abruptly, but it will take a step back, signaling, “It’s ok, you don’t have to worry about me bumping into you.”
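A human-centric policy of this kind could be expressed as a small rule layered on top of ordinary obstacle avoidance. The sketch below is only an illustration of the stop-then-step-back behavior Skaff describes; the function name, the command vocabulary, and the distances are assumptions, not details from the interview.

```python
from typing import List, Tuple

Command = Tuple[str, float]  # (action, distance in metres); hypothetical format


def react_to_person(distance_m: float,
                    stop_distance_m: float = 1.0,
                    step_back_m: float = 0.3) -> List[Command]:
    """Return the command sequence for a person detected `distance_m` ahead."""
    if distance_m >= stop_distance_m:
        # Nobody is close enough to matter: keep driving.
        return [("continue", 0.0)]
    # Someone stepped into the robot's path: stop first, then back away a
    # short distance to signal that the robot has noticed them, then wait.
    return [("stop", 0.0), ("reverse", step_back_m), ("wait", 0.0)]
```

The retreat adds nothing to collision avoidance, since the robot has already stopped. Its purpose in this sketch is purely communicative, which is the distinction Skaff draws between avoiding obstacles and interacting with people.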
FE: What are some of the unexpected things you’ve learned during this journey?
Skaff: Autonomous robots are a very complex and multifaceted problem that you only get to solve by doing it in real life and at scale. We did a lot of testing, first at 5 and then at 50 stores, and at that point we thought things were under control. But when we expanded to 100 stores, we started to see infrequent, surprising things crop up. Even with a PhD in robotics, it turns out that there are very mundane, unexpected things that you encounter that frankly can defeat the smartest navigation algorithm out there.
For example, we had a problem with shopping carts being carelessly discarded in a store, which is typical if you think about it. But in this case some of the carts were blocking access to the robot’s charging station. It’s a disaster for productivity if the robots cannot get charged in a timely fashion. So we came up with a notification strategy whereby the robot calls a team member to come and push the shopping cart out of its way. Maybe in the future we will come up with a way for the robot itself to gently push the shopping cart aside.
Another example involved obstacle detection. When it comes to autonomous navigation, our common understanding of obstacles is that these are things that come from the ground, and we use 3D sensors and infrared lasers to detect them. But in some stores in Florida with skylights, we discovered that the sunlight coming from above was blinding our sensors and causing the robot to interpret the patch of sunlight as a hole in the ground, bringing it to a stop. So, again, we had to develop algorithms to filter out sunlight.
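The interview does not say how the sunlight filter works. One plausible approach, sketched here purely as an assumption, is to treat a missing depth reading as a hole only when the ambient infrared level at that spot is low enough that the sensor should have been able to see the floor. The function name, the normalized intensity scale, and the saturation threshold are all hypothetical.

```python
from typing import List, Optional


def find_holes(depth_m: List[Optional[float]],
               ambient_ir: List[float],
               saturation: float = 0.9) -> List[int]:
    """Return indices of floor cells that look like genuine holes.

    depth_m    -- depth reading per floor cell; None means no usable return
    ambient_ir -- ambient infrared level per cell, normalized to 0..1
    saturation -- level above which the sensor is assumed to be blinded
    """
    holes = []
    for i, (depth, ir) in enumerate(zip(depth_m, ambient_ir)):
        # A missing return under washed-out lighting is attributed to
        # sunlight rather than to a drop in the floor.
        if depth is None and ir < saturation:
            holes.append(i)
    return holes
```

A rule this simple would also ignore a real drop-off that happened to sit in a sunlit patch, so a deployed system would presumably need a second source of evidence before discarding a reading.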
One particular thing we did not grasp beforehand is the rich diversity of fixtures and product presentations that you can encounter in a store. The creativity of merchandisers to attract shoppers is so great that it throws AI researchers for a loop. We’re literally in an arms race with the marketers as they continually dream up promotional displays galore. We’ve seen them place shelves at an angle or put a label on a bar code label smack in the center of the shelf. There are so many ways to display products that just when we think we have seen them all, they come up with something else, all to increase sales by a fraction of a percent. And you are not going to mess with that!
FE: Was there one big “Aha!” moment for you?
Skaff: I don’t even know where to start, because there are so many things that we did not expect and that we learned from. I would have to say that one of the biggest revelations was people’s response to robots. In the beginning, we were very concerned that people would reject robots because they would see them as threatening or stealers of jobs. And it was the exact opposite: We were pleasantly surprised to see how workers embraced them as tools that augment their productivity. Similarly, we were surprised to see that shoppers could not care less about seeing autonomous robots moving around the aisles of a store.
Sarjoun Skaff is participating in a panel discussion on AI in Autonomous Technologies and the diabolical challenges of image classification, on Wednesday, August 12, 2020 at 11:30 am EST during Fierce AI Week, a free virtual event. Visit the event website for more details and to register for this event running August 10-12, 2020.