Better AI research depends on benchmarks, Intel guru says

The head of Intel’s Neuromorphic Computing Lab wants a standard set of academic and industry benchmarks to track research progress in neuromorphic and AI work. He offered up a few starting points toward such benchmarks in a recent scholarly article in Nature Machine Intelligence.

It’s a big ask, but not all that different from what goes on in most of science: Why go to the moon? Why split the atom? Why build autonomous vehicles? Why even have cars; aren’t horses good enough? For Pete’s sake, why invent the wheel? What really is the point of an internet? And so on…

For AI and neuromorphic computing, the big questions might well be: What’s the value or purpose of a computer that can beat somebody at chess or Texas Hold’em or Jeopardy or Go?

And, better: Isn’t neuromorphic research supposed to create a computer that invents its own version of Go (which probably no human would even try to beat)?  

Even better, consider the songwriter. Would a neuromorphic algorithm ever be creative enough to write the music and lyrics of a country song, then perform it through a speech engine with the right vocal twang and drive a robot to play guitar with the right moves and facial gestures?

Scientists are always asking such things, not only to get funding but to point the global body of research in the right direction, or at least a shared one. That’s where a set of benchmarks comes in.

“Efficient progress [in neuromorphic computing] critically depends on embracing the right benchmarks and measuring progress in order to provide the clearest possible view of the road ahead,” wrote Intel’s Mike Davies in the recent Nature article.

“The neuromorphic field needs to focus more on principles and rigor, less on open-ended exploration and mapping speculative mechanistic features to silicon,” he added. “Steady progress to real-world value depends on quantitative metrics, discipline and informed prioritization.”

Davies is discussing his thoughts about benchmarking at an Intel workshop in Graz, Austria, this month.

In his Nature piece, Davies noted that there isn’t yet even a standardized language for neuromorphic programming to play the role that C plays for SPECint, a benchmark suite for measuring CPU integer performance. Speaking of SPECint, he suggested neuromorphic computing needs a comprehensive suite of algorithms analogous to SPECint or MLPerf, a group of benchmarks for measuring the performance of machine learning hardware, software and services.

He suggested the industry needs a suite of benchmarks that he termed SpikeMark to “evaluate the relative features, flexibility, performance and efficiency of different neuromorphic platforms. It would include both applications suitable for real-world use as well as representative algorithmic primitives with minimal standalone value.” 

What’s more, he came up with a list of 10 possible workloads for neuromorphic systems that could become a standardized way to measure and test them.

His ideas include detecting hand gestures from a standard camera dataset, solving Sudoku and map-coloring problems, and controlling a modeled robotic arm in a nonlinear way. (Sadly, there was no mention of having a computer write a good country song and perform it through a robot.)
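To make the flavor of these workloads concrete, here is a minimal sketch of the map-coloring task framed as constraint satisfaction and solved with a simple min-conflicts stochastic search. The toy map, the color set and the algorithm choice are illustrative assumptions, not taken from Davies’ paper; neuromorphic chips would typically attack the same constraints with stochastic spiking dynamics rather than a CPU loop like this one.

```python
# Toy map-coloring benchmark kernel: assign colors to regions so that no
# two bordering regions match, via min-conflicts stochastic local search.
# The map, colors and solver here are illustrative assumptions only.
import random

# Hypothetical map: each region lists the regions it borders.
BORDERS = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
COLORS = ["red", "green", "blue"]

def conflicts(region, color, assignment):
    """Count neighbors currently sharing this color."""
    return sum(1 for n in BORDERS[region] if assignment.get(n) == color)

def min_conflicts(max_steps=10_000, seed=0):
    rng = random.Random(seed)
    # Start from a random coloring, then repair conflicts one at a time.
    assignment = {r: rng.choice(COLORS) for r in BORDERS}
    for _ in range(max_steps):
        bad = [r for r in BORDERS if conflicts(r, assignment[r], assignment)]
        if not bad:
            return assignment  # No conflicts left: a valid coloring.
        region = rng.choice(bad)
        # Move the conflicted region to its least-conflicted color,
        # breaking ties randomly (the stochastic step).
        assignment[region] = min(
            COLORS,
            key=lambda c: (conflicts(region, c, assignment), rng.random()),
        )
    return None  # Gave up; in practice, retry with another seed.

print(min_conflicts())
```

One appeal of such problems for benchmarking is that a solution is trivial to verify even when the search itself is randomized, so different platforms can be compared on the time and energy needed to reach a checkable answer.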

Davies acknowledged in his Nature piece that the field of neuromorphic computing is still in its infancy and faces criticism for offering empty promises. The use of comprehensive benchmarks “has the potential to move neuromorphic solutions from unsubstantiated promise to mainstream technology,” he wrote.

Beyond benchmark tests, Davies suggested that neuromorphic systems might take on grander challenges, such as playing foosball, “which requires quick predictive responses to erratic ball motion, a good match for emerging event-based cameras with neuromorphic processing.” A competitive neuromorphic agent also needs to anticipate an opponent’s moves with reference to winning strategies, Davies added.
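For readers unfamiliar with event-based cameras: unlike a frame camera, a DVS-style sensor emits a sparse stream of (x, y, timestamp, polarity) events whenever a pixel’s brightness changes, which is why erratic ball motion suits them. The sketch below, with an invented resolution, window length and sample events, shows one common first step: binning events into a per-pixel image that a spiking network could consume. It is an assumption-laden illustration, not code from Davies’ paper or the foosball challenge.

```python
# Minimal sketch of consuming a DVS-style event stream: bin sparse
# (x, y, timestamp, polarity) events into per-pixel net counts over a
# short time window. Resolution, window and events are invented here.
import numpy as np

WIDTH, HEIGHT = 64, 48   # assumed sensor resolution
WINDOW_US = 10_000       # 10 ms accumulation window, in microseconds

def accumulate(events, t_start):
    """Sum event polarities per pixel within one time window."""
    frame = np.zeros((HEIGHT, WIDTH), dtype=np.int32)
    for x, y, t, polarity in events:
        if t_start <= t < t_start + WINDOW_US:
            frame[y, x] += 1 if polarity else -1  # on-event vs off-event
    return frame

# A few hand-made events suggesting a bright edge drifting rightward.
events = [(10, 20, 1_000, 1), (11, 20, 3_000, 1),
          (10, 20, 3_500, 0), (12, 20, 6_000, 1)]
print("net events at row 20:", accumulate(events, t_start=0)[20, 10:13])
```

A predictive foosball agent would run this kind of windowing, or feed events to spiking neurons directly, fast enough to extrapolate the ball’s path before it reaches the next rod.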

He noted that Western Sydney University already has such a foosball challenge underway, and several groups are pursuing it.

Even so, Davies doesn’t see playing foosball as the grandest challenge for AI. He reserves that for systems that exhibit “artificial creativity and generative intelligence” and are “capable of synthesizing new knowledge by recognizing novel relationships between diverse semantic entities, a hallmark of human intelligence.”

Getting to that ambitious level will be “large and costly,” Davies said, but first comes benchmarking and finding ways to measure progress.
