An AI Chip to Help Computers Understand Images

A powerful approach to artificial intelligence could be coming to smartphones. Researchers from Purdue University are working to commercialize designs for a chip to help mobile processors make use of the AI method known as deep learning. Although the power of deep learning has inspired companies including Google, Facebook, and Baidu to invest in the technology, so far it has been limited to large clusters of high-powered computers.
 
When Google developed software that learned to recognize cats from YouTube videos, the experiment required 16,000 processors (see “Self-Taught Software”).
 
Being able to implement deep learning in more compact and power-efficient ways could lead to smartphones and other mobile devices that can understand the content of images and video, says Eugenio Culurciello, a professor at Purdue working on the project. In December, at the Neural Information Processing Systems conference in Nevada, the group demonstrated that a co-processor connected to a conventional smartphone processor could help it run deep learning software.
 
The software was able to detect faces or label parts of a street scene. The co-processor design was implemented on an FPGA, a reconfigurable chip that lets researchers try out a new hardware design without the considerable expense of fabricating a completely new chip.
 
The prototype is much less powerful than systems like Google’s cat detector, but it shows how new forms of hardware could make it possible to use the power of deep learning more widely. “There’s a need for this,” says Culurciello. “You probably have a collection of several thousand images that you never look at again, and we don’t have a good technology to analyze all this content.”
 
Devices such as Google Glass could also benefit from the ability to understand the abundant pictures and videos they are capturing, he says. A person’s images and videos might be searchable using text—“red car” or “sunny day with Mom,” for example. Likewise, novel apps could be developed that take action when they recognize particular people, objects, or scenes.
 
Deep learning software works by filtering data through a hierarchical, multilayered network of simulated neurons that are individually simple but can exhibit complex behavior when linked together (see “Deep Learning”). Conventional processors run those networks inefficiently because the computation they demand, vast amounts of simple arithmetic performed in parallel, is very different from the workloads conventional software presents.
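To make that concrete, here is a minimal sketch, in plain NumPy rather than anything resembling the Purdue hardware, of data being filtered through a small multilayered network. The layer sizes and random weights are placeholders for illustration; a real system would learn its weights from training data.

```python
import numpy as np

def relu(x):
    # Each simulated neuron applies a simple nonlinearity to the
    # weighted sum of its inputs.
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)

# A tiny three-layer network: each layer is just a weight matrix.
# The random weights are illustrative; real systems learn them from data.
layer_sizes = [256, 128, 64, 10]  # input -> hidden -> hidden -> output
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    # Filter the input through the hierarchy, one layer at a time.
    for w in weights:
        x = relu(x @ w)
    return x

scores = forward(rng.standard_normal(256))  # e.g., a flattened image patch
print(scores.shape)  # (10,) -- one score per candidate label
```

Nearly all of the work here is in the matrix multiplications: long runs of multiply-and-add operations that a general-purpose processor executes a small batch at a time. That mismatch is what specialized hardware aims to eliminate.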
 
Purdue’s co-processor design is specialized above all to run multilayered neural networks, and to put them to work on streaming imagery. In tests, the prototype has proven about 15 times as efficient as a graphics processor at the same task, and Culurciello believes further improvements could make it 10 times more efficient than it is now.
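The workload such a chip targets can be sketched as the same small filters swept across every incoming frame. The snippet below is not the Purdue design, only an illustration of that streaming pattern; the edge-detection kernel and the 64-by-64 frames are arbitrary stand-ins.

```python
import numpy as np

def conv2d(frame, kernel):
    # Slide a small filter across the frame -- the operation that
    # dominates neural networks applied to imagery.
    kh, kw = kernel.shape
    h, w = frame.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    return out

# A classic 3x3 edge-detection filter, standing in for learned weights.
edge_kernel = np.array([[-1.0, 0.0, 1.0],
                        [-2.0, 0.0, 2.0],
                        [-1.0, 0.0, 1.0]])

# Streaming loop: the identical computation repeats on every frame,
# so fixed-function hardware can keep the weights on-chip and simply
# stream pixels through.
for frame in (np.random.rand(64, 64) for _ in range(3)):  # stand-ins for camera frames
    features = conv2d(frame, edge_kernel)
```

Because the filter weights never change from one frame to the next, a chip built around this pattern can spend its energy on arithmetic rather than on the instruction fetching and data shuffling that occupy a general-purpose processor.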