State of Artificial Intelligence in 2017

A Stanford-led group of leading AI thinkers called the AI100 has launched an index that will provide a comprehensive baseline on the state of AI and measure technological progress in the same way the gross domestic product and the S&P 500 index track the U.S. economy and the broader stock market. They have produced a 101-page AI Index 2017 report.

This report aggregates a diverse set of data, makes that data accessible, and includes discussion about what is provided and what is missing. Most importantly, the AI Index 2017 Report is a starting point for the conversation about rigorously measuring activity and progress in AI in the future.

* The number of AI papers produced each year has increased by more than 9x since 1996.

* Introductory AI class enrollment at Stanford has increased 11x since 1996.

* AI conference attendance numbers show that research focus has shifted from symbolic reasoning to machine learning and deep learning.

* Despite shifting focus, there is still a smaller research community making steady progress on symbolic reasoning methods in AI.

The report also tracks the performance of AI systems on the object detection task in the Large Scale Visual Recognition Challenge (LSVRC) competition.

Error rates for image labeling have fallen from 28.5% to below 2.5% since 2010.
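To put the drop from 28.5% to below 2.5% in perspective, it helps to separate the absolute change (in percentage points) from the relative change. The short sketch below does that arithmetic. The function name `error_rate_change` is made up for illustration, and because the report says only "below 2.5%", using 2.5 as the end value gives a lower bound on both reductions.

```python
from typing import Tuple


def error_rate_change(start_pct: float, end_pct: float) -> Tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop in percent).

    Hypothetical helper for illustration; not part of the AI Index report.
    """
    absolute_drop = start_pct - end_pct
    relative_drop = absolute_drop / start_pct * 100.0
    return absolute_drop, relative_drop


# 28.5% is the 2010 figure; 2.5% is an upper bound on the latest figure,
# so both results below are lower bounds on the real improvement.
absolute, relative = error_rate_change(28.5, 2.5)
print(f"at least {absolute:.1f} percentage points")
print(f"at least {relative:.1f}% relative reduction")
```

In other words, the error rate fell by at least 26 percentage points, which means more than nine out of every ten of the original errors were eliminated.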

Despite the difficulty of comparing human and AI systems, it is interesting to catalog credible claims that computers have reached or exceeded human-level performance. Still, it is important to remember that these achievements say nothing about the ability of these systems to generalize. The report's list of such claims also contains many game-playing achievements. Games provide a relatively simple, controlled, experimental environment and so are often used for AI research.

Tracking areas that have traditionally lacked concrete measurements may also facilitate a more sober assessment of AI progress. Progress is typically tracked consistently only where good progress has already been made. As a result, this report may present an overly optimistic picture.

Indeed, chatbot dialog falls far short of human dialog, and we lack widely accepted benchmarks for progress in this area. Similarly, while today’s AI systems have far less common sense reasoning ability than a five-year-old child, it is unclear how to quantify this as a technical metric. Expanding the coverage of the report may help correct for this optimistic bias. Additionally, any effort to develop effective reporting metrics in one of these more difficult areas may be a contribution in itself that spurs further progress in the area.