Inside Google’s Plan to Build a Catalog of Every Single Thing, Ever

There’s a lot more to Google’s Knowledge Graph than might be apparent from what you see in a casual search.

The ugly truth is that computers don’t know anything. They have no common sense.

This idea had been circulating in Metaweb co-founder John Giannandrea’s head since 1997, when he was working at Netscape and thinking about how to reveal what you did not know you didn’t know on the web. If you were looking at search results for a hiking trail, say, what other hiking trails might you look at? Giannandrea called it “going sideways through the web,” and he loved the idea, even if he couldn’t execute it back then.

Years later, in 2005, Giannandrea teamed up with Danny Hillis and Robert Cook to cofound Metaweb, which had a simple premise: “What if we could make a catalog of all the stuff our computer should know?” Giannandrea told me in a recent interview. “We were interested in building a model of the world. Our computers are remarkably dumb about the stuff that we take for granted. You learn about stuff. You have some context for understanding. Our computers don’t work that way because we don’t have any loaded context.”

With remarkable confidence (hubris?), he and the other founders said to themselves, “Teaching computers all the discrete stuff in the world seems like it should be doable,” so they set out to make a machine-readable catalog of everything in the world.

Last month, their project was finally let loose into the wild as the Google Knowledge Graph, which you now see showing up in your search results on the right of your screen. But there’s a lot more to the creation of the Knowledge Graph than might be apparent from using it in casual searches.

This is one of those human knowledge projects that is ridiculous in scope and possibly in impact. And yet when it gets turned into a consumer product, all we see is a useful module for figuring out Tom Cruise’s height more quickly. In principle, this is both good and bad. It’s good because technology should serve human needs and we shouldn’t worship the technology itself. It’s bad because it’s easy to miss out on the importance of the infrastructure and ideology that are going to increasingly inform the way Google responds to search requests. And given that Google is many people’s default portal to the world of information, even a subtle change in the company’s toolset is worth considering.

And that’s how I found myself on the phone with John Giannandrea discussing mojitos and semantic graphs. “Take the drink called the mojito,” he said. “Mojito has ingredients and mint, rum, ice. We’ll create a catalog entry for that entity for that human concept ‘mojito’ and then we’ll create a connection between the mojito and its ingredients.” The key difference between their catalog and a standard database is that the connection between the mojito and mint is itself an entity, an entity that says, “This thing is an ingredient in this other thing.” The edge between the two nouns contains meaning and that makes all the difference. “We can talk about the representation of knowledge with the knowledge itself,” Giannandrea said. Whoa, Meta! I thought. Hence, Metaweb.
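To make the mojito example concrete, here is a minimal sketch in Python of a catalog in which the connection between two things is itself a catalog entry. It is an illustration only, not Metaweb’s or Google’s actual implementation: the class names (`Entity`, `Edge`, `Graph`) and the relation labels (“has ingredient,” “inverse of,” “ingredient in”) are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A catalog entry for a single concept (hypothetical structure)."""
    name: str


@dataclass(frozen=True)
class Edge:
    """A connection between two entities; the relation is an Entity too."""
    subject: Entity
    relation: Entity
    obj: Entity


class Graph:
    def __init__(self):
        self.entities = {}  # name -> Entity
        self.edges = []     # every connection, in insertion order

    def entity(self, name):
        # Create the catalog entry on first use, then reuse it.
        if name not in self.entities:
            self.entities[name] = Entity(name)
        return self.entities[name]

    def connect(self, subject, relation, obj):
        # The relation is looked up in the same catalog as the nouns.
        edge = Edge(self.entity(subject), self.entity(relation), self.entity(obj))
        self.edges.append(edge)
        return edge

    def objects(self, subject, relation):
        # Names of everything linked to `subject` by `relation`.
        return [
            e.obj.name
            for e in self.edges
            if e.subject.name == subject and e.relation.name == relation
        ]


graph = Graph()
for ingredient in ("mint", "rum", "ice"):
    graph.connect("mojito", "has ingredient", ingredient)

# Because "has ingredient" is itself an entry, we can describe it as well.
graph.connect("has ingredient", "inverse of", "ingredient in")

print(graph.objects("mojito", "has ingredient"))
print(graph.objects("has ingredient", "inverse of"))
```

The last `connect` call is the point of the exercise: in an ordinary table, “ingredient” would be a column name that the data cannot talk about, whereas here the relation sits in the same catalog as the mojito and the mint and can have facts attached to it.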

But there’s at least one problem. If you’re going to build a catalog of all common-sense things in the world, where do you start? The answer was simply, “Somewhere.” They added bodies of water and bridges, which go over bodies of water, and the highways that the bridges are a part of, and the lengths of those highways and the states through which the highways run, and the capitals of those states, and the populations of those capitals, and the population of the United States, and the population of every country in the world, and the dates on which those countries were founded, and so on and so forth and so on and so forth.