Human Compatible, book by Stuart Russell | Artificial Intelligence and the Problem of Control

Explains why the creation of a super-intelligent artificial intelligence could be humanity’s final act.

Stuart Russell is a professor of computer science at the University of California, Berkeley. A leading AI researcher, Russell has served as vice-chair of the World Economic Forum’s Council on AI and Robotics and as an advisor to the UN regarding arms control.

Rethink your fundamental assumptions about AI.

Artificial intelligence will be the defining technology of the future. Already, AI is rapidly pervading all levels of society: individuals willingly bring AI into their homes to help organize their daily lives, city councils and corporations employ AI to help optimize their services, and states use AI to undertake large-scale surveillance and social engineering campaigns. But as AI becomes more intelligent – and our social systems come to depend on it more and more – the threat presented by out-of-control AI becomes more dire.

The risks and downsides of new technologies are far too often left unexplored, as scientists and engineers fixate on their feverish quest to realize the utopias of the future. In fact, many AI experts and corporate high-ups even downplay the risks of AI out of fear of being more strictly regulated.

This book attempts to remedy this imbalance. The question of how to control AI and mitigate its more disastrous consequences is the biggest question facing humanity today, and it’s precisely this question we’ll explore.

We need several breakthroughs in software before AI surpasses human intelligence.

The most important breakthrough we need is in the comprehension of language. Most of today’s speech-recognition AI systems rely on canned responses and have trouble interpreting nuances in meaning. That’s why you get stories of smartphone personal assistants responding to the request ‘call me an ambulance’ with ‘ok, from now on, I’ll call you Ann Ambulance.’ Genuinely intelligent AI will need to interpret meaning based not just on the words said but on their context and tone as well.

We can never really say when conceptual breakthroughs will take place. But one thing is for sure – we shouldn’t underestimate human ingenuity.

Consider the following example. In 1933, the distinguished nuclear physicist Ernest Rutherford announced at a formal address that harnessing nuclear energy was impossible. The very next day, the Hungarian physicist Leó Szilárd outlined the neutron-induced nuclear chain reaction, essentially solving the problem.

We don’t yet know whether superintelligence – intelligence beyond human abilities – will emerge soon, later or not at all. But it’s still prudent to take precautions, just as it was when designing nuclear technology.

We’ve been operating under a misguided conception of intelligence.

In the current paradigm of AI design, an AI’s intelligence is measured simply by how well it can achieve a pre-given objective. The big flaw in this approach is that it’s extremely difficult to specify objectives that will make an AI behave the way we want it to. Pretty much any objective we come up with is liable to produce unpredictable, and potentially very harmful, behavior.

This problem is known as the King Midas problem, named after the fabled king who wished that everything he touched would turn to gold. What he didn’t realize was that this included the food he ate, and even his own family members. This ancient tale is a perfect example of how a poorly specified objective can end up causing more harm than good.

The danger from unpredictable behavior increases as AI becomes more intelligent and wields greater power. The consequences could even present an existential threat to humanity.

Instead of just intelligent machines, we should be designing beneficial machines.

There are three principles that designers should follow if they’re to make beneficial AI.


The first principle is that AI should only have one objective, which is the maximal fulfillment of human preferences. The author calls this the altruism principle. It ensures an AI will always place human preferences above its own.


The second principle is that the AI should initially be uncertain about what those preferences are. This is the humbleness principle. The idea here is that an uncertain AI will never fixate on a single objective, but will change its focus as new information comes in.


The third and final principle for making beneficial AI is that its ultimate source of information about human preferences should be human behavior. This is called the learning principle. It ensures that an AI will always remain in a direct and sustained relationship of learning with humans. It also means an AI will become more useful to a person over time as it gets to know her better.
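
To make the three principles concrete, here is a minimal Python sketch of how they might fit together. It is an illustration only, not a method taken from the book: the two preference hypotheses, the coffee-or-tea scenario, the softmax choice model and the `rationality` parameter are all assumptions made for this example. The machine starts with no opinion about what the person wants, updates its belief from each observed choice using Bayes’ rule, and acts only to maximize the person’s expected satisfaction.

```python
import math

# Hypothetical candidate preference profiles (assumed for illustration):
# the utility the person assigns to each option under each hypothesis.
HYPOTHESES = {
    "prefers_coffee": {"coffee": 1.0, "tea": 0.0},
    "prefers_tea": {"coffee": 0.0, "tea": 1.0},
}


def choice_likelihood(utilities, chosen, rationality=2.0):
    """Probability that a noisily rational person picks `chosen`.

    Assumes a softmax choice model; `rationality` is an illustrative knob.
    """
    weights = {opt: math.exp(rationality * u) for opt, u in utilities.items()}
    return weights[chosen] / sum(weights.values())


def update_belief(belief, chosen):
    """Bayes' rule: reweight each hypothesis by how well it explains the choice."""
    unnormalized = {
        name: prob * choice_likelihood(HYPOTHESES[name], chosen)
        for name, prob in belief.items()
    }
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}


def best_action(belief):
    """Altruism: pick the option with the highest expected utility for the person."""
    def expected_utility(option):
        return sum(prob * HYPOTHESES[name][option] for name, prob in belief.items())

    return max(["coffee", "tea"], key=expected_utility)


# Uncertainty: start with no opinion about the person's preferences.
belief = {"prefers_coffee": 0.5, "prefers_tea": 0.5}

# Learning: update the belief from observed human behavior.
for observed_choice in ["tea", "tea", "coffee", "tea"]:
    belief = update_belief(belief, observed_choice)

recommendation = best_action(belief)
```

In this toy run the belief shifts strongly toward the tea hypothesis, yet the single coffee choice keeps it from reaching certainty, so the machine would still change course if the person’s behavior changed.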

AI is going to make life less secure for everyone.

The Stasi of former East Germany was one of the most effective and repressive intelligence agencies ever to have existed. It kept files on the majority of East German households, listening to their phone calls, reading their letters, and even placing hidden cameras within their homes. This was all done by humans and written on paper, requiring a vast bureaucracy and massive storage units containing literally billions of physical paper records.

Just imagine what the Stasi could have done with AI. With superintelligent AI, it would be possible to monitor everyone’s phone calls and messages automatically. People’s daily movements could also be tracked, using surveillance cameras and satellite data. It would be as though every person had their own operative watching over them 24 hours a day.

The key message here is: AI is going to make life less secure for everyone.

AI could lead to other dystopias as well. One is the Infopocalypse – the catastrophic failure of the marketplace of ideas to produce the truth. Superintelligent AI will be capable of manufacturing and distributing false information without any human input. It will also be able to target specific individuals, strategically altering their information diet to manipulate their behavior with surgical accuracy.

To a large extent, this is already happening. Content selection algorithms used by social media platforms – ostensibly designed to predict people’s preferences – end up changing those preferences by showing users only a narrow selection of content. In practice, this means users are pushed to become more and more extreme in their political views. Arguably, even these rudimentary forms of artificial intelligence have already caused great harm, entrenching social division and proliferating hate.

While the Infopocalypse is still in its infancy, the next dystopia is well in the making. This is the state of constant fear caused by autonomous weapons technology.

Autonomous weapons – machines that seek out and neutralize targets by themselves – have already been developed. Such weapons identify targets based on information like skin color, uniform, or even exact faceprints. Miniature drones called Slaughterbots are already being primed to search for, locate, and neutralize specific individuals. In 2016, the US military demonstrated the deployment of a swarm of 103 Perdix micro-drones. Officials described the swarm as a single organism sharing one distributed brain, like a swarm in nature.

The US is only one of many nations currently building – or already using – autonomous weapons technology. As autonomous weapons come to displace conventional human warfare, all of our lives will become less secure, since anyone can be targeted – no matter where they are in the world.

Mass automation will either liberate humanity’s potential or debilitate it.

The truth is, in the long run, AI is likely to automate away almost all existing human labor. This will not only affect low-skilled work like truck driving. Even highly skilled professionals like doctors, lawyers, and accountants are at risk.

This is a genuine concern. Until now, the only way to sustain our civilization has been to pass knowledge from one human to another, over successive generations. As technology develops, we increasingly hand that knowledge and expertise over to machines that can do the task for us.

Once we lose the practical incentive to pass knowledge on to other humans, that knowledge and expertise will wither. If we’re not careful, we could become an enfeebled species, utterly dependent on the machines that ostensibly serve us.

Final thoughts:

The way we currently design AI is fundamentally flawed. We’re designing AI to be intelligent but not necessarily to have humanity’s best interests at heart. We therefore need to make the fulfillment of human goals AI’s only objective. If we can successfully control superintelligent AI, we’ll be able to harness its immense power to advance our civilization and liberate humanity from servitude. But if we fail, we’re in danger of losing our autonomy, as we become increasingly subject to the whims of a superior intelligence.

Thanks for reading!