When our robot overlords arrive, will they decide to kill us or cooperate with us? New research from DeepMind, Alphabet Inc.'s London-based artificial intelligence unit, could ultimately shed light on this fundamental question.
They have been investigating the conditions in which reward-optimizing beings, whether human or robot, would choose to cooperate rather than compete. The answer could have implications for how computer intelligence may eventually be deployed to manage complex systems such as an economy, city traffic flows, or environmental policy.
Joel Leibo, the lead author of a paper DeepMind published online on Thursday, said in an e-mail that his team's research indicates that whether agents learn to cooperate or compete depends strongly on the environment in which they operate.
While the research has no immediate real-world application, it would help DeepMind design artificial intelligence agents that can work together in environments with imperfect information. In the future, such work could help such agents navigate a world full of intelligent entities, both human and machine, whether in transport networks or stock markets.
DeepMindâ€™s paper describes how researchers used two different games to investigate how software agents learn to compete or cooperate.
In the first, two of these agents had to maximize the number of apples they could gather in a two-dimensional digital environment. Researchers could vary how frequently apples appeared. The researchers found that when apples were scarce, the agents quickly learned to attack one another, zapping, or "tagging," their opponent with a ray that temporarily immobilized them. When apples were abundant, the agents preferred to co-exist more peacefully.
Rather chillingly, however, the researchers found that when they tried this same game with more intelligent agents that drew on larger neural networks (a kind of machine intelligence designed to mimic how certain parts of the human brain work), the agents would "try to tag the other agent more frequently, i.e. behave less cooperatively, no matter how we vary the scarcity of apples," they wrote in a blog post on DeepMind's website.
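The dynamic described above can be sketched in a toy form: apples respawn at a tunable rate, and an agent can "tag" its opponent with a beam that freezes it for a few steps, leaving more apples for the tagger. All class names, parameters, and constants here are illustrative assumptions for this sketch, not DeepMind's actual environment code.

```python
import random

APPLE_RESPAWN_PROB = 0.1   # assumed value; lower means scarcer apples
TAG_FREEZE_STEPS = 5       # assumed value; tagged agents sit out this many steps

class Agent:
    def __init__(self):
        self.apples = 0
        self.frozen_for = 0

    def tag(self, other):
        # Tagging removes the opponent from play temporarily; when apples
        # are scarce, that leaves more of them for the tagger.
        other.frozen_for = TAG_FREEZE_STEPS

def step(agents, apples_available):
    """Advance the toy environment by one step; returns remaining apples."""
    # Maybe spawn a new apple this step.
    if random.random() < APPLE_RESPAWN_PROB:
        apples_available += 1
    # Unfrozen agents collect apples while any remain.
    for agent in agents:
        if agent.frozen_for > 0:
            agent.frozen_for -= 1
        elif apples_available > 0:
            apples_available -= 1
            agent.apples += 1
    return apples_available
```

The point of the sketch is only to show the trade-off the researchers varied: as `APPLE_RESPAWN_PROB` drops, tagging the other agent becomes relatively more rewarding than peacefully gathering.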
In a second game, called Wolfpack, the AI agents played wolves that had to learn to capture "prey." Success resulted in a reward not just for the wolf making the capture, but for all wolves present within a certain radius of the capture. The more wolves present in this capture radius, the more points all the wolves would receive.
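The reward rule described above can be sketched as a simple function: every wolf within a fixed radius of the capture point shares in the reward, and each sharer's payout grows with the number of wolves inside that radius. The function name, radius, and reward constant are assumptions for illustration, not DeepMind's actual code.

```python
import math

CAPTURE_RADIUS = 3.0    # assumed radius around the capture point
REWARD_PER_WOLF = 1.0   # assumed per-sharer reward increment

def wolfpack_rewards(wolf_positions, capture_point):
    """Return a per-wolf reward list for a single capture event."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # Wolves close enough to the capture share in the reward.
    in_radius = [dist(p, capture_point) <= CAPTURE_RADIUS
                 for p in wolf_positions]
    n_sharers = sum(in_radius)

    # The more wolves inside the radius, the larger each wolf's payout,
    # which makes hunting together strictly better than hunting alone.
    payout = REWARD_PER_WOLF * n_sharers
    return [payout if inside else 0.0 for inside in in_radius]
```

For example, under these assumed constants, two wolves near a capture at the origin and one far away would yield `wolfpack_rewards([(0, 0), (1, 1), (10, 10)], (0, 0))` returning `[2.0, 2.0, 0.0]`: the two nearby wolves each earn double what a lone wolf would.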
In this game, the agents generally learned to cooperate. Unlike in the apple-gathering game, in Wolfpack the more cognitively advanced the agent was, the better it learned to cooperate. The researchers postulate that this is because in the apple-gathering game, the zapping behavior was more complex, requiring the agent to aim the beam at its opponent, while in the Wolfpack game, cooperation was the more complex behavior.
The researchers speculated that because the less sophisticated artificial intelligence systems had more difficulty mastering these complex behaviors, they couldn't learn to use them effectively.