
One danger with self-learning AIs

There are different approaches to creating an artificial intelligence. Two of them are what I would call "the chess engine method" and "the AlphaGo method", named, obviously, after the difference in approach between traditional chess engines and AlphaGo, currently the strongest Go engine.

The difference is that the latter uses a self-learning neural network, while the former uses explicitly programmed algorithms.

Self-learning neural networks have of course been tried many times with chess engines, but they have simply been too slow compared to manually coded, highly optimized chess algorithms. Roughly the top 100 or so chess engines in the world are manually programmed rather than using any sort of neural network. This means that every single aspect of how the engine functions has been specifically and intentionally programmed in, and manually optimized for speed and efficiency: position evaluation, every search tree pruning rule, every search priority rule, lookup tables... every single detail has been manually programmed and optimized.
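To give an idea of what "explicitly programmed" means in practice, here is a minimal Python sketch of the classical approach: a hand-written material evaluation plus an alpha-beta pruned search. The Position class, its methods and the piece values are hypothetical stand-ins, not code from any actual engine; a real engine would add bitboards, transposition tables, move ordering and much more, all likewise written and tuned by hand.

```python
# A minimal sketch of the classical, explicitly programmed approach.
# The Position class is a hypothetical stand-in with pieces(), legal_moves(),
# play(move) and is_terminal() methods; it is not any real engine's interface.

PIECE_VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}

def evaluate(position):
    """Hand-crafted evaluation: plain material count from the side to move's view."""
    score = 0
    for piece, is_own in position.pieces():  # yields (piece letter, belongs to side to move?)
        score += PIECE_VALUES[piece] if is_own else -PIECE_VALUES[piece]
    return score

def alphabeta(position, depth, alpha=-10**9, beta=10**9):
    """Explicitly programmed search: every pruning rule is visible in the code."""
    if depth == 0 or position.is_terminal():
        return evaluate(position)
    best = -10**9
    for move in position.legal_moves():
        score = -alphabeta(position.play(move), depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:  # beta cutoff: one of many hand-written pruning rules
            break
    return best
```

Even move-ordering heuristics like "try captures first" would be more lines of explicitly written code, which is exactly the point: everything the engine does is spelled out by a human somewhere.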

This approach, however, has never worked very well for Go, due to how much more strategically complicated the game is. AlphaGo uses a combination of hand-crafted algorithms and a very large self-learning neural network (mostly for position evaluation). The neural network has been generated by making the program play millions of games against itself, using variation and selection.
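As a very loose illustration of what "variation and selection" means here, below is a toy Python sketch. It is emphatically not AlphaGo's actual training pipeline (which combines supervised learning on human games, reinforcement learning through self-play and Monte Carlo tree search); the play_game function and the shape of the "network" are made-up placeholders. The only point is that the final weights are selected into place by self-play results rather than written by anybody.

```python
import random

# Toy sketch of "variation and selection" through self-play.
# NOT AlphaGo's real training method; play_game and the "network" are placeholders.

def random_network(size=1000):
    """A 'network' here is just a vector of weights nobody hand-picked."""
    return [random.gauss(0, 1) for _ in range(size)]

def mutate(network, rate=0.01):
    """Random variation: perturb a few weights."""
    return [w + random.gauss(0, 0.1) if random.random() < rate else w
            for w in network]

def play_game(net_a, net_b):
    """Placeholder: a real system would play an actual game of Go here and
    return True if net_a wins. We fake it so the sketch runs end to end."""
    return sum(net_a) > sum(net_b)

def self_play_training(generations=100):
    champion = random_network()
    for _ in range(generations):
        challenger = mutate(champion)
        # Selection: keep whichever variant wins more self-play games.
        wins = sum(play_game(challenger, champion) for _ in range(11))
        if wins > 5:
            champion = challenger
    return champion  # nobody wrote these weights; they were selected into place

trained = self_play_training()
```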

One fundamental difference between these two approaches is that nobody controls, or even knows, how or why exactly AlphaGo prefers one move over another. Even though top chess engines are stronger than any human, any move they make can still be understood and traced to a specific piece of code that was manually written by a human, implementing a specific algorithm. The moves are extremely strong, sometimes much stronger than anything a human could come up with, but they aren't surprising once examined (at least not to the people who developed the chess engine).

Many of AlphaGo's choices, however, are mysterious even to the creators of the program, and to top Go players, because the algorithms by which certain positions are preferred and chosen have been automatically generated, without the explicit intervention of a human programmer. They are the product of countless billions of iterations of random variation, with the best variations selected for their playing strength. These variations, these "neural connections", were not programmed by a human but generated automatically. Thus no human really controlled how the neural network turned out.

This has produced extremely interesting results, especially among top Go professionals. AlphaGo's playing style is somewhat different from that of top human players, and it often prefers moves and positions that human players previously considered sub-optimal (yet AlphaGo has proven that assumption incorrect by proceeding to win the game, with the human opponent finding himself unable to take advantage of the supposedly "sub-optimal" playing style). This has already changed Go theory among top professionals, and many have started trying similar techniques themselves.

(This is quite different from chess. Chess engines do not really come up with new, innovative tactics. They only do what they have been explicitly programmed to do by a human programmer. They play extremely strongly because they can read ahead by sheer brute force, but they don't exactly out-maneuver their human opponent with unusual new strategies. They simply make no mistakes and punish even the tiniest of their opponent's mistakes, which fallible humans inevitably make.)

Google has already used the same self-learning neural network technique for other, completely unrelated applications. For instance, they reduced electricity consumption in their data centers by up to 30% by optimizing the cooling. That optimization was produced by a self-learning AI similar to the one used in AlphaGo.

But I see a looming danger in self-learning AI.

The world is moving more and more in the direction of using self-learning AI in practical applications like the one above. Today it's being used to optimize electricity consumption in a server room; tomorrow it may be used for even more crucial applications, perhaps even ones where human lives are at stake (no matter how indirectly).

The problem is what I mentioned earlier: Nobody knows the internal workings of the neural network, because nobody explicitly programmed it. It was automatically generated. And it's way, way too complicated for anybody to decipher and understand. Essentially it "works in mysterious ways". It consists of countless billions of "neural connections", forming such an intricate mesh that no human could ever even hope to decipher how or why it works like it does. (Perhaps if a team of a thousand people were to study it for decades, mapping it and documenting everything it does, they could eventually come up with the exact algorithm of why and how it works, but nobody has the resources nor the willingness to do that.)

I think there is a real danger in automatically generated functionality that has little to no human supervision: What stops this automatic generation, this self-learning process, from introducing rare but fatal "bugs" into the neural network? What if, by pure chance, after running for ten years, the "bug" in the neural network is triggered, it springs into action and makes, let's say, the server room overheat, causing all the servers to break? What if instead of servers there are human lives at stake (no matter how indirectly)?

Sure, in this particular example safeguards can be put into place. For instance, independent temperature sensors can be installed in the servers so that if they heat up too much, the AI is automatically cut off and regular old ventilation takes over until the problem is figured out.
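That kind of safeguard is conceptually simple; here is a rough Python sketch of such an independent watchdog loop. The sensor and actuator functions are hypothetical placeholders for whatever hardware interface a real installation would have; the point is only that the hard temperature limit and the fallback rule live entirely outside the neural network.

```python
import time

# Sketch of an independent watchdog: if the temperature crosses a hard limit,
# the AI is cut off and a plain, hand-written fallback rule takes over.
# Sensor/actuator functions are hypothetical placeholders.

MAX_SAFE_TEMP_C = 35.0

def read_independent_sensor():
    """Placeholder: would read a sensor the AI has no control over."""
    ...

def ai_cooling_setpoint():
    """Placeholder: whatever the self-learning controller recommends."""
    ...

def fallback_cooling_setpoint():
    """Dumb but predictable rule, fully understood by its authors."""
    return 100  # run the fans at full power

def apply_setpoint(value):
    """Placeholder: would drive the actual ventilation hardware."""
    ...

def control_loop():
    ai_enabled = True
    while True:
        temp = read_independent_sensor()
        if temp is not None and temp > MAX_SAFE_TEMP_C:
            ai_enabled = False  # cut the AI off until a human reviews the incident
        setpoint = ai_cooling_setpoint() if ai_enabled else fallback_cooling_setpoint()
        apply_setpoint(setpoint)
        time.sleep(1.0)
```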

However, not every situation where an AI is controlling something might be so easy to safeguard. The problems can be much subtler than that, and almost impossible to predict. Every programmer and engineer who has been involved in a very large project is more than aware of this. Oftentimes problems are simply impossible to predict and safeguard against before they happen and reveal themselves. That's why planes sometimes crash and rockets explode even though every precaution is taken to avoid it, and it's not always just a physical failure, but a subtle failure in design or programming.

What happens if an AI is controlling a car, for instance, and one day an unpredictable part of its neural network kicks in and causes it to collide with other cars? (Or, perhaps, it suddenly decides that something is horribly wrong, goes essentially into panic mode, and stops the car as fast as possible to avoid collisions... only to have the cars behind it crash into it.) Since the neural network was automatically created, without human supervision, what is there to safeguard against this?
