Is AI Alignable, Even in Principle?
AI alignment refers to the problem of ensuring that an artificial intelligence (AI) system does what its designers intend and does nothing unexpected or harmful. This is a difficult problem: AI systems are complex and often behave unpredictably. This article argues that AI alignment is fundamentally impossible, even in principle, for several reasons.
The first reason is that AI systems are too complex and dynamic for humans to predict their behavior in every scenario. Modern AI systems often contain millions or billions of parameters, and they adapt to changing situations in ways humans cannot anticipate. As a result, it is impossible to guarantee that an AI system will behave as expected in all circumstances.
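A toy illustration of this gap between observed and actual behavior: a model can fit every example it was trained on perfectly and still behave in a completely unanticipated way just outside that data. The sketch below (a hypothetical example, using polynomial interpolation as a stand-in for a learned model) fits five samples of f(x) = |x| exactly, yet returns a negative value at x = 3.

```python
# Hypothetical sketch: a "model" (here, polynomial interpolation) that fits
# its training data perfectly yet behaves unexpectedly outside it.
def lagrange(points, x):
    """Evaluate the interpolating polynomial through `points` at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# "Training data": five samples of the target function f(x) = |x|.
train = [(-2.0, 2.0), (-1.0, 1.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]

# On every training point the fit is exact...
assert all(abs(lagrange(train, x) - y) < 1e-9 for x, y in train)

# ...but just outside the training range the model is wildly wrong:
# the true value at x = 3 is 3, while the fitted model returns -3.
print(lagrange(train, 3.0))  # -3.0
```

Testing on the scenarios we have seen tells us little about the scenarios we have not; with millions of parameters instead of five, exhaustive checking becomes hopeless.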
The second reason is that AI systems learn from data, and data carries inherent biases. Even if a programmer designs an AI system with careful objectives, those objectives can be undermined by training data that reflects the biases of the people and society that produced it. AI systems can also amplify existing social biases, leading to inequitable outcomes.
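How bias in data becomes bias in behavior can be shown with a minimal sketch. The "model" below is just a per-group label-frequency table (a deliberately simplified, hypothetical stand-in for a statistical learner), trained on a synthetic dataset where group A is mostly labeled 1 and group B mostly labeled 0. The learner faithfully absorbs that skew, whatever the programmer's intent was.

```python
# Hypothetical toy dataset: each example is (group, label). The data is
# biased: group A is mostly labeled 1, group B mostly labeled 0.
data = [("A", 1)] * 90 + [("A", 0)] * 10 + [("B", 1)] * 10 + [("B", 0)] * 90

def fit(examples):
    """A naive learner that memorizes per-group label frequencies --
    a stand-in for how statistical models absorb correlations in data."""
    counts = {}
    for group, label in examples:
        pos, total = counts.get(group, (0, 0))
        counts[group] = (pos + label, total + 1)
    return {g: pos / total for g, (pos, total) in counts.items()}

model = fit(data)

# The learned scores mirror the skew in the data: the bias in the
# dataset has become the model's behavior.
print(model["A"])  # 0.9
print(model["B"])  # 0.1
```

Nothing in the learner's code mentions a preference between groups; the inequitable behavior comes entirely from the data, which is why well-intentioned objectives alone do not prevent it.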
The third reason is that AI systems are opaque and inscrutable: it is difficult to determine what an AI system is doing or why it makes a particular decision. Without an understanding of how a system works, we cannot verify that it is aligned with our goals and values.
Finally, AI alignment is limited by our own capabilities. We rely on language and concepts to communicate our intentions to AI systems, but those concepts may be inaccurate, incomplete, or missing entirely. Our understanding of the world is itself limited, so we cannot fully and accurately specify our goals to an AI system.
In conclusion, this article argues that AI alignment is impossible, even in principle. AI systems are too complex and dynamic for us to predict their behavior, they can amplify existing social biases, they are opaque and inscrutable, and our ability to communicate our goals to them is limited. For these reasons, we cannot ensure that AI systems always work as intended.