Jun 22, 2016 @ 18:35 |
Rapid progress in machine learning and artificial intelligence (AI) has brought increasing attention to the potential impacts of AI technologies on society. AI technologies are likely to be overwhelmingly useful and beneficial for humanity. AI technology has reached a point where the deployment of such AI systems are practically, if not legally, feasible within years, not decades, but the risks are very high. To address possible safety risks from AI systems researchers published a technical paper – Concrete Problems in AI Safety.
Researchers from Google, OpenAI, Stanford and Berkeley have discussed one such potential impact: the problem of accidents in machine learning systems, defined as unintended and harmful behavior that may emerge from poor design of real-world AI systems. Researchers present a list of five practical research problems related to accident risk, categorized according to whether the problem originates from having the wrong objective function or undesirable behavior during the learning process.
While possible AI safety risks have received a lot of public attention, most previous discussion has been very hypothetical and speculative. AI Researchers believe it’s essential to ground concerns in real machine learning research, and to start developing practical approaches for engineering AI systems that operate safely and reliably.
The five practical research problems in AI Systems in general circumstances is given below briefly. These are all forward thinking, long-term research questions — minor issues today, but important to address for future systems:
- Avoiding Negative Side Effects: How can we ensure that an AI system will not disturb its environment in negative ways while pursuing its goals, e.g. a cleaning robot knocking over a vase because it can clean faster by doing so?
- Avoiding Reward Hacking: How can we avoid gaming of the reward function? For example, we don’t want this cleaning robot simply covering over messes with materials it can’t see through.
- Scalable Oversight: How can we efficiently ensure that a given AI system respects aspects of the objective that are too expensive to be frequently evaluated during training? For example, if an AI system gets human feedback as it performs a task, it needs to use that feedback efficiently because asking too often would be annoying.
- Safe Exploration: How do we ensure that an AI system doesn’t make exploratory moves with very negative repercussions? For example, maybe a cleaning robot should experiment with mopping strategies, but clearly it shouldn’t try putting a wet mop in an electrical outlet.
- Robustness to Distributional Shift: How do we ensure that an AI system recognizes, and behaves robustly, when it’s in an environment very different from its training environment? For example, heuristics learned for a factory work floor may not be safe enough for an office.
The machine learning research community has already thought quite a bit about most of these problems and many related issues, but researchers think there’s a lot more work to be done.
Reference: Concrete Problems in AI Safety