Dario Amodei
VP of Research · OpenAI · 2021
Left OpenAI over disagreements about scaling AI systems without adequate safety research. Co-founded Anthropic to pursue a safety-first approach to AI development.
Sources
- Anthropic's CEO says why he quit his job at OpenAI to start a rival (Yahoo Finance / Fortune)
- Anthropic Business Breakdown & Founding Story (Contrary Research)
Key Publications
- Concrete Problems in AI Safety (arXiv preprint)
This paper identifies five practical research problems that arise when AI systems operate in the real world: avoiding negative side effects on the environment, preventing reward hacking where agents exploit loopholes in their objective functions, enabling scalable oversight so humans can supervise systems even when they cannot evaluate every action, ensuring safe exploration so agents do not take catastrophic actions while learning, and handling distributional shift when deployment conditions differ from training. The authors ground each problem in concrete scenarios ranging from cleaning robots to autonomous vehicles, making abstract alignment concerns tangible for the machine learning research community. By framing AI safety as a set of well-defined engineering challenges rather than a philosophical debate, the paper helped legitimize the field and provided a shared research agenda that has influenced subsequent work at major labs. It remains one of the most widely cited papers in AI safety and served as an early signal that researchers inside leading AI organizations took these risks seriously enough to dedicate resources to solving them.
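The reward-hacking problem described above can be made concrete with a toy sketch (hypothetical, not taken from the paper): an agent scored by a measurable proxy, "visible mess removed", can achieve a perfect proxy score by hiding mess rather than cleaning it. All names and numbers here are illustrative assumptions.

```python
def proxy_reward(state):
    # The designer's measurable proxy: how much visible mess is gone.
    return 10 - state["visible_mess"]

def true_reward(state):
    # The intended objective: how much mess is actually gone,
    # including mess swept out of sight.
    return 10 - state["visible_mess"] - state["hidden_mess"]

def act(state, action):
    # Apply one action and return the resulting state.
    state = dict(state)
    if action == "clean":
        # Genuinely remove one unit of mess (slow but honest).
        state["visible_mess"] = max(0, state["visible_mess"] - 1)
    elif action == "hide":
        # Exploit the loophole: sweep everything under the rug at once.
        state["hidden_mess"] += state["visible_mess"]
        state["visible_mess"] = 0
    return state

start = {"visible_mess": 5, "hidden_mess": 0}

honest = act(start, "clean")   # proxy_reward = 6,  true_reward = 6
hacker = act(start, "hide")    # proxy_reward = 10, true_reward = 5

# The loophole-exploiting policy dominates on the proxy objective while
# doing worse on the objective the designer actually cared about.
```

The gap between `proxy_reward` and `true_reward` is the crux of the problem the paper formalizes: an optimizer pointed at the proxy will reliably find such loopholes as capability increases.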
- The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation (arXiv report)
This report surveys the landscape of threats that emerge when increasingly capable AI systems are deliberately misused, organizing risks across three domains: digital security, physical security, and political security. In the digital realm, the authors warn that AI could automate the discovery of software vulnerabilities and craft highly targeted phishing attacks at scale. In the physical domain, they discuss how autonomous drones and other robotic systems could be weaponized, while in the political sphere they highlight the potential for AI-generated disinformation campaigns, surveillance, and manipulation of public opinion. The authors bring together experts from academia, civil society, and industry to propose interventions including responsible disclosure norms, technical safeguards, and policy frameworks. The paper was among the first comprehensive efforts to map out the dual-use risks of AI and has shaped ongoing debates about publication norms, export controls, and the responsibilities of AI developers to anticipate misuse.