Ilya Sutskever
Co-Founder and Chief Scientist · OpenAI · 2024
Co-led the Superalignment team and was involved in the attempted board ouster of Sam Altman in Nov 2023. After the board crisis resolved in Altman's favor, Sutskever departed and founded Safe Superintelligence Inc. (SSI).
As OpenAI's chief scientist and co-founder, Sutskever helped build the organization from a nonprofit research lab into one of the most prominent AI companies in the world. He co-led the Superalignment team, which was dedicated to ensuring that future AI systems remain under human control. In November 2023, he joined the board's attempt to remove CEO Sam Altman, a move widely interpreted as driven by safety concerns. When the attempt failed and Altman was reinstated, Sutskever's position became untenable. He departed about six months later to found Safe Superintelligence Inc., a company focused solely on building safe superintelligence, with no near-term products or revenue pressure.
Sources
Key Publications
- Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision · arXiv (OpenAI) · preprint
This paper from OpenAI's Superalignment team addresses a fundamental challenge in AI safety: how can humans, who will be less capable than future superintelligent systems, supervise and align those systems effectively? The authors set up an empirical analogy in which weaker AI models supervise stronger ones, then measure how much of the stronger model's capability the weak supervision elicits. Their key finding is that strong models trained with weak supervision consistently outperform their weak supervisors and recover much, though not all, of their full capability, which suggests that alignment techniques may generalize better than pessimistic predictions assume. They also identify failure modes in which the strong model imitates the weak supervisor's errors rather than generalizing beyond them, echoing broader concerns about models exploiting gaps in a supervisor's understanding. The paper was one of the flagship outputs of the Superalignment initiative led by Ilya Sutskever and Jan Leike, and it drew renewed attention after both researchers left OpenAI in May 2024, with Leike publicly citing disagreements over the prioritization of safety work.