Timnit Gebru
Co-Lead, Ethical AI Team · Google · 2020
Forced out after co-authoring a paper on the harms of large language models ('On the Dangers of Stochastic Parrots'). Google objected to publication and demanded she retract or remove her name.
Gebru co-authored a landmark paper examining the risks of large language models — environmental costs, encoded biases, and the illusion of understanding. When Google demanded she retract it or remove her name, she refused and was terminated. Her firing exposed deep tensions between corporate AI labs and the researchers tasked with studying the harms of their own products, and triggered a wave of solidarity resignations from colleagues who saw her ouster as proof that internal ethics work was tolerated only when it did not threaten business interests.
Sources
- We read the paper that forced Timnit Gebru out of Google. Here's what it says. (MIT Technology Review)
- 'I started crying': Inside Timnit Gebru's last days at Google (MIT Technology Review)
Key Publications
- Datasheets for Datasets (arXiv preprint)
This paper proposes that every dataset used in machine learning should be accompanied by a standardized 'datasheet' documenting key information about its creation, composition, intended uses, and limitations, modeled on the datasheets that accompany electronic components in the hardware industry. The authors argue that the lack of documentation around training data contributes to the reproduction of biases, the use of datasets in inappropriate contexts, and the difficulty of auditing AI systems for fairness and accountability. The proposed datasheet template includes questions about the motivation behind dataset collection, the demographic composition of the data, how consent was obtained from data subjects, and what preprocessing or labeling steps were applied. By creating a shared standard for dataset transparency, the paper aims to shift responsibility upstream in the machine learning pipeline and make it easier for researchers and practitioners to make informed decisions about which data to use. The framework has since been adopted or adapted by several major organizations and has influenced broader movements toward documentation standards in AI, including model cards and system cards. (A rough sketch of how datasheet-style answers might be recorded in code appears after this list.)
- On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (ACM FAccT 2021 paper)
This paper argues that the race to build ever-larger language models carries significant and under-examined risks, including the substantial environmental costs of training runs that consume enormous amounts of energy and water. The authors document how large training corpora inevitably encode the biases, stereotypes, and toxic language prevalent on the internet, and demonstrate that these biases are then reproduced and amplified by the models trained on them. They introduce the metaphor of a 'stochastic parrot' to describe how language models generate fluent text by pattern-matching without genuine understanding, creating a dangerous illusion of competence that can mislead users. The paper calls for greater investment in careful data curation, documentation practices, and research into smaller, more efficient models as alternatives to uncritical scaling. Beyond its technical contributions, the paper became a flashpoint in debates about research freedom, corporate influence over AI ethics, and the treatment of dissenting voices within major technology companies after co-author Timnit Gebru's departure from Google.
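The datasheet template in 'Datasheets for Datasets' is a set of structured questions rather than code, but as a rough illustration, here is a minimal sketch of how a project might record datasheet-style answers alongside a dataset. The field names and example values are assumptions that paraphrase the kinds of questions the paper asks; they are not the paper's exact wording or any official schema.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: these fields paraphrase the broad categories the
# datasheet template covers (motivation, composition, collection, preprocessing,
# uses, limitations); they are not the paper's official template.
@dataclass
class Datasheet:
    dataset_name: str
    motivation: str              # why and by whom the dataset was created
    composition: str             # what instances represent, demographic makeup
    collection_process: str      # how data was gathered and how consent was obtained
    preprocessing: str           # cleaning, labeling, or filtering applied
    intended_uses: list[str] = field(default_factory=list)
    out_of_scope_uses: list[str] = field(default_factory=list)
    known_limitations: list[str] = field(default_factory=list)

# Example with placeholder (hypothetical) values:
sheet = Datasheet(
    dataset_name="example-corpus",
    motivation="Collected to study question answering in a narrow domain.",
    composition="English web text; demographic makeup not audited.",
    collection_process="Scraped from public forums; no per-user consent obtained.",
    preprocessing="Deduplicated and lowercased; profanity filter applied.",
    intended_uses=["research on retrieval models"],
    out_of_scope_uses=["deployment in high-stakes decision making"],
    known_limitations=["skews toward English-speaking forum users"],
)
print(sheet.dataset_name, "->", sheet.known_limitations)
```

Keeping such a record versioned next to the data is one way to follow the paper's suggestion of shifting documentation responsibility upstream, so that later users can check intended uses and limitations before training on the dataset.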