IDAIS-Oxford, 2023

DITCHLEY PARK, UK, October 31st, 2023

Statement

“Coordinated global action on AI safety research and governance is critical to prevent uncontrolled frontier AI development from posing unacceptable risks to humanity.”

Global action, cooperation, and capacity building are key to managing risk from AI and enabling humanity to share in its benefits. AI safety is a global public good that should be supported by public and private investment, with advances in safety shared widely. Governments around the world, especially those of leading AI nations, have a responsibility to develop measures to prevent worst-case outcomes from malicious or careless actors and to rein in reckless competition. In this vein, the international community should work to create an international coordination process for advanced AI.

We face near-term risks from malicious actors misusing frontier AI systems: the safety filters developers currently integrate are easily bypassed. Frontier AI systems produce compelling misinformation and may soon be capable enough to help terrorists develop weapons of mass destruction. Moreover, there is a serious risk that future AI systems may escape human control altogether. Even aligned AI systems could destabilize or disempower existing institutions. Taken together, we believe AI may pose an existential risk to humanity in the coming decades.

In domestic regulation, we recommend mandatory registration for the creation, sale, or use of models above a certain capability threshold, including open-source copies and derivatives, to give governments critical and currently missing visibility into emerging risks. Governments should monitor large-scale data centers and track AI incidents, and should require that developers of frontier AI models be subject to independent third-party audits evaluating their information security and model safety. AI developers should also be required to share with relevant authorities comprehensive risk assessments, risk-management policies, and predictions of their systems’ behaviour in third-party evaluations and after deployment.

We also recommend defining clear red lines that, if crossed, mandate immediate termination of an AI system, including all copies, through rapid and safe shut-down procedures. Governments should cooperate to establish and preserve this capacity. Moreover, for the most advanced models, developers should demonstrate to regulators’ satisfaction, both prior to deployment and during training, that their systems will not cross these red lines.

Reaching adequate safety levels for advanced AI will also require immense research progress. Advanced AI systems must be demonstrably aligned with their designers’ intent, as well as with appropriate norms and values. They must also be robust against both malicious actors and rare failure modes, and sufficient human control over these systems must be ensured. Concerted effort by the global research community, in AI and other disciplines, is essential; we need a global network of dedicated AI safety research and governance institutions. We call on leading AI developers to commit a minimum of one third of their AI R&D spending to AI safety, and on government agencies to fund academic and non-profit AI safety and governance research in at least the same proportion.

Signatories

Andrew Yao

Dean of the Institute for Interdisciplinary Information Sciences, Tsinghua University
Distinguished Professor-at-Large, The Chinese University of Hong Kong
Professor, Center for Advanced Study, Tsinghua University
Turing Award Recipient

Yoshua Bengio

Professor at the Université de Montréal
Founder and Scientific Director of Mila, Quebec AI Institute
Chair of the International Scientific Report on the Safety of Advanced AI
Turing Award Winner

Stuart Russell

Professor and Smith-Zadeh Chair in Engineering, University of California, Berkeley
Founder of the Center for Human-Compatible Artificial Intelligence (CHAI), University of California, Berkeley

Ya-Qin Zhang

Chair Professor of AI Science, Tsinghua University
Dean of the Institute for AI Industry Research (AIR), Tsinghua University
Former President of Baidu

Ed Felten

Robert E. Kahn Professor of Computer Science and Public Affairs, Princeton University
Founding Director, Center for Information Technology Policy, Princeton University

Roger Grosse

Associate Professor of Computer Science, University of Toronto
Founding Member, Vector Institute

Gillian Hadfield

Incoming Professor, School of Government and Policy and School of Engineering, Johns Hopkins University
Professor of Law and Strategic Management, University of Toronto

Sana Khareghani

Professor of Practice in AI, King’s College London
AI Policy Lead, Responsible AI UK
Former Head of UK Government Office for Artificial Intelligence

Dylan Hadfield-Menell

Bonnie and Marty (1964) Tenenbaum Career Development Assistant Professor of EECS, MIT
Lead, Algorithmic Alignment Group, Computer Science and Artificial Intelligence Laboratory (CSAIL), MIT
AI2050 Early Career Fellow

Karine Perset

Head of AI Unit, OECD

Dawn Song

Professor of EECS, UC Berkeley
Founder, Oasis Labs

Xin Chen

PhD Student, ETH Zurich

Max Tegmark

Professor, MIT Center for Brains, Minds & Machines
President and Co-founder, Future of Life Institute

Elizabeth Seger

Research Scholar, Centre for the Governance of AI

Yi Zeng

Professor and Director, Brain-inspired Cognitive Intelligence Lab, Institute of Automation, Chinese Academy of Sciences
Founding Director, Center for Long-term AI

HongJiang Zhang

Founding Chairman, Beijing Academy of AI

Yang-Hui He

Fellow, London Institute for Mathematical Sciences

Adam Gleave

Founder and CEO, FAR AI

Fynn Heide

Executive Director, Safe AI Forum