IDAIS-Oxford, 2023
Statement
“Coordinated global action on AI safety research and governance is critical to prevent uncontrolled frontier AI development from posing unacceptable risks to humanity.”
Global action, cooperation, and capacity building are key to managing risk from AI and enabling humanity to share in its benefits. AI safety is a global public good that should be supported by public and private investment, with advances in safety shared widely. Governments around the world, especially those of leading AI nations, have a responsibility to develop measures to prevent worst-case outcomes from malicious or careless actors and to rein in reckless competition. In this vein, the international community should work to create an international coordination process for advanced AI.
We face near-term risks from malicious actors misusing frontier AI systems, as the safety filters currently integrated by developers are easily bypassed. Frontier AI systems can produce compelling misinformation and may soon be capable enough to help terrorists develop weapons of mass destruction. Moreover, there is a serious risk that future AI systems may escape human control altogether. Even aligned AI systems could destabilize or disempower existing institutions. Taken together, we believe AI may pose an existential risk to humanity in the coming decades.
For domestic regulation, we recommend mandatory registration of the creation, sale, or use of models above a certain capability threshold, including open-source copies and derivatives, to give governments critical and currently missing visibility into emerging risks. Governments should monitor large-scale data centers and track AI incidents, and should require that developers of frontier AI models be subject to independent third-party audits evaluating their information security and model safety. AI developers should also be required to share with relevant authorities comprehensive risk assessments, risk-management policies, and predictions of their systems’ behaviour in third-party evaluations and post-deployment.
We also recommend defining clear red lines that, if crossed, mandate immediate termination of an AI system, including all copies, through rapid and safe shut-down procedures. Governments should cooperate to establish and preserve this capacity. Moreover, for the most advanced models, developers should demonstrate to regulators’ satisfaction, both prior to deployment and during training, that their systems will not cross these red lines.
Reaching adequate safety levels for advanced AI will also require immense research progress. Advanced AI systems must be demonstrably aligned with their designers’ intent, as well as with appropriate norms and values. They must also be robust against both malicious actors and rare failure modes, and sufficient human control over these systems must be ensured. Concerted effort by the global research community, in AI and other disciplines alike, is essential; we need a global network of dedicated AI safety research and governance institutions. We call on leading AI developers to commit a minimum of one third of their AI R&D spending to AI safety, and on government agencies to fund academic and non-profit AI safety and governance research in at least the same proportion.
Signatories
Andrew Yao
Dean, Institute for Interdisciplinary Information Sciences
Tsinghua University
Distinguished Professor-At-Large
The Chinese University of Hong Kong
Professor, Center for Advanced Study
Tsinghua University
Turing Award Recipient
Yoshua Bengio
Professor, Université de Montréal
Founder and Scientific Director, Mila – Quebec AI Institute
Chair of the International Scientific Report on the Safety of Advanced AI
Turing Award Recipient
Stuart Russell
Professor and Smith-Zadeh Chair in Engineering
University of California, Berkeley
Founder, Center for Human-Compatible Artificial Intelligence (CHAI)
University of California, Berkeley
Ya-Qin Zhang
Chair Professor of AI Science
Tsinghua University
Dean, Institute for AI Industry Research (AIR)
Tsinghua University
Former President of Baidu
Ed Felten
Robert E. Kahn Professor of Computer Science and Public Affairs
Princeton University
Founding Director, Center for Information Technology Policy
Princeton University
Roger Grosse
Associate Professor of Computer Science
University of Toronto
Founding Member
Vector Institute
Gillian Hadfield
Incoming Professor, School of Government and Policy and School of Engineering
Johns Hopkins University
Professor of Law and Strategic Management at the University of Toronto
Sana Khareghani
Professor of Practice in AI
King’s College London
AI Policy Lead
Responsible AI UK
Former Head of UK Government Office for Artificial Intelligence
Dylan Hadfield-Menell
Bonnie and Marty (1964) Tenenbaum Career Development Assistant Professor of EECS, MIT
Lead, Algorithmic Alignment Group, Computer Science and Artificial Intelligence Laboratory (CSAIL), MIT
AI2050 Early Career Fellow
Karine Perset
Dawn Song
Professor of EECS
UC Berkeley
Founder
Oasis Labs
Xin Chen
PhD Student, ETH Zurich
Max Tegmark
Professor
MIT Center for Brains, Minds & Machines
President and Co-founder
Future of Life Institute
Elizabeth Seger
Research Scholar
Centre for the Governance of AI
Yi Zeng
Professor and Director of Brain-inspired Cognitive Intelligence Lab
Institute of Automation, Chinese Academy of Sciences
Founding Director
Center for Long-term AI
HongJiang Zhang
Founding Chairman
Beijing Academy of AI
Yang-Hui He
Fellow
London Institute for Mathematical Sciences
Adam Gleave
Founder and CEO, FAR AI
Fynn Heide
Executive Director, Safe AI Forum