IDAIS-Beijing, 2024
Statement
In the depths of the Cold War, international scientific and governmental coordination helped avert thermonuclear catastrophe. Humanity again needs to coordinate to avert a catastrophe that could arise from unprecedented technology.”
Consensus Statement on Red Lines in Artificial Intelligence
Unsafe development, deployment, or use of AI systems may pose catastrophic or even existential risks to humanity within our lifetimes. These risks from misuse and loss of control could increase greatly as digital intelligence approaches or even surpasses human intelligence.
In the depths of the Cold War, international scientific and governmental coordination helped avert thermonuclear catastrophe. Humanity again needs to coordinate to avert a catastrophe that could arise from unprecedented technology. In this consensus statement, we propose red lines in AI development as an international coordination mechanism, including the following non-exhaustive list. At future International Dialogues we will build on this list in response to this rapidly developing technology.
Autonomous Replication or Improvement
No AI system should be able to copy or improve itself without explicit human approval and assistance. This includes both exact copies of itself as well as creating new AI systems of similar or greater abilities.
Power Seeking
No AI system should take actions to unduly increase its power and influence.
Assisting Weapon Development
No AI systems should substantially increase the ability of actors to design weapons of mass destruction, or violate the biological or chemical weapons convention.
Cyberattacks
No AI system should be able to autonomously execute cyberattacks resulting in serious financial losses or equivalent harm.
Deception
No AI system should be able to consistently cause its designers or regulators to misunderstand its likelihood or capability to cross any of the preceding red lines.
Roadmap to Red Line Enforcement
Ensuring these red lines are not crossed is possible, but will require a concerted effort to develop both improved governance regimes and technical safety methods.
Governance
Comprehensive governance regimes are needed to ensure red lines are not breached by developed or deployed systems. We should immediately implement domestic registration for AI models and training runs above certain compute or capability thresholds. Registrations should ensure governments have visibility into the most advanced AI in their borders and levers to stem distribution and operation of dangerous models.
Domestic regulators ought to adopt globally aligned requirements to prevent crossing these red lines. Access to global markets should be conditioned on domestic regulations meeting these global standards as determined by an international audit, effectively preventing development and deployment of systems that breach red lines.
We should take measures to prevent the proliferation of the most dangerous technologies while ensuring broad access to the benefits of AI technologies. To achieve this we should establish multilateral institutions and agreements to govern AGI development safely and inclusively with enforcement mechanisms to ensure red lines are not crossed and benefits are shared broadly.
Measurement and Evaluation
We should develop comprehensive methods and techniques to operationalize these red lines prior to there being a meaningful risk of them being crossed. To ensure red line testing regimes keep pace with rapid AI development, we should invest in red teaming and automating model evaluation with appropriate human oversight.
The onus should be on developers to convincingly demonstrate that red lines will not be crossed such as through rigorous empirical evaluations, quantitative guarantees or mathematical proofs.
Technical Collaboration
The international scientific community must work together to address the technological and social challenges posed by advanced AI systems. We encourage building a stronger global technical network to accelerate AI safety R&D and collaborations through visiting researcher programs and organizing in-depth AI safety conferences and workshops. Additional funding will be required to support the growth of this field: we call for AI developers and government funders to invest at least one third of their AI R&D budget in safety.
Conclusion
Decisive action is required to avoid catastrophic global outcomes from AI. The combination of concerted technical research efforts with a prudent international governance regime could mitigate most of the risks from AI, enabling the many potential benefits. International scientific and government collaboration on safety must continue and grow.
在过去冷战最激烈的时候,国际科学界与政府间的合作帮助避免了热核灾难。面对前所未有的技术,人类需要再次合作以避免其可能带来的灾难的发生.
人工智能红线共识声明
人工智能系统的不安全开发、部署或使用可能会在我们有生之年给人类带来灾难性甚至生存风险。随着数字智能接近甚至超越人类智能,这些滥用和失控带来的风险可能会大大增加。
在冷战最激烈的时期,国际科学和政府的协调帮助避免了热核灾难。人类再次需要协调,以避免前所未有的技术可能引发的灾难。在这份共识声明中,我们提出了人工智能发展作为国际协调机制的红线,包括以下非详尽清单。在未来的国际对话中,我们将在此基础上进一步发展,以应对这一快速发展的技术。
自主复制或改进
如果没有明确的人类批准和帮助,任何人工智能系统都不应能够复制或改进自身。这包括自身的精确副本以及创建具有相似或更强能力的新人工智能系统。
寻求权力
任何人工智能系统都不应采取行动过度增加其权力和影响力。
协助武器开发
任何人工智能系统都不应大幅提高行为者设计大规模杀伤性武器或违反生物或化学武器公约的能力。
网络攻击
任何人工智能系统都不应能够自主执行网络攻击,从而导致严重的财务损失或同等伤害。
欺骗
任何人工智能系统都不应该能够持续导致其设计者或监管者误解其跨越任何前述红线的可能性或能力。
红线执法路线图
确保不跨越这些红线是可能的,但需要共同努力制定改进的治理制度和技术安全方法。
治理
需要全面的治理制度,以确保已开发或部署的系统不会违反红线。我们应该立即对超过一定计算或能力阈值的人工智能模型和训练运行进行国内注册。注册应确保政府能够了解其境内最先进的人工智能以及阻止危险模型传播和运行的手段。
国内监管机构应采用全球统一的要求,以防止跨越这些红线。进入全球市场的条件应以符合国际审计确定的这些全球标准的国内法规为条件,从而有效防止违反红线的系统的开发和部署。
我们应该采取措施防止最危险技术的扩散,同时确保广泛获得人工智能技术的好处。为了实现这一目标,我们应该建立多边机构和协议,通过执行机制来安全、包容地管理通用人工智能的发展,以确保不跨越红线并广泛分享利益。
测量和评估
我们应该开发全面的方法和技术,在出现跨越这些红线的重大风险之前实施这些红线。为了确保红线测试制度跟上人工智能的快速发展,我们应该在适当的人工监督下投资于红队和自动化模型评估。
开发人员有责任令人信服地证明红线不会被跨越,例如通过严格的经验评估、定量保证或数学证明。
技术合作
国际科学界必须共同努力,应对先进人工智能系统带来的技术和社会挑战。我们鼓励建立更强大的全球技术网络,通过访问研究人员项目和组织深入的人工智能安全会议和研讨会,加速人工智能安全的研发和合作。需要额外的资金来支持该领域的发展:我们呼吁人工智能开发商和政府资助者将至少三分之一的人工智能研发预算投入到安全领域。
结论
需要采取果断行动,以避免人工智能造成灾难性的全球后果。协调一致的技术研究工作与审慎的国际治理制度相结合可以减轻人工智能的大部分风险,从而带来许多潜在的好处。国际科学和政府在安全方面的合作必须继续并加强。
Signatories
Geoffrey Hinton
Chief Scientific Advisor
University of Toronto Vector Institute
Turing Award Winner
Andrew Yao
Dean of Institute for Interdisciplinary Information Sciences
Tsinghua University
Distinguished Professor-At-Large
The Chinese University of Hong Kong
Professor of Center for Advanced Study
Tsinghua University
Turing Award Recipient
Yoshua Bengio
Professor at the Université de Montréal;
the Founder and Scientific Director of Mila
Quebec AI Institute,
Chair of the International Scientific Report on the Safety of Advanced AI
Turing Award Winner
Ya-Qin Zhang
Chair Professor of AI Science
Tsinghua University
Dean of Institute for AI Industry Research
Tsinghua University (AIR)
Former President of Baidu
Fu Ying 傅莹
Stuart Russell
Professor and Smith-Zadeh Chair
in Engineering at the University of California, Berkeley
Founder of Center for Human-Compatible Artificial Intelligence (CHAI)
at the University of California, Berkeley
Xue Lan 薛澜
Dean
Schwarzman College at Tsinghua University
Director
Institute for AI International Governance
Gillian Hadfield
Incoming Professor at the School of Government
and Policy and School of Engineering of the Johns Hopkins University
Professor of Law and Strategic Management at the University of Toronto
HongJiang Zhang
Founding Chairman
Beijing Academy of AI
Tiejun Huang
Chairman
Beijing Academy of AI
Zeng Yi 曾毅
Director of the International Research Center for AI Ethics and Governance
and Deputy Director of the Research Center for Brain-inspired Intelligence
Institute of Automation,
Chinese Academy of Sciences (CAS),
Member of United Nations High-level Advisory Body on AI
Member of UNESCO High-level Expert Group on Implementation of AI Ethics
Robert Trager
Co-Director of the Oxford Martin AI Governance Initiative
International Governance Lead at the Centre for the Governance of AI
Kwok-Yan Lam
Associate Vice President (Strategy and Partnerships)
at Nanyang Technological University (NTU), Singapore,
Executive Director of the Digital Trust Centre (DTC)
designated as Singapore’s AI Safety Institute,
Professor, School of Computer Science and Engineering
Nanyang Technological University (NTU), Singapore
Dawn Song
Professor of EECS
UC Berkeley
Founder
Oasis Lab
Zhongyuan Wang
Director
Beijing Academy of AI
Dylan Hadfield-Menell
Bonnie and Marty (1964) Tenenbaum Career Development Assistant
Professor of EECS, MIT
Lead, Algorithmic Alignment Group Computer Science and Artificial
Intelligence Laboratory (CSAIL), MIT
AI2050 Early Career Fellow
Yaodong Yang
Assistant Professor
Institute for AI, Peking University
Head
PKU Alignment and Interaction Research Lab (PAIR)
Zhang Peng
CEO
Zhipu AI
Li Hang
Beijing, China
Tian Tian
CEO
RealAI
Edward Suning Tian
Founder and Chairman
China Broadband Capital Partners LP (CBC)
Chairman
AsiaInfo Group
Toby Ord
Senior Research Fellow
University of Oxford
Fynn Heide
Executive Director, Safe AI Forum
Adam Gleave
Founder and CEO, FAR AI