IDAIS-Beijing, 2024

BEIJING, CHINA March 10th-11th

International scientists meet in Beijing to discuss extreme AI risks, recommend red lines for AI development and international cooperation

Leading global AI scientists convened in Beijing for the second International Dialogue on AI Safety (IDAIS-Beijing), hosted by the Safe AI Forum in collaboration with the Beijing Academy of AI (BAAI). During the event, computer scientists including Turing Award winners Yoshua Bengio, Andrew Yao, and Geoffrey Hinton and the Founding & current BAAI Chairmans HongJiang Zhang and Huang Tiejun worked with governance experts such as Tsinghua professor Xue Lan and University of Toronto professor Gillian Hadfield to chart a path forward on international AI safety.

The event took place over two days at the Aman Summer Palace in Beijing and focused on safely navigating the development of Artificial General Intelligence (AGI) systems. The first day involved technical and governance discussions of AI risk, where scientists shared research agendas in AI safety but also potential regulatory regimes. The discussion culminated in a consensus statement recommending a set of red lines for AI development to prevent catastrophic and existential risks from AI. In the consensus statement, the scientists advocate for prohibiting development of AI systems that can autonomously replicate, improve, seek power or deceive their creators, or those that enable building weapons of mass destruction and conducting cyberattacks. Additionally, the statement laid out a series of measures to be taken to ensure those lines are never crossed. The full statement can be read below. 

On the second day, the scientists met with senior Chinese officials and CEOs. The scientists presented the red lines proposal and discussed existential risks from artificial intelligence, and officials expressed enthusiasm about the consensus statement. Discussions focused on the necessity of international cooperation on this issue.

Statement

ENGLISH 中文声明

In the depths of the Cold War, international scientific and governmental coordination helped avert thermonuclear catastrophe. Humanity again needs to coordinate to avert a catastrophe that could arise from unprecedented technology.

Read the full statement

 

Consensus Statement on Red Lines in Artificial Intelligence

Unsafe development, deployment, or use of AI systems may pose catastrophic or even existential risks to humanity within our lifetimes. These risks from misuse and loss of control could increase greatly as digital intelligence approaches or even surpasses human intelligence.

In the depths of the Cold War, international scientific and governmental coordination helped avert thermonuclear catastrophe. Humanity again needs to coordinate to avert a catastrophe that could arise from unprecedented technology. In this consensus statement, we propose red lines in AI development as an international coordination mechanism, including the following non-exhaustive list. At future International Dialogues we will build on this list in response to this rapidly developing technology.

Autonomous Replication or Improvement

No AI system should be able to copy or improve itself without explicit human approval and assistance. This includes both exact copies of itself as well as creating new AI systems of similar or greater abilities.

Power Seeking

No AI system should take actions to unduly increase its power and influence.

Assisting Weapon Development

No AI systems should substantially increase the ability of actors to design weapons of mass destruction, or violate the biological or chemical weapons convention.

Cyberattacks

No AI system should be able to autonomously execute cyberattacks resulting in serious financial losses or equivalent harm.

Deception

No AI system should be able to consistently cause its designers or regulators to misunderstand its likelihood or capability to cross any of the preceding red lines.

Roadmap to Red Line Enforcement

Ensuring these red lines are not crossed is possible, but will require a concerted effort to  develop both improved governance regimes and technical safety methods.

Governance

Comprehensive governance regimes are needed to ensure red lines are not breached by developed or deployed systems. We should immediately implement domestic registration for AI models and training runs above certain compute or capability thresholds. Registrations should ensure governments have visibility into the most advanced AI in their borders and levers to stem distribution and operation of dangerous models.  

Domestic regulators ought to adopt globally aligned requirements to prevent crossing these red lines. Access to global markets should be conditioned on domestic regulations meeting these global standards as determined by an international audit, effectively preventing development and deployment of systems that breach red lines. 

We should take measures to prevent the proliferation of the most dangerous technologies while ensuring broad access to the benefits of AI technologies. To achieve this we should establish multilateral institutions and agreements to govern AGI development safely and inclusively with enforcement mechanisms to ensure red lines are not crossed and benefits are shared broadly.

Measurement and Evaluation

We should develop comprehensive methods and techniques to operationalize these red lines prior to there being a meaningful risk of them being crossed. To ensure red line testing regimes keep pace with rapid AI development, we should invest in red teaming and automating model evaluation with appropriate human oversight. 

The onus should be on developers to convincingly demonstrate that red lines will not be crossed such as through rigorous empirical evaluations, quantitative guarantees or mathematical proofs.

Technical Collaboration

The international scientific community must work together to address the technological and social challenges posed by advanced AI systems. We encourage building a stronger global technical network to accelerate AI safety R&D and collaborations through visiting researcher programs and organizing in-depth AI safety conferences and workshops. Additional funding will be required to support the growth of this field: we call for AI developers and government funders to invest at least one third of their AI R&D budget in safety.

Conclusion

Decisive action is required to avoid catastrophic global outcomes from AI. The combination of concerted technical research efforts with a prudent international governance regime could mitigate most of the risks from AI, enabling the many potential benefits. International scientific and government collaboration on safety must continue and grow.

Read less

在过去冷战最激烈的时候,国际科学界与政府间的合作帮助避免了热核灾难。面对前所未有的技术,人类需要再次合作以避免其可能带来的灾难的发生.

阅读完整声明

人工智能风险红线

人工智能系统不安全的开发、部署或使用,在我们的有生之年就可能给人类带来灾难性甚至生存性风险。随着数字智能接近甚至超越人类智能,由误用和失控所带来的风险将大幅增加。

在过去冷战最激烈的时候,国际科学界与政府间的合作帮助避免了热核灾难。面对前所未有的技术,人类需要再次合作以避免其可能带来的灾难的发生。在这份共识声明中,我们提出了几条人工智能发展作为一种国际协作机制的具体红线,包括但不限于下列问题。在未来的国际对话中,面对快速发展的人工智能技术,我们将继续完善对这些问题的探讨。

自主复制或改进

任何人工智能系统都不应能够在人类没有明确批准和协助的情况下复制或改进自身。这包括制作自身的精确副本以及创造具有相似或更高能力的新人工智能系统。

权力寻求

任何人工智能系统都不能采取不当地增加其权力和影响力的行动。

协助武器制造

所有人工智能系统都不应提升其使用者的能力使之能够设计大规模杀伤性武器,或违反生物或化学武器公约。

网络安全

任何人工智能系统都不应能够自主执行造成严重财务损失或同等伤害的网络攻击。

欺骗

任何人工智能系统都不能有持续引致其设计者或监管者误解其僭越任何前述红线的可能性或能力。

路线

确保这些红线不被僭越是可能做到的,但需要我们的共同努力:既要建立并改进治理机制,也要研发更多安全技术。

治理

我们需要全面的治理机制来确保开发或部署的系统不违反红线。我们应该立即实施针对超过特定计算或能力阈值的人工智能模型和训练行为的国家层面的注册要求。注册应确保政府能够了解其境内最先进的人工智能,并具备遏制危险模型分发和运营的手段。

国家监管机构应帮助采纳与全球对齐的要求以避免僭越这些红线。模型进入全球市场的权限应取决于国内法规是否基于国际审计达到国际标准,并有效防止了违反红线的系统的开发和部署。

我们应采取措施防止最危险技术的扩散,同时确保广泛收获人工智能技术的价值。为此,我们应建立多边机构和协议,安全且包容地治理通用人工智能(AGI)发展,并设立执行机制,以确保红线不被僭越,共同利益得到广泛分享。

测量与评估

在这些红线被僭越的实质性风险出现之前,我们应开发全面的方法和技术来使这些红线具体化、防范工作可操作化。为了确保对红线的检测技术能够跟上快速发展的人工智能,我们应该发展人类监督下的红队测试和自动化模型评估。

开发者有责任通过严格的实践评估、定量保证或数学证明来有力地证明人工智能系统未僭越红线。

技术合作

国际科学界必须共同合作,以应对高级人工智能系统带来的技术和社会挑战。我们鼓励建立更强大的全球技术网络,通过访问学者计划和组织深入的人工智能安全会议和研讨会,加速人工智能安全领域的研发和合作。支持这一领域的成长将需要更多资金:我们呼吁人工智能开发者和政府资助者至少将他们人工智能研发预算的三分之一投入到安全领域。

总结

避免人工智能导致的灾难性全球后果需要我们采取果断的行动。协同合作的技术研究与审慎的国际监管机制的结合可以缓解人工智能带来的大部分风险,并实现其诸多潜在价值。我们必须继续坚持并加强国际科学界和政府在安全方面的合作。

收起

Signatories

Geoffrey Hinton

A.M. Turing Award recipient

Chief Scientific Advisor

University of Toronto Vector Institute

Andrew Yao

A.M. Turing Award recipient

Dean of Institute for Interdisciplinary Information Sciences

Tsinghua University

Distinguished Professor-At-Large

The Chinese University of Hong Kong

Professor of Center for Advanced Study

Tsinghua University

Yoshua Bengio

A.M. Turing Award recipient

Scientific Director and Founder

Montreal Institute for Learning Algorithms

Professor, Department of CS and Operations Research

Université de Montréal

Ya-Qin Zhang

Chair Professor of AI Science

Tsinghua University

Dean of Institute for AI Industry Research

Tsinghua University (AIR)

Former President of Baidu

Fu Ying

Beijing, China

Stuart Russell

Professor of EECS

UC Berkeley

Founder and Head

Center for Human-Compatible Artificial Intelligence

Director

Kavli Center for Ethics, Science, and the Public

Xue Lan

Dean

Schwarzman College at Tsinghua University

Director

Institute for AI International Governance

Gillian Hadfield

Schwartz Reisman Chair in Technology and Society

University of Toronto

AI2050 Senior Fellow

HongJiang Zhang

Founding Chairman

Beijing Academy of AI

Tiejun Huang

Chairman

Beijing Academy of AI

Zeng Yi

Professor, Director

Brain-inspired Cognitive Intelligence Lab, Chinese Academy of Sciences

Founding Director

Center for Long-term AI

Robert Trager

Director

Oxford Martin AI Governance Initiative

Senior Research Fellow

Blavatnik School of Government

International Governance Lead

Centre for the Governance of AI

Kwok-Yan Lam

Professor

School of Computer Science and Engineering, Nanyang Technological University, Singapore

Executive Director

Digital Trust Centre, Singapore

Dawn Song

Professor of EECS

UC Berkeley

Founder

Oasis Lab

Zhongyuan Wang

Director

Beijing Academy of AI

Dylan Hadfield-Menell

Bonnie and Marty (1964) Tenenbaum Career Development Assistant

Professor of EECS, MIT

Lead, Algorithmic Alignment Group Computer Science and Artificial

Intelligence Laboratory (CSAIL), MIT

AI2050 Early Career Fellow

Yaodong Yang

Assistant Professor

Institute for AI, Peking University

Head

PKU Alignment and Interaction Research Lab (PAIR)

Zhang Peng

CEO

Zhipu AI

Li Hang

Beijing, China

Tian Tian

CEO

RealAI

Edward Suning Tian

Founder and Chairman

China Broadband Capital Partners LP (CBC)

Chairman

AsiaInfo Group

Toby Ord

Senior Research Fellow

University of Oxford

Fynn Heide

Research Scholar

Centre for the Governance of AI

Adam Gleave

Founder and CEO

FAR Al

IDAIS-Ditchley, 2023

DITCHLEY PARK, UK October 31st

In the Inaugural IDAIS, International Scientists Call for Global Action on AI Safety.

Ahead of the highly anticipated AI Safety Summit, leading AI scientists from the US, the PRC, the UK and other countries agreed on the importance of global cooperation and jointly called for research and policies to prevent unacceptable risks from advanced AI.

Prominent scientists gathered from the USA, the PRC, the UK, Europe, and Canada for the first International Dialogues on AI Safety. The meeting was convened by Turing Award winners Yoshua Bengio and Andrew Yao, UC Berkeley professor Stuart Russell, OBE, and founding Dean of the Tsinghua Institute for AI Industry Research Ya-Qin Zhang. The event took place at Ditchley Park near Oxford. Attendees worked to build a shared understanding of risks from advanced AI systems, inform intergovernmental processes, and lay the foundations for further cooperation to prevent worst-case outcomes from AI development.

Statement

ENGLISH 中文声明

Coordinated global action on AI safety research and governance is critical to prevent uncontrolled frontier AI development from posing unacceptable risks to humanity.”

Read the full statement

Global action, cooperation, and capacity building are key to managing risk from AI and enabling humanity to share in its benefits. AI safety is a global public good that should be supported by public and private investment, with advances in safety shared widely. Governments around the world — especially of leading AI nations — have a responsibility to develop measures to prevent worst-case outcomes from malicious or careless actors and to rein in reckless competition. The international community should work to create an international coordination process for advanced AI in this vein.

We face near-term risks from malicious actors misusing frontier AI systems, with current safety filters integrated by developers easily bypassed. Frontier AI systems produce compelling misinformation and may soon be capable enough to help terrorists develop weapons of mass destruction. Moreover, there is a serious risk that future AI systems may escape human control altogether. Even aligned AI systems could destabilize or disempower existing institutions. Taken together, we believe AI may pose an existential risk to humanity in the coming decades.

In domestic regulation, we recommend mandatory registration for the creation, sale or use of models above a certain capability threshold, including open-source copies and derivatives, to enable governments to acquire critical and currently missing visibility into emerging risks. Governments should monitor large-scale data centers and track AI incidents, and should require that AI developers of frontier models be subject to independent third-party audits evaluating their information security and model safety. AI developers should also be required to share comprehensive risk assessments, policies around risk management, and predictions about their systems’ behaviour in third party evaluations and post-deployment with relevant authorities.

We also recommend defining clear red lines that, if crossed, mandate immediate termination of an AI system — including all copies — through rapid and safe shut-down procedures. Governments should cooperate to instantiate and preserve this capacity. Moreover, prior to deployment as well as during training for the most advanced models, developers should demonstrate to regulators’ satisfaction that their system(s) will not cross these red lines.

Reaching adequate safety levels for advanced AI will also require immense research progress. Advanced AI systems must be demonstrably aligned with their designer’s intent, as well as appropriate norms and values. They must also be robust against both malicious actors and rare failure modes. Sufficient human control needs to be ensured for these systems. Concerted effort by the global research community in both AI and other disciplines is essential; we need a global network of dedicated AI safety research and governance institutions. We call on leading AI developers to make a minimum spending commitment of one third of their AI R&D on AI safety and for government agencies to fund academic and non-profit AI safety and governance research in at least the same proportion.

Read less

在人工智能安全研究和治理方面协调一致的全球行动对于防止不受控制的前沿人工智能发展给人类带来不可接受的风险至关重要。

阅读完整声明

全球行动、合作和能力建设是管理人工智能风险并使人类分享其利益的关键。人工智能安全是一项全球公共产品,应得到公共和私人投资的支持,并广泛分享安全方面的进步。世界各国政府——尤其是领先的人工智能国家——有责任制定措施,防止恶意或粗心行为者造成最坏的结果,并遏制不计后果的竞争。国际社会应以此为导向,努力建立先进人工智能的国际协调流程。

我们面临着恶意行为者滥用前沿人工智能系统的近期风险,而开发人员集成的当前安全过滤器很容易被绕过。前沿人工智能系统会产生令人信服的错误信息,并且可能很快就有能力帮助恐怖分子开发大规模杀伤性武器。此外,未来的人工智能系统可能完全摆脱人类控制,存在严重风险。即使一致的人工智能系统也可能破坏现有机构的稳定或削弱其权力。总的来说,我们相信人工智能可能在未来几十年对人类构成生存风险。

在国内监管中,我们建议对超过一定能力阈值的模型(包括开源副本和衍生品)的创建、销售或使用进行强制注册,以使政府能够获得对新兴风险的关键且目前缺失的可见性。政府应监控大规模数据中心并跟踪人工智能事件,并应要求前沿模型的人工智能开发者接受独立的第三方审计,评估其信息安全和模型安全性。还应要求人工智能开发人员与相关机构分享全面的风险评估、风险管理政策以及第三方评估和部署后系统行为的预测。

我们还建议定义明确的红线,如果跨越,则要求通过快速、安全的关闭程序立即终止人工智能系统(包括所有副本)。各国政府应合作实例化并保留这种能力。此外,在部署之前以及在最先进模型的培训期间,开发人员应向监管机构证明他们的系统不会跨越这些红线,令监管机构满意。

先进人工智能达到足够的安全水平还需要巨大的研究进展。先进的人工智能系统必须明显符合设计者的意图以及适当的规范和价值观。它们还必须能够抵御恶意行为者和罕见的故障模式。需要确保这些系统有足够的人为控制。全球研究界在人工智能和其他学科领域的共同努力至关重要;我们需要一个由专门的人工智能安全研究和治理机构组成的全球网络。我们呼吁领先的人工智能开发商将其人工智能研发的三分之一的最低支出承诺用于人工智能安全,并呼吁政府机构以至少相同的比例资助学术和非营利性人工智能安全和治理研究。

收起
International Dialogues on Al Safety International Dialogues on Al Safety

Signatories

Andrew Yao

Dean of Institute for Interdisciplinary Information Sciences

Tsinghua University

Distinguished Professor-At-Large

The Chinese University of Hong Kong

Professor of Center for Advanced Study

Tsinghua University

Turing Award Recipient

Yoshua Bengio

Scientific Director and Founder

Montreal Institute for Learning Algorithms

Professor, Department of CS and Operations Research

Université de Montréal

Turing Award Recipient

Stuart Russell

Professor of EECS

UC Berkeley

Founder and Head

Center for Human-Compatible Artificial Intelligence

Director

Kavli Center for Ethics, Science, and the Public

Ya-Qin Zhang

Chair Professor of AI Science

Tsinghua University

Dean of Institute for AI Industry Research

Tsinghua University (AIR)

Former President of Baidu

Ed Felten

Robert E. Kahn Professor of Computer Science and Public Affairs

Princeton University

Founding Director, Center for Information Technology Policy

Princeton University

Roger Grosse

Associate Professor of Computer Science

University of Toronto

Founding Member

Vector Institute

Gillian Hadfield

Schwartz Reisman Chair in Technology and Society

University of Toronto Faculty of Law

Director

Schwartz Reisman Institute for Technology and Society

AI2050 Senior Fellow

Sana Khareghani

Professor of Practice in AI

King’s College London

AI Policy Lead

Responsible AI UK

Former Head of UK Government Office for Artificial Intelligence

Dylan Hadfield-Menell

Bonnie and Marty (1964) Tenenbaum Career Development Assistant

Professor of EECS, MIT

Lead, Algorithmic Alignment Group Computer Science and Artificial

Intelligence Laboratory (CSAIL), MIT

Karine Perset

Research Scholar

Professor of EECS, MITCentre for the Governance of AI

Dawn Song

Professor of EECS

UC Berkeley

Founder

Oasis Lab

Xin Chen

PhD student

ETH Zurich

Max Tegmark

Professor

MIT Center for Brains, Minds & Machines

President and Co-founder

Future of Life Institute

Elizabeth Seger

Research Scholar

Centre for the Governance of AI

Yi Zeng

Professor and Director of Brain-inspired Cognitive Intelligence Lab

Institute of Automation, Chinese Academy of Sciences

Founding Director

Center for Long-term AI

HongJiang Zhang

Chairman

Beijing Academy of AI

Yang-Hui He

Fellow

London Institute

Adam Gleave

Founder and CEO

FAR Al

Fynn Heide

Research Scholar

Centre for the Governance of AI

Upcoming Events

TBA Fall 2024

We plan to host our third IDAIS event in the fall of 2024. For more details, contact us using the form below.

About

The International Dialogues on AI Safety (IDAIS) bring together leading scientists from around the world to collaborate on mitigating risks from AI. The inaugural IDAIS event in October 2023 was convened by Turing Award winners Yoshua Bengio and Andrew Yao, UC Berkeley professor Stuart Russell, OBE, and founding Dean of the Tsinghua Institute for AI Industry Research Ya-Qin Zhang. IDAIS is supported by the Safe AI Forum, an organization co-founded by Fynn Heide and Conor McGurk and fiscally sponsored by FAR AI.

Please use the form below to get in touch.