In the late 1980s, when US President Ronald Reagan negotiated arms control treaties with the Soviet Union, he frequently employed the phrase “trust, but verify.” The phrase comes from a Russian proverb (doveryai, no proveryai) and emphasizes the importance of verification in international agreements.
Today, many technology and policy leaders are beginning to imagine international agreements that aim to promote the safe and responsible development of advanced AI.
As my colleagues and I note in a recent research paper, many are striving to devise ways for humanity to avoid a race toward artificial superintelligence, citing concerns that this could result in global catastrophe. As summarized by Turing Award winner Yoshua Bengio: “The most important thing to realize, through all the noise of discussions and debates, is a very simple and indisputable fact: while we are racing towards AGI [artificial general intelligence] or even ASI [artificial superintelligence], nobody currently knows how such an AGI or ASI could be made to behave morally, or at least behave as intended by its developers and not turn against humans” (emphasis in original).
Tech leaders such as Sam Altman have called for the establishment of a body similar to the International Atomic Energy Agency (IAEA) for AI regulation. Google DeepMind CEO Demis Hassabis recently described a “CERN for AI” model (CERN is the European Organization for Nuclear Research) as the best path to achieving advanced AI safety. CIGI has expanded on these ideas in its publication “Framework Convention for Global AI Challenges.” A team led by Duncan Cass-Beggs has begun the process of imagining international AI governance institutions and joint labs for advanced AI development.
These proposals are not attainable immediately, of course. But as policy makers pay more attention to frontier AI development and better understand the potential risks, a moment may arrive at which such international proposals become achievable. With that in mind, we’ve examined the question of verification.
“Trust, but Verify,” Applied to AI
“At the signing ceremony, Mr. Reagan emphasized the extensive verification procedures that would enable both sides to monitor compliance with the treaty.” — The New York Times’ coverage of the signing of the Intermediate-Range Nuclear Forces Treaty (INF Treaty)
What would Reagan’s “trust, but verify” approach look like when applied to AI development? Our research team examined 10 techniques nations could use to detect non-compliance with potential international agreements.
Some methods can be implemented directly by countries without any additional agreements (national technical means). Others require approval from the nation being investigated (access-dependent methods). And a third category relies on agreements relating to advanced hardware (hardware-dependent methods).
Within those three categories, the 10 verification methods can be summarized as follows (for more details, see our paper):
National Technical Means
- Remote sensing: Detect unauthorized data centres and semiconductor manufacturing via visual and thermal signatures.
- Whistle-blowers: Incentivize insiders to report non-compliance.
- Energy monitoring: Detect power consumption patterns that suggest the potential presence of large clusters of graphics processing units (GPUs); a simple detection sketch follows this list.
- Customs data analysis: Track the movement of critical AI hardware and raw materials.
- Financial intelligence: Monitor large financial transactions related to AI development.
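To make the energy-monitoring idea concrete, here is a minimal detection sketch. It assumes, purely for illustration, that a large training cluster draws tens of megawatts continuously for weeks to months; the 30 MW threshold, the 90-day window, and the facility readings are hypothetical parameters, not values from our paper.

```python
# Illustrative sketch only: flags facilities whose sustained power draw is
# consistent with a large AI training cluster. The threshold and window are
# hypothetical, not treaty parameters from the paper.

from statistics import mean

# Rough assumption: ~1 kW per accelerator including cooling and overhead,
# so a 10,000-chip cluster draws on the order of 10 MW continuously.
SUSTAINED_MW_THRESHOLD = 30.0  # hypothetical treaty-relevant threshold
WINDOW_DAYS = 90               # large training runs last weeks to months

def flag_suspicious_facilities(daily_mw_by_facility: dict[str, list[float]]) -> list[str]:
    """Return facility IDs whose average draw over any 90-day window
    exceeds the threshold (a crude proxy for a large GPU cluster)."""
    flagged = []
    for facility, series in daily_mw_by_facility.items():
        for start in range(max(1, len(series) - WINDOW_DAYS + 1)):
            window = series[start:start + WINDOW_DAYS]
            if len(window) == WINDOW_DAYS and mean(window) > SUSTAINED_MW_THRESHOLD:
                flagged.append(facility)
                break
    return flagged

# Example: a facility that ramps from 5 MW to a steady 40 MW gets flagged.
readings = {"site-A": [5.0] * 30 + [40.0] * 120, "site-B": [8.0] * 150}
print(flag_suspicious_facilities(readings))  # ['site-A']
```

In practice, such signals would be noisy and would need to be combined with other national technical means (for example, remote sensing of the same sites) before drawing any conclusions.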
Access-Dependent Methods
- Data centre inspections: Conduct inspections of sites to assess the size of a data centre and to verify compliance with hardware agreements and with other safety and security agreements.
- Semiconductor manufacturing facility inspections: Conduct inspections of sites to determine the quantity of chip production and verify that chip production conforms to any agreements around advanced hardware.
- AI developer inspections: Conduct inspections of AI development facilities via interviews, document and training transcript audits, and potential code reviews (a compute-audit sketch follows this list).
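As one concrete example of what a training transcript audit could check, the sketch below estimates a run’s total training compute using the standard approximation of roughly 6 FLOP per parameter per training token for dense transformer models, then compares it against a declared cap. The 10^26 FLOP threshold is a hypothetical treaty parameter, not a figure from our paper.

```python
# Illustrative sketch only: estimates training compute from declared model
# size and dataset size via the standard ~6 * N * D approximation for dense
# transformers, then checks it against a hypothetical treaty threshold.

TREATY_FLOP_THRESHOLD = 1e26  # hypothetical compute cap, not from the paper

def estimated_training_flop(parameters: float, tokens: float) -> float:
    """Approximate total training FLOP for a dense transformer:
    ~6 FLOP per parameter per training token."""
    return 6.0 * parameters * tokens

def audit_run(parameters: float, tokens: float) -> bool:
    """Return True if the declared run appears to stay under the cap."""
    return estimated_training_flop(parameters, tokens) <= TREATY_FLOP_THRESHOLD

# Example: a 400B-parameter model trained on 15T tokens
# => 6 * 4e11 * 1.5e13 = 3.6e25 FLOP, under the hypothetical 1e26 cap.
print(audit_run(parameters=4e11, tokens=1.5e13))  # True
```

A real audit would, of course, also need to verify that the declared parameter and token counts match the training transcripts, which is part of why on-site access matters.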
Hardware-Dependent Methods
- Chip location tracking: Automatically track the locations of advanced AI chips.
- Chip-based reporting: Have chips automatically report if they are used for unauthorized purposes (a toy protocol sketch follows this list).
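To illustrate what chip-based reporting might involve, here is a toy protocol sketch: each chip periodically emits a signed “heartbeat” naming its current workload, and a monitor verifies the signature and flags missing or unauthorized reports. A real hardware-enabled mechanism would rely on a hardware root of trust and asymmetric attestation; the shared-key HMAC below is used only to keep the example self-contained.

```python
# Illustrative protocol sketch only. All identifiers, keys, and workload
# names are hypothetical; real mechanisms would use hardware-backed
# asymmetric attestation rather than a shared HMAC key.

import hashlib
import hmac
import json
import time

CHIP_KEY = b"per-chip-secret-provisioned-at-fab"  # hypothetical
AUTHORIZED_WORKLOADS = {"inference", "approved-training-run-0042"}

def sign_heartbeat(chip_id: str, workload: str, key: bytes = CHIP_KEY) -> dict:
    """What the chip's firmware would emit every reporting interval."""
    payload = {"chip_id": chip_id, "workload": workload, "ts": int(time.time())}
    msg = json.dumps(payload, sort_keys=True).encode()
    payload["sig"] = hmac.new(key, msg, hashlib.sha256).hexdigest()
    return payload

def verify_heartbeat(hb: dict, key: bytes = CHIP_KEY) -> str:
    """Monitor-side check: valid signature and authorized workload."""
    claimed_sig = hb.pop("sig", "")
    msg = json.dumps(hb, sort_keys=True).encode()
    expected = hmac.new(key, msg, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(claimed_sig, expected):
        return "ALERT: tampered or forged report"
    if hb["workload"] not in AUTHORIZED_WORKLOADS:
        return "ALERT: unauthorized workload"
    return "ok"

hb = sign_heartbeat("chip-7f3a", "approved-training-run-0042")
print(verify_heartbeat(hb))  # ok
```

The hard problems here are not cryptographic but physical: preventing tampering with the reporting mechanism itself, which is one of the open questions discussed below.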
What’s Next?
Our research is a first step toward building common knowledge and understanding around verification. It is intended to stimulate further work.
Many open questions remain, in particular about the robustness of verification regimes, the monitoring and enforcement of international agreements, and the feasibility of hardware-based verification.
How Robust Are International Verification Regimes?
Future work could explore how the verification methods could be combined into a comprehensive verification regime. Furthermore, such work could examine how adversaries might try to evade or circumvent the verification regime. Red-teaming and blue-teaming exercises could be used to anticipate potential strategies by adversaries and improve the robustness of verification regimes.
What Institutions Could Help Monitor and Enforce International AI Agreements?
The IAEA helps to monitor violations of the Treaty on the Non-Proliferation of Nuclear Weapons, and the Organisation for the Prohibition of Chemical Weapons helps to monitor violations of the Chemical Weapons Convention. Is a new international institution needed to govern the development of advanced AI or to detect violations of potential AI treaties? If so, how should such an institution be governed, and how should it resolve potential disputes between nations?
How Feasible Are Hardware-Dependent Methods?
Further technical and policy work on hardware-enabled mechanisms could help us understand the feasibility of various kinds of hardware-dependent verification methods. What kinds of features can be implemented via advanced hardware? How difficult is it to prevent or detect efforts to tamper with hardware-enabled mechanisms? Which hardware-enabled mechanisms could already be implemented, and which ones require additional research and development? (Interested readers should see this RAND report on hardware-enabled governance mechanisms.)
During the Cold War, the “trust, but verify” approach was crucial to negotiating key agreements on nuclear weapons, including the INF Treaty and the Strategic Arms Reduction treaties. These agreements were not perfect, and they faced serious challenges during periods of heightened geopolitical tension. But all things considered, they set an important precedent that future work can draw from.
Today, nations face important questions about how to ensure the safe and responsible development of advanced AI. Initial steps toward international coordination have already taken place: nations have convened global AI Safety Summits, established an international network of AI Safety Institutes, held Track I and Track II dialogues on AI safety, and commissioned an International Scientific Report on the Safety of Advanced AI.
As these efforts come to fruition, we can expect to see nations seriously consider international governance agreements. Humanity’s ability to reliably monitor compliance will be essential in moving these discussions forward.
This article was drawn from a recent research paper led by CIGI Senior Research Affiliate Akash Wasil. For questions or collaboration requests, please contact the author.