Voluntary commitments insufficient to end race to godlike AI

Binding policies are necessary to mitigate risks from unfettered development

Earlier this year, many artificial intelligence experts signed a letter stating that extinction risks from AI should be a global priority. Even executives of AI companies have stated that they believe smarter-than-human AI systems are the ‘greatest threat to the existence of humanity’, and that AI has a 10% to 25% chance of causing a civilisation-wide catastrophe. It is clear that these risks are credible and worth taking seriously.

Catastrophic risks could be mitigated if AI experts had sufficient time, patience and caution to ensure that they know how to control extremely powerful systems. Right now, the frontier of AI development is not characterised by caution or patience. It is characterised by a few powerful AI companies racing towards godlike AI, pouring billions of dollars into the development of increasingly powerful systems.

There are public policy solutions that could substantially reduce this reckless race towards AI catastrophe. However, so far, AI companies are not advocating for them – and some evidence suggests that they are actively opposing meaningful regulations.

Some AI companies believe that they should be allowed to continue racing, as long as they commit to running tests that could help us identify dangerous systems. This belief has been codified via responsible scaling policies (RSPs). The Alignment Research Center (ARC), a non-profit organisation that works closely with frontier AI companies, released a responsible scaling policy framework. Anthropic, one of the leading AI labs, has released a responsible scaling policy that was developed in conjunction with ARC.

The RSP describes some voluntary commitments from Anthropic. The lab plans to run evaluations designed to detect dangerous capabilities at least once per 400% increase in the computational resources its models use. Additionally, once Anthropic has systems that ‘increase the risk of catastrophic misuse’ (such as systems that can enable large-scale biological attacks), it will describe what it plans to do to control even more dangerous systems (systems that can escape human control).

Unfortunately, it is easy to overestimate the utility of RSPs.

Three problems with the responsible scaling plan

First, catastrophic risks should be managed by governments, not big tech companies. There is nothing wrong in principle with companies making voluntary commitments. However, RSPs should not be used as a justification to avoid governmental involvement. Frontier AI labs are run by individuals who believe their technology has at least a 10% chance of causing a catastrophe, and that their AI systems will be able to enable large-scale biological attacks within two to three years.

We would not let big tech chief executive officers run nuclear facilities or supervirus labs on the basis of voluntary commitments. When the risks are this high, society recognises that governments ought to get involved and provide oversight.

Second, the RSP plan relies on tests that do not exist. In many other fields, we know how to measure danger. For example, the aviation industry and the nuclear energy field have long and detailed protocols; we know how to detect unsafe airplanes and power plants. In AI safety, we are far behind. AI scientists do not have a comprehensive list of catastrophically dangerous capabilities.

Even among the dangerous capabilities we are aware of, we do not have reliable or verifiable tests. As Ian Hogarth, chair of the UK’s AI Foundation Model Taskforce, has noted, progress in AI capabilities has substantially outpaced progress in safety research. There is a high chance that AI capabilities research continues to outpace our ability to understand or detect dangers in these systems, dealing a major blow to the RSP plan.

Third, companies will break their RSPs if they are worried about ‘irresponsible’ competitors winning the race to godlike AI. Suppose a dangerous capability evaluation is triggered; what happens next? Ideally, the AI company would be able to stop further development until it understands what went wrong and how to fix it. When it comes to controlling smarter-than-human AI systems, this might take many years of careful, cautious AI safety research.

However, in the context of a race to godlike AI, no company will be able to afford that amount of time. If Company A needs two years to figure out how to control its system, but it fears that Company B is only two months behind, Company A will feel pressure to violate its RSP and scale up anyway. This is why both ARC and Anthropic have written that companies are allowed to break their RSPs if they are worried about irresponsible competitors getting to godlike AI. RSPs are voluntary, and companies even admit that they will break them.

A positive vision for AI regulation

I commend ARC and Anthropic for looking for ways to address the dangers of this technology. Nonetheless, I think the limitations of RSPs have not been communicated adequately, and I hope these groups (and others) ensure that policy-makers are aware that voluntary commitments are insufficient.

What would better regulation look like? To stop the race to godlike AI and allow for sufficient time for AI safety research, we need government regulation and international coordination. In particular, we need the government to be able to respond to an AI-related emergency.

There are a few proposals that could achieve this. First, some are calling for a global compute cap: a ban on AI development above a certain amount of computing resources. This would be the most robust way to limit dangerous AI development while allowing society to reap the benefits of current AI systems.

Second, in the US, some senators have called for a licensing agency for frontier AI systems. While less robust than a global compute cap, a licensing agency could allow government regulators to prohibit the development and deployment of systems unless companies have shown that they are sufficiently safe. Such an agency could also have emergency powers that allow it to implement a national compute cap in response to an AI-related emergency or evidence of imminent risks.

Third, some have called for an International Atomic Energy Agency-like organisation that would regulate dangerous AI development like society regulates dangerous nuclear technologies. A group of forecasters recently proposed a similar idea via an international treaty.

Such proposals raise important questions about the concentration of AI development: which parties should be allowed to develop such dangerous technologies, and how the international community can coordinate around these risks. Nonetheless, the world has come together to regulate dangerous technologies in the past, and it will need to do so again with AI. While voluntary commitments can be helpful, we should not let them distract us from what we need: binding regulations that curtail the race to godlike AI.

Akash Wasil is an artificial intelligence governance and policy researcher.

This article was published in the Autumn 2023 edition of the Bulletin.
