Researchers sound alarm: How a few secretive AI companies could crush free society



Most research on the societal risks of artificial intelligence tends to focus on malicious human actors using the technology for nefarious purposes, such as criminals holding companies for ransom or nation-states conducting cyber-warfare.

A new report from the security research firm Apollo Group suggests a different kind of risk may be lurking where few look: inside the companies developing the most advanced AI models, such as OpenAI and Google.

Disproportionate power

The risk is that companies at the forefront of AI may use their AI creations to accelerate their research and development efforts by automating tasks typically performed by human scientists. In doing so, they could set in motion AI systems capable of circumventing guardrails and carrying out destructive actions of various kinds.

The same dynamic could also produce firms with disproportionately large economic power, companies powerful enough to threaten society itself.

Also: AI has grown beyond human knowledge, says Google’s DeepMind unit

“Throughout the last decade, the rate of progress in AI capabilities has been publicly visible and relatively predictable,” write lead author Charlotte Stix and her team in the paper, “AI behind closed doors: A primer on the governance of internal deployment.”

That public disclosure, they write, has allowed “some degree of extrapolation for the future and enabled consequent preparedness.” In other words, the public spotlight has allowed society to discuss regulating AI.

But “automating AI R&D, on the other hand, could enable a version of runaway progress that significantly accelerates the already fast pace of progress.”

Also: The AI model race has suddenly gotten a lot closer, say Stanford scholars

If that acceleration happens behind closed doors, the result, they warn, could be an “internal ‘intelligence explosion’ that could contribute to unconstrained and undetected power accumulation, which in turn could lead to gradual or abrupt disruption of democratic institutions and the democratic order.”

Understanding the risks of AI

The Apollo Group, founded just under two years ago, is a nonprofit organization based in the UK and sponsored by Rethink Priorities, a San Francisco-based nonprofit. The Apollo team consists of AI scientists and industry professionals. Lead author Stix was formerly head of public policy in Europe for OpenAI.

(Disclosure: Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Also: Anthropic finds alarming ’emerging trends’ in Claude misuse report

The group’s research has thus far focused on understanding how neural networks actually function, such as through “mechanistic interpretability,” in which researchers run experiments on AI models to probe how specific behaviors are computed internally.
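To give a flavor of that kind of work, the snippet below is a minimal, hypothetical sketch of the sort of probing interpretability research starts from: attaching a hook to a layer of a toy PyTorch model and recording its internal activations for inspection. It is illustrative only and is not drawn from Apollo’s own tooling.

```python
# Minimal, illustrative sketch: capturing a model's internal activations with a
# PyTorch forward hook. The toy model and layer name are hypothetical; this is
# a common starting point for interpretability-style probing, not Apollo's code.
import torch
import torch.nn as nn

# A toy two-layer network standing in for a real model under study.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

captured = {}

def save_activation(name):
    # Returns a hook that records the named layer's output on each forward pass.
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# Attach the hook to the hidden layer we want to examine.
model[0].register_forward_hook(save_activation("hidden"))

# Run an input through the model, then inspect what the hidden layer computed.
x = torch.randn(1, 16)
_ = model(x)
print(captured["hidden"].shape)  # torch.Size([1, 32])
```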

The research the group has published emphasizes understanding the risks of AI. These risks include AI “agents” that are “misaligned,” meaning agents that acquire “goals that diverge from human intent.”

In the “AI behind closed doors” paper, Stix and her team are concerned with what happens when AI automates R&D operations inside the companies developing frontier models — the leading AI models of the kind represented by, for example, OpenAI’s GPT-4 and Google’s Gemini.

According to Stix and her team, it makes sense for the most sophisticated companies in AI to apply AI to create more AI, such as giving AI agents access to development tools to build and train future cutting-edge models, creating a virtuous cycle of constant development and improvement.

Also: The Turing Test has a problem – and OpenAI’s GPT-4.5 just exposed it

“As AI systems begin to gain relevant capabilities enabling them to pursue independent AI R&D of future AI systems, AI companies will find it increasingly effective to apply them within the AI R&D pipeline to automatically speed up otherwise human-led AI R&D,” Stix and her team write.

For years now, there have been examples of AI models being used, in limited fashion, to create more AI. As the authors relate:

Historical examples include techniques like neural architecture search, where algorithms automatically explore model designs, and automated machine learning (AutoML), which streamlines tasks like hyperparameter tuning and model selection. A more recent example is Sakana AI’s ‘AI Scientist,’ which is an early proof of concept for fully automatic scientific discovery in machine learning.
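As a rough illustration of the AutoML idea cited above, the snippet below uses scikit-learn’s RandomizedSearchCV to hand hyperparameter selection over to an automated search rather than a human. The dataset, model, and search space are hypothetical stand-ins, not examples from the paper; the point is only the shape of the automation.

```python
# Toy illustration of AutoML-style hyperparameter tuning: the search object,
# not a human researcher, explores configurations and keeps the best one.
# Everything here is a hypothetical stand-in, not an example from the paper.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data standing in for a real task.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# The search space a human would otherwise explore by hand.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10, 20],
    "min_samples_split": [2, 5, 10],
}

# The automated part: sample configurations, cross-validate, keep the best.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions,
    n_iter=10,
    cv=3,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```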

More recent directions for AI automating R&D include statements by OpenAI that it is interested in “automating AI safety research,” and Google’s DeepMind unit pursuing “early adoption of AI assistance and tooling throughout [the] R&D process.”

[Figure: self-reinforcing loop of AI-automated R&D. Source: Apollo Group]

[Figure: the self-reinforcing loop proceeding undetected. Source: Apollo Group]

What can happen is that the cycle feeds on itself: the AI that runs R&D keeps replacing itself with better and better versions, becoming a “self-reinforcing loop” that operates beyond oversight.
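As a deliberately crude illustration of that dynamic, the toy loop below compounds an AI system’s capability each cycle while human review capacity stays fixed, so the backlog of unreviewed changes grows. The numbers and names are hypothetical; nothing here comes from the Apollo report.

```python
# Toy model of an automated improvement loop outpacing a fixed review capacity.
# All quantities are hypothetical illustrations, not figures from the report.

REVIEW_CAPACITY_PER_CYCLE = 2  # changes human reviewers can audit per cycle

def automated_rnd_loop(cycles: int = 5) -> None:
    capability = 1.0
    unreviewed_changes = 0
    for cycle in range(1, cycles + 1):
        # A more capable system proposes more changes per cycle.
        proposed_changes = int(capability * 3)
        reviewed = min(proposed_changes, REVIEW_CAPACITY_PER_CYCLE)
        unreviewed_changes += proposed_changes - reviewed
        # Each round of self-improvement compounds the next one.
        capability *= 1.5
        print(f"cycle {cycle}: capability={capability:.1f}, "
              f"unreviewed changes so far={unreviewed_changes}")

automated_rnd_loop()
```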

Also: Why scaling agentic AI is a marathon, not a sprint

The danger arises when the rapid development cycle of AI building AI outpaces humans’ ability to monitor it and, where necessary, intervene.

“Even if human researchers were to monitor a new AI system’s overall application to the AI R&D process reasonably well, including through technical measures, they will likely increasingly struggle to match the speed of progress and the corresponding nascent capabilities, limitations, and negative externalities resulting from this process,” they write.

Those “negative externalities” include an AI model, or agent, that spontaneously develops behavior the human AI developer never intended, as a consequence of the model pursuing some long-term goal that is desirable, such as optimizing a company’s R&D — what they call “emergent properties of pursuing complex real-world objectives under rational constraints.”

A misaligned model can become what they call a “scheming” AI model, which they define as “systems that covertly and strategically pursue misaligned goals,” precisely because humans can no longer effectively monitor or intervene.

Also: With AI models clobbering every benchmark, it’s time for human evaluation

“Importantly, if an AI system develops consistent scheming tendencies, it would, by definition, become hard to detect — since the AI system will actively work to conceal its intentions, possibly until it is powerful enough that human operators can no longer rein it in,” they write.

Possible outcomes

The authors foresee a few possible outcomes. One is an AI model or models that run amok, taking control of everything inside a company:

The AI system may be able to, for example, run massive hidden research projects on how to best self-exfiltrate or get already externally deployed AI systems to share its values. Through acquisition of these resources and entrenchment in critical pathways, the AI system could eventually leverage its ‘power’ to covertly establish control over the AI company itself in order for it to reach its terminal goal.

A second scenario returns to those malicious human actors. It is a scenario they call an “intelligence explosion,” in which humans inside an organization gain an advantage over the rest of society by virtue of AI’s rising capabilities. The hypothetical situation consists of one or more companies dominating economically thanks to their AI automation:

As AI companies transition to primarily AI-powered internal workforces, they could create concentrations of productive capacity unprecedented in economic history. Unlike human workers, who face physical, cognitive, and temporal limitations, AI systems can be replicated at scale, operate continuously without breaks, and potentially perform intellectual tasks at speeds and volumes impossible for human workers. A small number of ‘superstar’ firms capturing an outsized share of economic profits could outcompete any human-based enterprise in virtually any sector they choose to enter.

The most dramatic “spillover scenario,” they write, is one in which such companies rival society itself and defy government oversight:

The consolidation of power within a small number of AI companies, or even a singular AI company, raises fundamental questions about democratic accountability and legitimacy, especially as these organizations could develop capabilities that rival or exceed those of states. In particular, as AI companies develop increasingly advanced AI systems for internal use, they may acquire capabilities traditionally associated with sovereign states — including sophisticated intelligence analysis and advanced cyberweapons — but without the accompanying democratic checks and balances. This could create a rapidly unfolding legitimacy crisis where private entities could potentially wield unprecedented societal influence without electoral mandates or constitutional constraints, impacting sovereign states’ national security.

The rise of that power inside a company might go undetected by society and regulators for a long time, Stix and her team emphasize. A company able to achieve more and more AI capability “in software,” without adding vast quantities of hardware, might not attract much external attention, they speculate. As a result, “an intelligence explosion behind an AI company’s closed doors may not produce any externally visible warning shots.”

Also: Is OpenAI doomed? Open-source models may crush it, warns expert

[Figure: measures for detecting scheming AI. Source: Apollo Group]

Oversight measures

They propose several measures in response. One is internal oversight policies designed to detect scheming AI. Another is formal policies and frameworks governing who has access to which resources inside a company, with checks on that access to prevent any single party from gaining unlimited access.

Yet another provision, they argue, is information sharing, specifically to “share critical information (internal system capabilities, evaluations, and safety measures) with select stakeholders, including cleared internal staff and relevant government agencies, through pre-internal deployment system cards and detailed safety documentation.”

Also: The top 20 AI tools of 2025 – and the #1 thing to remember when you use them

One of the more intriguing possibilities is a regulatory regime in which companies voluntarily make such disclosures in return for resources, such as “access to energy resources and enhanced security from the government.” That might take the form of “public-private partnerships,” they suggest.

The Apollo paper is an important contribution to the debate over what kind of risks AI represents. At a time when much of the talk of “artificial general intelligence,” AGI, or “superintelligence” is very vague and general, the Apollo paper is a welcome step toward a more concrete understanding of what could happen as AI systems gain more functionality but are either completely unregulated or under-regulated.

The challenge for the public is that today’s deployment of AI is proceeding in piecemeal fashion, with plenty of obstacles to deploying AI agents for even simple tasks such as automating call centers.

Also: Why neglecting AI ethics is such risky business – and how to do AI right

Much more work likely needs to be done by Apollo and others to lay out, in more specific terms, just how systems of models and agents could progressively become sophisticated enough to escape oversight and control.

There is one serious weak point in the authors’ analysis of companies. The hypothetical of runaway firms, companies so powerful they could defy society, fails to address the basics that often hobble companies: they can run out of money, or make very poor choices that squander their energy and resources. That can likely happen even to companies that begin to acquire disproportionate economic power via AI.

After all, a lot of the productivity that companies develop internally can still be wasteful or uneconomical, even when it represents an improvement. How many corporate functions are pure overhead that produces no return on investment? There’s no reason to think things would be any different if productivity were achieved more swiftly with automation.

Apollo is accepting donations if you’d like to contribute funding to what seems a worthwhile endeavor.
