
Toward a Realpolitik for AI


In international relations, “realism” insists that nation-states should be understood only as acting to maximize their own self-interest. When such thinking debuted following World War II, this realism was presented as a contrast to the “idealism” that had characterized the interwar years of 1918 to 1939, with their utopian hopes that nation-states could work together toward universal principles or common interests. Such utopian dreams died hard in the renewed conflicts of World War II, as diplomacy, democracy, and even morality failed to deliver on their promises in the face of economic upheaval, reactionary politics, and the relentless growth of fascism. Political leaders began to ask: since idealism had failed even to predict the conflict—let alone stop it—might realism now offer a welcome clarity?

Today’s AI arms race demands another healthy dose of realism. Put another way, we can’t understand any of the new narratives about the power of AI—whether they portray AI as helpful or harmful, Satan or savior—without taking a hard look at how all these stories serve Big Tech’s own self-interest and expand the authority it is granted across society. In the race between companies and countries to dominate AI, then, we must understand that the real, tangible effect of nearly all of the hype is to stabilize existing structures of power.

This is because such narratives silence what is normally the main question of business: whether the products these companies currently produce work as advertised, or are just more AI “snake oil.” As long as the products work well enough to convince people that they could work better, investment will continue to flow toward those companies’ interests, and their customers will rush in for fear of missing the boat.

As an anthropologist interested in how AI and computing work within organizations, I studied how this newfound authority of AI was actively produced within the setting of a small machine learning research lab. There, the need to demonstrate the power of machine learning to potential customers might be met by building, for example, a tool to distinguish between more and less legitimate news sources. In doing so, engineers transformed an exercise in comparative literature and media studies into a classification problem that carved a statistical boundary between how often certain kinds of sentences appear across articles from different kinds of newspapers. Other demonstrations like this one are common—DeepMind demoed AlphaGo as early as 2015 and NVIDIA demoed generative art tools in 2017—and they also feed a public imaginary about the potential of machine learning, with individual instances of hyping converging to set expectations, drive investment, and produce credulity toward the outsized claims developers make about AI’s power. This is, indeed, a positive feedback cycle in which demonstrations of power feed investments that lead to even more persuasive demonstrations, which in turn feed even greater investment. The late 2022 release of ChatGPT exemplifies this cycle. Originally released as a research demo, it has since become ubiquitous and now serves as the template for the integration of AI into almost every computing platform imaginable, whether users want it or not.
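To make that reframing concrete, here is a minimal, hypothetical sketch of how a question about “legitimate” news sources might be recast as a text-classification problem. The labels, example articles, and modeling choices (scikit-learn’s TfidfVectorizer and LogisticRegression) are my own illustrative assumptions, not the lab’s actual pipeline.

```python
# Hypothetical sketch: recasting "which sources are legitimate?" as a
# text-classification problem. Labels, examples, and model choices are
# illustrative assumptions, not the research lab's actual code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data: article text paired with a label that someone,
# somewhere, has already decided stands in for "legitimacy."
articles = [
    "Officials confirmed the budget figures in a press briefing today.",
    "You won't BELIEVE what this senator is hiding from you!!!",
]
labels = ["legitimate", "illegitimate"]

# Sentence patterns become word-frequency features; "legitimacy" becomes
# a statistical boundary between those feature distributions.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(articles, labels)

print(model.predict(["Experts say the data tell a more complicated story."]))
```

The sketch matters less for the particular model than for the move it encodes: an interpretive judgment about news sources is collapsed into whatever label column the training data happens to carry.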

The unsolicited ubiquity of AI is confounding, but two recent books tackle head-on the hype that produced it. In AI Snake Oil, Arvind Narayanan and Sayash Kapoor help the reader distinguish what AI can and cannot do, and show how that distinction collapses in the face of the hype that emanates from Silicon Valley and other quarters. By contrast, in Taming Silicon Valley, Gary Marcus leaves the hype intact, insisting that AI technologies are just as powerful as—or even more powerful than—their proponents suggest. The challenge facing society, Marcus argues, is deciding whether to muster the political will to guide them in the right direction or leave them unchecked. A third recent book, Paula Bialski’s Middle Tech, proves indispensable for demonstrating how the mundane but nevertheless consequential choices inside AI companies—indeed, inside all software companies—are made by engineers as they navigate, in their daily work, the material and ethical stakes of what is “good enough” and who it is good enough for.


There was a time, not too long ago, when artificial intelligence was not the most obvious answer to every problem. Somehow, since then, the same set of approaches, loosely bundled under the moniker of “AI,” has come to be proposed as the solution to problems across domains as varied as medicine, criminal justice, materials science, e-commerce, and online content moderation. These approaches—which all involve collecting data, feeding it to algorithmic systems, and automating some task that humans find marginally boring, difficult, or expensive to do—have gradually gained the authority to move into each of these domains, and many more.

In many cases, these approaches seem like a good idea, but not because they solve the actual problems people have. Instead, they are being adopted because they help turn people’s problems into the kind of problems that algorithmic systems are good at solving, given enough data.

Artificial intelligence “works well enough that … companies have come to rely on it,” clarifies AI Snake Oil, even though it doesn’t do many of the grandiose things that AI developers claim it can. Narayanan and Kapoor start by placing AI in context, showing how a range of software engineering approaches make up what is commonly, and often unreflectively, labeled “AI.” They then show, for example, how AI cannot predict the future because all it knows is what has happened in the past, and how AI cannot moderate social media content because it never has access to social context.

By the end of the book, what remains for the reader is the sense that AI only seems helpful because real solutions (e.g., funding social services adequately instead of using AI to predict who deserves social support) are lacking. But if there are so many things AI systems cannot do as advertised, what does it mean for them to work well “enough” for companies to invest in and release to the public?

Here, Bialski’s ethnographic work embedded in the tech industry helps explain what “good enough” means for those working on software. Although she observes notable differences between software engineers based in the Bay Area and those based in Berlin, their determinations about what exactly constitutes “good enough” shape the daily patterns of their work, the subtle ordinary ethics of working as software developers, and ultimately the effects of their work on the world.

Early in the book, she contrasts engineers in Big Tech companies who work late into the night obsessing over code that aims to “change the world” with engineers working at “MiddleTech,” a mid-sized software firm in Germany, who keep regular working hours but nevertheless deliver functional code on schedule. When the end of the day rolls around, she describes how these German developers might cursorily sign off on fellow programmers’ code reviews without so much as a glance, so they can all go out for a beer together on a sunny day.

The lesson she draws from this contrast is not that German engineers are lazy and thirsty compared to engineers in Silicon Valley. Far from it. Getting to “good enough” is hard work, and there is “excellence” to be found in both locales. To get there, these MiddleTech developers must work within the limited options left to them by legacy code written long ago, the partial knowledge available to them in their organizational silos, and the limits of their own expertise. And so they have to navigate what Bialski calls “software’s sociality.” This sociality pervades software development, AI development included.

At MiddleTech, this sociality takes shape in daily stand-up meetings, in democratic votes on possible courses of action, and in the quick approval of one another’s code at the end of a long day. Through these practices, workers enact an ethics of trust and solidarity with and through one another.

But there is a sociality to software in Big Tech companies, too, one that shapes how engineers and teams interact and form relationships, as I have seen myself while studying how work is structured in Silicon Valley. Large teams orient their work toward “objectives and key results” that enable them to coordinate their efforts across geography and through time. Separate companies are also drawn together through a sociality that forms in the pursuit of benchmarks. Social media companies compete for the greatest number of “daily active users,” and AI companies compete to meet performance benchmarks on “leaderboards” for any number of AI tasks.

These goals enact an ethics as well, in how they are socially constructed as desirable goals worth pursuing, in how they are pursued, and in how decisions are made about which goals count as “good enough.” These goals—including benchmarks for the ability of an AI system to answer SAT-like questions or faithfully summarize a document, key performance indicators for development teams that are set just at the limit of achievability, or conformity to technical standards that ensure interoperability with other systems—are also signals for when good enough has been achieved. They are the interface between claims and reality. A company may say it plans to reach, as OpenAI has claimed from its earliest days, “safe artificial general intelligence” (artificial general intelligence being a term of art for software systems that can replicate or exceed human capabilities). But then later, it must set thresholds for what achieving this underspecified goal might mean, concretely, for the behavior of a software system.

This is, unsurprisingly, nearly impossible. After all, the definitions of both “intelligence” and “safety” are amorphous. More importantly, these concepts are bound up with totalizing projects like eugenics and national security, which have historically produced profound violence against people and communities.

Nevertheless, Big Tech companies make their claims, and then they set the concrete goals their teams must achieve for their work to count as good enough. AI Snake Oil provides countless examples of how the research results behind such claims prove difficult to replicate, or can be replicated only under specific circumstances that remain quite distant from the general capability claimed as a success.

Still, as Narayanan and Kapoor so ably show, it is difficult to discern what is AI snake oil and what is not. Moreover, there is tremendous momentum behind the discourse proclaiming the transformational power of AI. And so grand claims persist.


The claim that has most dominated the discourse around artificial intelligence in the past few years is that generative AI, which can produce seemingly useful text, images, and audio, marks the arrival of truly transformational AI capabilities. Narayanan and Kapoor are clear that the release of ChatGPT and the attendant hyping of generative AI’s capabilities motivated them to write AI Snake Oil, even if they also instruct readers in the finer points of identifying overblown claims about other forms of AI. Similarly, Marcus spends fully a third of Taming Silicon Valley outlining how big US tech companies harness the hype around AI—whether it works or not—to increase their bottom lines and use their economic power to marshal public opinion against regulation of their technologies.

This focus on profit and power aligns with my own research. How, I asked, did AI seize authority over so many spheres of public and private life? The answer I found—which is echoed in Marcus’s work—is that this authority has been secured not so much through clever demonstrations of interesting feats, like human-level play at a complex game such as Go, as through rampant speculation about the capabilities the technology might acquire in the future.

So long as the current iterations of AI systems appear to point toward a future where they might soon exceed humans at a wide range of tasks, companies are willing to exchange current capital expenditures for future labor cost savings. They adopt the latest AI services, invest in scarce hardware, and restructure their organizations around the promise of new efficiencies in the workplace. As they do so, they cede authority over how they accomplish their goals to methods that can be carried out, at least in part, by AI instead of their own experts. Human experts working in organizations have norms and practices that have evolved with the organization to meet its specific goals and to articulate with other practices that ensure quality and value. Writers in a newsroom don’t just generate text; they write in ways that editorial staff can work with to refine and lay out as a finished product. But AI tools pool the results of these practices from across organizations to produce outputs that other parts of a company must contort themselves to work with. Organizations that work with AI increasingly dance to a tune called by distant developers, and are less able to choreograph their own affairs.

How might such emerging authority be combated? Here, Marcus intervenes in three important ways. He first points out how generative AI does not satisfy the inflated expectations its proponents have created. Having long worked on developing AI technologies, he debunks the outsized claims made about generative AI as capably as Narayanan and Kapoor do.

He then shows how developers are able to influence AI policy by raising the specter of a “global AI arms race,” one in which countries lose by overregulating AI and win by freeing technologists from undue regulatory pressure. Genuinely concerned about what it might mean for the US to lose such a race with China, Marcus unpacks this narrative and its stifling influence on policy discussions. He points out that the kinds of regulation he calls for, like pre- and post-deployment assessments of powerful models by an FDA-like agency, are seen by lawmakers and developers alike as potential stumbling blocks for the US in a footrace with other countries. This observation is certainly true, as Congressional leaders and CEOs alike have articulated this very thought, although it is less clear that such a footrace has clear winners and losers at all. Marcus addresses this ambiguity in a re-encapsulation of his 2023 Congressional testimony, arguing that authority over AI can be regained through a normal political process, one that buttresses the rights of individuals and exposes AI developers to mechanisms of accountability for how they affect individuals and society writ large. Beyond the FDA-like agency he proposed in that testimony, he also calls for a global, multinational effort to pursue safe and responsible AI outside the context of narrow national interests.

Read in early 2025, Taming Silicon Valley’s call for meaningful regulatory action on AI from the US Congress seems overly optimistic, even though only a few months have passed since it was written. But Marcus’s thesis remains correct: “the only chance at all is for the rest of us to speak up, really loudly.” He goes on to describe the specific ways “we can push for a better, more reliable, safer AI that is good for humanity, rather than the premature, rushed-out-the-door technology we have now that is making a few people very rich, but threatening the livelihoods of many others.”


And it is here that Marcus brings the conversation back to what it means for AI to work, as Narayanan and Kapoor put it, “well enough that … companies have come to rely on it.” Adversarial as Taming Silicon Valley is toward the companies that rely on AI working well enough, it offers a window into who it works well enough for and what it works well enough to do. But to see clearly through that window requires even more realism about the grand plans of Silicon Valley companies than Marcus provides. Bialski’s realism certainly helps here, as does Narayanan and Kapoor’s: both keep our attention on how the work of developers inside the Big Tech companies pursuing AGI always exists in relation to what is good enough, from moment to moment, to meet the checkpoints and benchmarks that let those companies claim progress toward AGI, and so keep the hype cycle spinning. But to understand who AI works well enough for requires another variety of realism: that of the 20th-century realpolitik that characterized international relations after World War II.

A realpolitik of AI would recognize that the fundamental insight of organizational sociology—that organizations’ activities can be understood as the pursuit of legitimacy and survival—remains as true for AI companies as it has been for any other organization. These two priorities explain their “real” motivations and goals, as both legitimacy and survival must, in some ways, be manufactured alongside the products and services these companies sell.

In this light, the value of “AI snake oil” is not the product itself, which may or may not work, but rather lies in associating the technology with problems that people genuinely care about. Similarly, the value of pitting one country’s technological prowess against another’s in a global “AI arms race” lies not in motivating commonsense regulation but in raising the stakes of AI regulation from mundane industrial policy to an existential concern, one in which the nation’s fortunes are mapped onto those of private AI developers. However, given the concrete damage AI snake oil can cause for individuals, and how corrosive the rush to deploy AI as a substitute for bureaucratic institutions promises to be for self-governance, we must collectively ask ourselves whether these rationales for the rapid adoption of AI are, indeed, “good enough.”

This article was commissioned by Mona Sloane.

Featured-image photograph by Albert Stoynov / Unsplash (CC by Unsplash License)


