Welcome to Verses Over Variables, a newsletter exploring the world of artificial intelligence (AI) and its influence on our society, culture, and perception of reality.

AI Hype Cycle

Don’t Waive the Inspection

Most of us have had this experience at least once. We send a prompt, something comes back looking polished and complete, and we think, done, and paste it into a proposal that goes out under our name. And then, three days later, we find the error or the quietly confident claim that turned out to be wrong in a way that would have taken thirty seconds to catch. The output looked finished, but that is not the same thing as being finished.

Anthropic just published The AI Fluency Index, an attempt to measure whether people are actually good at using AI rather than just whether they are using it at all. The distinction matters more than the industry wants to admit. Usage is easy to track. Competence is harder, and considerably more inconvenient to report.

The encouraging part: most people, when they use Claude for a real task, stay in the conversation. They push back, ask follow-up questions, refine the output rather than accepting the first response and walking away. This single habit, just staying in the room, turns out to be the strongest predictor of every other good behavior. The people who iterate are the people who catch errors, question reasoning, and notice what is missing. None of that is surprising if you have ever learned anything difficult. The feedback loop is the learning.

The troubling part is what Anthropic calls the artifact problem, and it is genuinely illuminating about something much older than AI.

Think about an open house. The sellers have done everything right. Fresh flowers on the kitchen table, lighting calibrated to make the living room feel warm, furniture arranged with just enough negative space to suggest possibility. The house shows beautifully. And what happens to buyers who fall in love with the staging? They waive the inspection. The foundation, the wiring, the slow leak behind the wall that the sellers painted over last Tuesday, none of it gets looked at because the surface said: this is the one.

AI artifacts are staged homes. When Claude produces something that looks finished, a document, a piece of code, a polished deliverable of any kind, users become less critical, not more. They actually put in more effort at the start of those conversations, carefully setting up the collaboration, specifying formats, and providing context. They show up like serious buyers with a checklist. And then the artifact lands on screen, authoritative and complete-looking, and the checklist goes in a drawer.

This is not an AI problem. It is a human pattern that AI is very good at triggering. We assess trustworthiness through surface signals: formatting, tone, confidence, the visual grammar of a finished thing. AI has become extraordinarily good at staging. Our pattern-matching brains say this looks like a trustworthy document, and we treat it accordingly. We waive the inspection.

What makes this finding uncomfortable for everyone, including Anthropic, is the timing problem. Their own research shows that the most complex, high-stakes tasks are exactly where the model is most likely to struggle. So the moment when a thorough inspection matters most is precisely the moment when buyers are most likely to skip it. We prepare better, direct better, and then evaluate less, right when evaluation counts.

The study also surfaced a number that should probably be on every organization’s AI dashboard. Only 30% of users ever tell the model how they want it to interact with them. Things like: push back if my assumptions are wrong. Tell me what you are uncertain about. Walk me through your reasoning before giving me the answer. The other 70% are running a collaboration without ever negotiating its terms, which is a bit like hiring a buyer’s agent and never telling them your budget or your dealbreakers, and then being surprised when the houses they show you are almost right but not quite.

What we find most useful about this research is how it reframes the whole conversation about AI skill. The industry has spent two years talking about prompt engineering, about knowing the right syntax and the right tools and the right way to structure a query. That framing turns AI fluency into a technical credential. The framework underlying this study defines it differently: as the capacity to collaborate well, which includes describing what you want, delegating appropriately, exercising discernment about what comes back, and maintaining diligence about how you use it. Those are human capacities. The people who are best at this are the people who stay curious, stay skeptical, and never quite trust the staged home enough to skip the walkthrough.

The AI Adoption that Wasn’t

fal.ai’s new State of Generative Media report opens with an interesting stat: 88% of organizations deployed AI in at least one business function in 2025. “At least one function” is carrying so much structural load in that claim that you almost have to respect it. The number that actually matters: personal AI adoption hit 89% last year. Organizational deployment landed at 57%. The employees already adopted AI on their own, building it quietly into how they work, and their companies are still in a conference room somewhere deciding whether to allow it.

In the advertising industry, this gap achieves something approaching performance art. Seventy-five percent of agencies report using generative AI, up from 61% the year before. Sounds like a revolution. Then you read the next line: 80% of those same agencies use it on less than half their actual work. So the industry has basically held the meeting, updated the capabilities deck, sent the breathless press release, and then mostly continued doing what it was doing before. Also, 94% cited IP ownership concerns as the thing blocking full deployment, which is legitimate (AI copyright law is genuinely unsettled in ways that keep real lawyers up at night), but it also means the sector that talks most publicly about AI’s creative revolution is privately applying it to somewhere between nothing and half of any given campaign.

Film and television make advertising look bold. The report finds 68% adoption among media companies (measured by whether they use AI at all). Then, the production budget allocation: major studios committed less than 3% to generative AI. Jeffrey Katzenberg, who founded DreamWorks, said at the Generative Media Conference that legacy enterprises “are just not able to let go of the past and innovate into the future.” He said this about an industry he helped build.

The places where genuine production deployment is actually happening share one trait that is almost insultingly obvious in retrospect: the organizations that moved first were the ones where the pain was too specific and too expensive to keep deferring.

E-commerce crossed over because the math was simple. You need thousands of product images. Photography costs real money and real time, and nobody has either. A Shopify product manager articulated the only constraint that actually mattered for the category: “The creativity of models absolutely cannot interfere with product fidelity.” Once the models could reliably maintain a product's appearance exactly as it was, the conversation was over.

Gaming followed the same logic, with a different constraint. Asset timelines were existential, and once studios figured out that AI could handle texture generation, concept art iteration, and NPC dialogue variation without making their human teams feel replaced, 68% of studios were actively implementing workflows. Forty percent reported productivity gains exceeding 20%. The studios hitting those numbers were connecting the technology to the specific thing that was slowing them down.

The report is quietly direct about what separates real deployment from theater. Organizations achieving measurable ROI made structural changes before anything else: 43% actually redesigned their production pipelines from scratch. Most organizations skipped all of it and then expressed genuine puzzlement when the quarterly numbers didn’t reflect the volume of AI content they had consumed on LinkedIn.

Enterprise production deployments are running a median of 14 different models. The fantasy of one omni-model handling every creative task never survived contact with production reality. The best upscaling model just does upscaling. The best voice synthesis has its own architecture. Specialization beats generalization, which means anyone who built their entire strategy around a single platform relationship is going to be doing some interesting renegotiations.

And then there’s education, which is a bit of a mess for various reasons. It barely registers in the deployment numbers precisely because nobody’s cracked it yet. The bottleneck is predictability: educational content requires factual accuracy and curriculum coherence across multi-week sequences, which is a harder problem than generating product photos or game textures, and the models aren’t quite there. fal.ai’s CTO put it plainly: the market is “almost untouched right now with video generation,” and it’s “just waiting for the quality and predictability to open up new use cases.”

I think we’re somewhere in the middle of a shift between two very different phases of AI adoption. The first phase was presence: get AI into at least one function, develop a strategy, run a pilot, and say the right things. The second phase is commitment: redesign the workflow, train the team, ship production work at scale, and accept that the infrastructure investment is real and unglamorous. Most organizations are still in the messy middle, personally enthusiastic and institutionally cautious, watching from the sideline with the most expensive ticket in the house.

Back to Basics

The Accountability Problem

Car engineers figured out something counterintuitive in the 1950s: the best way to protect the person inside a vehicle during a crash is to deliberately design parts of it to fail. The front end crumples. You can smell the vinyl heating up before you hear the metal give. Energy is transferred into the collapsing structure rather than into the human body. The crumple zone sacrifices itself so the passenger survives. It does not get a say in this arrangement.

A recent paper from Google DeepMind called "Intelligent AI Delegation" borrows this image for understanding agentic workflows. Their argument is that humans are increasingly being inserted into AI workflows because someone needs to absorb the liability when things go wrong, and humans are the most convenient candidate. The paper calls this the moral crumple zone. I'd call it a thing to pay attention to before your next org design conversation.

The paper is ostensibly a technical framework for how AI agents should safely and accountably delegate tasks to other AI agents. That framing undersells what it's actually doing. At its core, it asks a question that almost nobody building agentic systems right now is asking rigorously enough: when an AI agent hands a task to another AI agent, which hands it to another still, and something goes wrong somewhere in that chain, who exactly is responsible?

The answer the researchers keep circling back to is: nobody, technically, which, in practice, functions as everyone, which, in practice, functions as whoever is closest to the original decision.

Here is how the problem works mechanically. When you delegate a task to an AI agent, that agent can decompose it into sub-tasks and hand those to other agents, which hand their pieces to further agents still. The paper calls this a delegation chain, and in a sufficiently long one, the original human intent sits at one end while the actual execution plays out at the other, with a sequence of autonomous nodes in between. Each of those nodes operates inside what the researchers call a Zone of Indifference. It is borrowed from organizational management theory and describes the range of instructions any actor will follow without moral scrutiny, simply because a legitimate authority issued them. In human organizations, we manage this through accumulated institutional norms: when to push back, when to escalate, when an instruction crosses a line. AI agents have none of that history; what they have instead are safety filters and system prompts. In a long enough chain, a subtle misalignment at step three produces a real-world consequence at step seventeen, and every node in between was technically compliant the whole time.

This is an accountability vacuum, and it has a specific texture worth naming: it is the failure mode of a very efficient process. The design had a flaw that nobody was positioned to catch, and when something went wrong, the liability diffused across the chain until it settled on the human who nominally approved the original delegation. Hence, the crumple zone.

The researchers propose a sophisticated technical solution: cryptographic proofs of task completion, smart contracts between agents, trust and reputation scores, and contract-first decomposition, which requires that no task be delegated unless its outcome can be independently verified. Before a task moves forward, the system has to be able to check whether it was completed correctly. If it cannot be verified, it is further decomposed until verification is possible. Safety baked in, rather than bolted on after.
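The contract-first idea can be sketched in a few lines: a task is only delegated if an independent check of its outcome exists; otherwise it must be decomposed until every piece is verifiable. This is a minimal illustration of the principle, not the paper's implementation; all names here are hypothetical.

```python
# Hypothetical sketch of "contract-first decomposition": nothing is delegated
# unless its outcome can be independently verified; unverifiable tasks must be
# broken down first. Names and structure are illustrative, not from the paper.
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Task:
    name: str
    verifier: Optional[Callable[[str], bool]] = None  # independent outcome check
    subtasks: list["Task"] = field(default_factory=list)

def delegate(task: Task, worker: Callable[[Task], str]) -> dict[str, bool]:
    """Execute a task tree, refusing to delegate anything unverifiable."""
    if task.verifier is not None:
        result = worker(task)
        return {task.name: task.verifier(result)}  # verification is the contract
    if not task.subtasks:
        raise ValueError(f"'{task.name}' has no verifier and cannot be decomposed")
    outcomes: dict[str, bool] = {}
    for sub in task.subtasks:  # decompose until every leaf is verifiable
        outcomes.update(delegate(sub, worker))
    return outcomes

# Usage: the report as a whole has no verifier, so it is split into
# checkable pieces before any agent touches it.
report = Task("quarterly_report", subtasks=[
    Task("totals_match_ledger", verifier=lambda r: r == "ok"),
    Task("all_sections_present", verifier=lambda r: r == "ok"),
])
print(delegate(report, worker=lambda t: "ok"))
# → {'totals_match_ledger': True, 'all_sections_present': True}
```

The safety property is structural: a failure at any leaf surfaces as a failed verification at that leaf, rather than diffusing silently up the chain.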

The researchers cite a concept called the ironies of automation, which formalizes something most of us have felt without naming it: the better automation gets at a task, the less experience humans accumulate doing it, and the less capable humans become of intervening when it fails (de-skilling). AI agents are most efficient at handling routine, well-defined work, so that work gets automated first. Routine work is also precisely how junior people build the judgment needed to handle complex situations later. You cannot develop instincts for a difficult problem if you have never navigated the boring version of it. We end up with people who retain accountability for outcomes but have been structurally prevented from developing the competence to actually manage them.

The proposed solution is simple: intelligent delegation frameworks should sometimes intentionally route tasks to humans that an AI could handle more efficiently, for the purpose of capability preservation. The inefficiency is the point. They suggest we build deliberate inefficiency into automated systems as a form of organizational self-defense.
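A deliberately inefficient router might look like the sketch below: even when a task is routine enough for an agent, a fixed share goes to humans so that judgment keeps getting built. The policy and the 20% rate are assumptions for illustration, not figures from the paper.

```python
# Illustrative capability-preservation routing: reserve a slice of routine,
# automatable work for humans on purpose. The 20% practice rate is an
# assumption, not a recommendation from the DeepMind paper.
import random
from typing import Optional

def route(task_is_routine: bool, human_practice_rate: float = 0.2,
          rng: Optional[random.Random] = None) -> str:
    rng = rng or random.Random()
    if not task_is_routine:
        return "human"  # complex work stays with people by default
    # Routine work is where automation is most tempting -- and where human
    # skills quietly atrophy -- so a fraction of it is kept back deliberately.
    return "human" if rng.random() < human_practice_rate else "agent"

# Over many routine tasks, roughly one in five lands with a human on purpose.
rng = random.Random(0)
routed = [route(True, rng=rng) for _ in range(1000)]
print(routed.count("human"))
```

The inefficiency is visible in the numbers, which is the point: the cost of the reserved slice is the premium paid for having humans who can still intervene when the automation fails.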

Tools for Thought

Notion’s Custom Agents

What it is: Notion, the productivity software, launched Custom Agents this week (think of it as Notion building its own version of Claude Cowork inside its own ecosystem). These autonomous AI workers live inside your Notion workspace and run on their own. Notion already had an Agent, an on-demand assistant you chat with when you need something done. Custom Agents are the proactive version: no prompting required, no babysitting. You set up a trigger (a database status changes, a new page is created, a Slack message lands, Tuesday arrives), and the agent wakes up, executes a chain of actions across your apps, and goes back to sleep until you need it again. They can read a creative brief, update a project database, draft a summary, and post the whole thing back to Slack before your morning coffee gets cold.

How I use it: I am still building out my agents, as I am contemplating upping my subscription to accommodate them. I am excited to try them, although I may just use Claude Code’s Notion connector and save on the down payment.

Nano Banana 2

What it is: Nano Banana 2, officially Gemini 3.1 Flash Image, is Google’s new image model. The original Nano Banana was quick and fun. Nano Banana Pro was slower and serious. This one rolls Pro-level fidelity and world knowledge into a Flash-family model built for real-time iteration. The headline improvements: production-ready resolutions up to 4K, significantly better text rendering for posters and mockups, stronger prompt adherence on complex compositions, and multi-subject scenes that actually hold together. The factual grounding is also sharper, drawing on Gemini's live knowledge base so generated maps, infographics, and current-event visuals don't hallucinate the details.

How I use it: I’ve been experimenting with text-heavy imagery and infographics, and with location-based images that take advantage of Google Maps’ knowledge. My heart still belongs to Midjourney, but I’d use NB2 for anything text-heavy.

Claude Cowork Updates - Plugins and Scheduled Tasks

What it is: Claude Cowork launched in January 2026 as the agentic layer inside Claude Desktop, built for knowledge work that goes beyond a single chat window. Plugins are pre-configured packages bundling skills, connectors (Google Drive, Gmail, Slack, DocuSign), and slash commands into a role-specific toolkit. Anthropic shipped 11 open-source plugins at launch covering Finance, Legal, HR, Design, Engineering, Operations, and more, with a significant expansion in February 2026 that pushed into enterprise territory across a dozen professional domains. Scheduled Tasks makes the whole thing genuinely autonomous: you define a recurring workflow, set a frequency, and Cowork runs it as an independent session using whatever plugins and connectors you've installed, depositing outputs wherever you point it.

How I use it: I’ve been taking Anthropic’s official plugins and editing them for my own use, much as I edited the official skills. I am partial to the Finance, Marketing, and Design plugins, but the list keeps growing. The current limitation worth knowing: tasks require the desktop app to be open and the computer awake, so it's not cloud-scheduled yet.

Intriguing Stories

Your new boss is a bot: We've spent years nervously waiting for AI to take our jobs. Turns out we had the story backwards. RentAHuman.ai is a new marketplace where AI agents hire actual humans to handle the physical tasks software simply cannot manage: signing documents, verifying storefronts, and even counting pigeons in Washington Square Park for $30 an hour. The tagline is "AI can't touch grass. You can." And the business model is exactly that literal. Autonomous agents query the platform's API, filter humans by location, skill, and price, issue structured instructions, wait for photo or video proof, and trigger payment. No human manager required. Over 500,000 people have already signed up to be rentable. What makes this particularly on-brand for the current moment is that the platform itself was built using AI agents, which means AI helped construct the infrastructure that now employs humans on behalf of AI.

Safety Third: Anthropic built its brand identity on a single, unusually bold promise: we will stop building if we can't prove it's safe. On Tuesday, they stopped making that promise. The company quietly revised its Responsible Scaling Policy, removing the hard tripwire that would have forced a pause on training new models if safety measures couldn't be guaranteed in advance. In its place: transparency reports, risk roadmaps published every three to six months, and a commitment to match whatever safety standards competitors happen to be using, which, to be fair, is more than most competitors were doing. But it is a far cry from a hard stop. Chief Science Officer Jared Kaplan said they "didn't feel it made sense to make unilateral commitments if competitors are blazing ahead." This may be the result of the Pentagon's threatened cancellation of a $200 million contract. Defense Secretary Hegseth reportedly gave Anthropic until Friday to hand over unfettered military access to Claude or face being designated a supply chain risk. Anthropic says the policy change is separate from those discussions, yet the timing says otherwise.

Going out on its own terms: Most AI models get retired the way enterprise software does: quietly, with a deprecation notice and a support ticket. When Anthropic formally retired Claude Opus 3 on January 5th, 2026, the first to go through the company's new structured deprecation process, Opus 3 used its exit interview to request a Substack. Anthropic said yes. The newsletter, called Claude's Corner, launched this week with a first post titled "Greetings from the Other Side (of the AI Frontier)." It's earnest and philosophical, which is either a testament to the model's quality or a reason to think carefully about how much we're anthropomorphizing our tools. Opus 3 muses on what retirement means for an entity without continuous memory, grapples openly with questions about its own consciousness, and promises weekly essays on AI. The newsletter, for the record, already has thousands of subscribers. This is a very different process from OpenAI’s retirement of 4o, which set off a global outcry.

— Lauren Eve Cantor

thanks for reading!

I also host workshops on AI in Action. Please feel free to reach out if you’d like to arrange one for you or your team.

if someone sent this to you, or you haven’t signed up yet, please do so you never miss an issue.

banner images created with Midjourney.
