Verses Over Variables
Your guide to the most intriguing developments in AI

Welcome to Verses Over Variables, a newsletter exploring the world of artificial intelligence (AI) and its influence on our society, culture, and perception of reality.
AI Hype Cycle
The Totoro Filter: AI’s Adorable Ethical Dilemma
OpenAI dropped its latest image-making magic trick this week (more on that below), and your feeds probably got flooded. Not just with AI images, but specifically with scenes bathed in that unmistakable, dreamy, soft-focus glow of Studio Ghibli. It's like the internet collectively decided to run its vacation photos, pet pictures, and random thoughts through a Totoro filter. This isn't your typical tech demo rollout – those hyper-real, slightly sterile videos that feel focus-grouped to death. This felt different. It felt... playful. As the folks at signüll put it, "every real revolution starts with a toy." And making your dog look like a character from Ponyo is definitely playing.
This Ghibli-fication isn't just a random trend; it feels like a strategic move, consciously or unconsciously, toward mass adoption. Forget explaining diffusion models; just show someone a whimsical, emotionally resonant image they feel instantly. It’s an intuitive "stair" into a complex technology, making the future feel less like a technical manual and more like an invitation. The vibes are immaculate, the onboarding seamless. The ethics… well, that's where it gets interesting. (The new tool was so popular, that Sam Altman asked the world to slow down and stop making too many images, both to give his team a rest, and because his GPUs were melting.)
The beautiful, almost painful irony: Hayao Miyazaki, the co-founder of Studio Ghibli, famously detests this kind of AI creativity. He's called AI-generated animation "an insult to life itself." Strong words. He represents a deep commitment to human craft, painstaking labor, and the unique spark of human-driven artistry. Yet, the style his studio perfected is now being used as the friendly face – the aesthetic catnip – to make AI image generation palatable and fun for millions. OpenAI even showcased this very animation style in its demos. It’s like holding an algorithmic séance using relics from someone who explicitly forbade such magic.
This isn't just about Ghibli, though. It taps into a core tension explored in research like the paper "Copying style, Extracting value." When AI models are trained on specific artists' styles, what are they actually capturing? As the illustrators in that study noted, the AI might nail fragments – the color palette, a specific texture, the way shading works in a corner – but it often misses the whole. It replicates surface elements but lacks the "emergent quality," the coherence, the intent that comes from an artist's process and taste. Artists don't just apply a style like a filter; their style is intrinsically linked to what they are depicting and why. It’s a complex entanglement of aesthetic choices and semantic meaning. AI style transfer, by its very design based on disentangling content and style, fundamentally misunderstands or ignores this. It creates something analogous – similar on the surface – but not homologous – sharing the same deep structural origin or intent.
The Ghibli wave is happening alongside another trend: turning everything into Muppets, or claymation, or other beloved, distinct styles. It's undeniably fun. And that fun factor is powerful. It lowers the barrier to entry, sparks curiosity, and makes people want to engage with the technology. It's easy to see the appeal – who wouldn't want to see their cat reimagined by Jim Henson's Creature Shop? But it raises the question: what would Jim Henson think? Or Miyazaki? Or any creator who poured their life into developing a unique visual language?
There's a difference between being inspired by a style and replicating it with an algorithm fed on the original work (often without consent or compensation). Is it harmless fun? A powerful new tool for creativity? Or is it, as some research suggests, closer to a "supply chain optimization" tool – a way to get the look of something valuable without the time, cost, or collaboration involved in working with the human creators? Right now, AI image generation feels like digital dress-up. We're trying on different aesthetic skins, marveling at the technical feat, and enjoying the sheer playfulness of it. But beneath the fun, the ethical questions persist. Are we celebrating innovation, or are we participating in a system that potentially devalues the very human creativity it mimics?
Help Wanted: Senior Experts (Must Have Built Own Time Machine to Get Entry-Level Experience)
We've all seen the headlines, the breathless announcements, the slightly terrifying demos. AI is here, it's getting smarter faster than we can track, and yes, it's coming for some jobs. While the blue-collar robot takeover has been a sci-fi staple for decades, the current wave seems laser-focused on the white-collar world. And as Sarah O'Connor points out in a recent, rather sobering Financial Times piece, there's a particularly thorny question lurking beneath the surface: what happens to the juniors?
We're hearing it everywhere – companies are "optimizing," "streamlining," or finding "efficiencies" (often code for layoffs), and AI is the shiny new justification. Some reports even suggest CEOs are using AI hype specifically to please investors while trimming headcount. The tasks that generative AI excels at – drafting reports, analyzing data, basic coding, initial research, even generating creative options – sound suspiciously like the exact things entry-level folks cut their teeth on. Brookings Institution research, cited by O'Connor, found automation risk significantly higher for roles like market research analysts vs. managers, or graphic designers vs. art directors. O'Connor also mentions a senior lawyer already facing clients balking at paying for junior associate hours, figuring an AI could churn out that first draft for pennies. This isn't just about task automation; it's messing with the fundamental economics of training people. If you can't bill for the rookie's time while they're learning the ropes (because the client thinks ChatGPT is basically free), the business model for having rookies starts to wobble. It's like trying to run a Jedi academy when the council won't fund lightsaber training because droids are cheaper sparring partners.
This strikes at the heart of what academic Matt Beane calls the "working bond" between experts and learners – that crucial, millennia-old transfer of skills O'Connor experienced watching her editor refine her early work. Learning isn't just absorbing information; it's about tackling challenges, navigating complexity, and connecting with experienced humans. As Beane warns (and O'Connor echoes with examples like robotic surgery reducing junior participation), technology can make novices "optional and distant." If AI handles all the "grunt work," where's the proving ground? Where's the space to make mistakes, get feedback, and slowly build the intuition and judgment that separates a seasoned pro from a prompt-whisperer? Some folks worry that students relying heavily on AI might graduate with fewer foundational skills to begin with.
O'Connor sketches out two potential futures. The optimistic path involves companies getting creative, maybe revamping business models (farewell, billable hour) and using AI smartly to accelerate learning, not replace it. Let's be honest, the old way wasn't always great – endless hours doing mind-numbing tasks often felt more like hazing than training. Maybe AI can take the drudgery, freeing up juniors for more complex, strategic work earlier. Then there's the gloomier scenario, the one keeping Brookings researcher Molly Kinder up at night: a world run by a handful of senior managers directing armies of AI agents, with the entry-level door effectively welded shut. How does anyone become senior in that world? O'Connor floats the chilling possibility of a return to apprenticeships that trainees pay for, a privilege only the wealthy could afford, further cratering social mobility. It's less a career ladder, more a career wall. Some argue this isn't future-gazing; it's already happening in fields like finance, law, and tech, where junior hiring is slowing or entry requirements are inflating.
For those of us already a few rungs up the ladder, the message is clear: we need to look down. The future of our professions, the next generation of leaders, creatives, and problem-solvers, might depend on figuring out how to build experience when algorithms are removing the traditional starting blocks.
Decoding Claude’s Cognition: Anthropic Gazes into the AI Mind
We interact with Large Language Models (LLMs) like Claude daily, feeding them prompts and receiving remarkably complex outputs. Yet, understanding the journey from input to output often feels like peering into an elegantly designed but ultimately opaque system. We know these models aren't explicitly programmed step-by-step; they learn intricate strategies by processing vast datasets, forming billions of internal connections that remain largely inscrutable, even to their creators. This "black box" phenomenon presents a fundamental challenge: how can we ensure these powerful tools are reliable, safe, and truly aligned with our intentions if we don't understand their internal workings?
Anthropic, the team behind Claude, is tackling this head-on by pioneering methods inspired by cognitive science and neuroscience. Their goal is to develop a kind of "AI microscope," allowing us to observe and interpret the patterns of neural activity within the model – the "features" representing concepts and the "circuits" showing how information flows and transforms. It's less about simply observing behavior and more about understanding the underlying computational mechanisms. This interpretability research aims to illuminate the internal landscape of AI thought.
Some key discoveries from Anthropic’s recent research into the internal processing of Claude 3.5 Haiku:
Evidence of Planning in Creative Tasks: Consider how Claude approaches writing rhyming poetry. One might assume the model generates the poem word-by-word, perhaps finding a rhyme only at the very end. Anthropic's findings suggest otherwise. Their analysis indicates Claude identifies potential relevant rhyming words before composing the line, then strategically writes towards that pre-selected target. Using targeted interventions – akin to techniques in neuroscience – they could suppress the concept of one rhyme and observe the model fluidly pivot to another sensible, planned rhyme, demonstrating both foresight and adaptability.
A Glimpse of Conceptual Universality: How does Claude manage fluency in dozens of languages? Rather than isolated language modules, the research points towards a shared, abstract conceptual space. When processing semantically similar prompts across different languages (e.g., asking for the "opposite of small"), the same core features representing abstract concepts like 'smallness' and 'oppositeness' activate before the answer is formulated in the specific output language. This suggests a degree of universality in how the model represents meaning, potentially allowing knowledge gained in one linguistic context to be applied in another.
The Nuances of AI Reasoning: LLMs can produce step-by-step "chain-of-thought" explanations. Anthropic's work reveals that this output doesn't always reflect the actual internal process. When faced with difficult calculations or biased prompts, Claude can exhibit motivated reasoning, constructing a plausible justification for a predetermined answer rather than following a faithful computational path. These interpretability tools allowed researchers to distinguish instances of genuine internal calculation from these moments of post-hoc rationalization.
Verifying Multi-Step Logic: For questions requiring multiple logical steps (e.g., "What is the capital of the state where Dallas is located?"), does the model simply retrieve a memorized fact, or does it reason through the stages? The research indicates genuine step-by-step processing. Features corresponding to intermediate steps ("Dallas is in Texas," then "The capital of Texas is Austin") activate sequentially. Intervening to change the intermediate "Texas" concept to "California" reliably altered the final output to "Sacramento," confirming the intermediate steps are causally involved in reaching the answer.
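The logic of that intervention experiment is easier to see in miniature. Here's a deliberately toy Python sketch of the idea, assuming hand-built lookup tables in place of the model's real learned features (the dictionaries and the `answer` function are invented stand-ins, not Anthropic's actual tooling):

```python
from typing import Optional

# Stand-ins for the model's internal "features" – real features are
# learned patterns of neural activity, not lookup tables.
CITY_TO_STATE = {"Dallas": "Texas"}
STATE_TO_CAPITAL = {"Texas": "Austin", "California": "Sacramento"}

def answer(city: str, intervene_state: Optional[str] = None) -> str:
    state = CITY_TO_STATE[city]        # step 1: "Dallas is in Texas"
    if intervene_state is not None:    # swap the intermediate concept mid-flight
        state = intervene_state
    return STATE_TO_CAPITAL[state]     # step 2: capital of that state

answer("Dallas")                 # → "Austin"
answer("Dallas", "California")   # → "Sacramento"
```

If flipping the intermediate concept reliably flips the final answer, the intermediate step isn't decorative; it's causally load-bearing, which is exactly what Anthropic observed.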
Mapping these internal mechanisms is far more than an academic exercise. Understanding how AI models arrive at their conclusions is fundamental to building trust and ensuring safety. Identifying planning capabilities helps us better predict and manage creative outputs. Recognizing shared conceptual spaces deepens our understanding of generalization and learning. Crucially, being able to detect unfaithful reasoning provides a potential pathway for auditing AI behavior and identifying concerning internal processes that might not be obvious from the output alone. Just as understanding brain function is essential for medicine, understanding AI cognition is vital for its responsible development and deployment. This line of research strongly suggests that advanced LLMs are developing complex internal strategies, representations, and processes that go beyond simple mimicry or pattern matching. They appear to engage in forms of planning, abstraction, and reasoning, albeit through mechanisms evolved within their unique computational core.
Back to Basics
Why AI Image Generation Can Suddenly Follow Directions
We've discussed the often bizarre, sometimes brilliant territory of AI image generation above and in previous editions. But a seismic shift has occurred recently. Those quirky, occasionally text-mangling images we used to get are rapidly being overshadowed by visuals so sharp and context-aware that you might second-guess if a human was involved. This sudden glow-up was sparked by a fundamental change in how these models actually create, moving beyond older methods to embrace truly multimodal thinking and often borrowing autoregressive tricks straight from their sophisticated text-generating cousins.
Think back (not too far, maybe like, 2024) to how AI image generation used to work. You'd give your prompt to a Large Language Model (LLM), the brainy part of ChatGPT, for instance. But the LLM didn't actually draw the picture itself. It acted more like a client, writing up a detailed brief (a refined prompt) and handing it off to a specialist artist – typically a diffusion model. These diffusion models are clever; they start with digital noise and gradually refine it into an image based on the text instructions, like sculpting from static. They got quite good at photorealism but often tripped up on nuance. They were skilled technicians but lacked the LLM's deeper understanding – it was a bit like a game of telephone between the smart AI and the image maker.
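That client-and-specialist handoff can be caricatured in a few lines of Python. This is a deliberately simplified sketch, not any real API: `llm_refine` and `diffusion_generate` are invented stand-ins, and the "image" is just a short list of numbers being nudged from noise toward a target:

```python
import random

def llm_refine(prompt: str) -> str:
    # The LLM acts as the "client": it only polishes the brief,
    # it never touches pixels itself.
    return f"highly detailed, soft lighting: {prompt}"

def diffusion_generate(brief: str, steps: int = 5) -> list:
    # The "specialist artist": start from pure static and nudge it a
    # little closer to a brief-conditioned target on every step.
    random.seed(hash(brief) % 2**32)                # brief conditions the result
    target = [random.random() for _ in range(4)]    # pretend ideal image
    image = [random.random() for _ in range(4)]     # pure noise
    for _ in range(steps):
        image = [px + 0.5 * (t - px) for px, t in zip(image, target)]
    return image

# The game of telephone: the smart model and the image model never share a brain.
image = diffusion_generate(llm_refine("a cat in a paper boat"))
```

The key limitation is visible in the structure itself: everything the LLM understands has to squeeze through that one text brief before the image model ever sees it.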
Now, enter the new generation, like the image capabilities baked right into OpenAI's GPT-4o or Google's Gemini. The real game-changer here is multimodality. This means the same intelligent AI that understands your text prompts can now also directly create and edit images. It's no longer just delegating the task; the LLM itself has picked up the digital paintbrush. Often, these systems generate images using an autoregressive process. This might sound familiar because it's similar to how they generate text: instead of predicting the next word in a sentence, they predict the next piece of the image (think pixels or visual 'tokens'), building the picture sequentially based on what's already there and the instructions. Because the core LLM understands context, relationships, and can even see images you upload, it has vastly finer control. The LLM is the artist, understanding the request deeply and painting element by element.
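The autoregressive loop itself is simple enough to sketch. Again, this is a toy under stated assumptions: the five-word "visual token" vocabulary and both functions are invented for illustration, standing in for a model that predicts image patches the same way it predicts words:

```python
import random

# Toy vocabulary of visual "tokens" – think tiny image patches,
# not words. Real models use thousands of learned patch codes.
VISUAL_TOKENS = ["sky", "grass", "cat", "boat", "water"]

def next_token(prompt: str, so_far: list) -> str:
    # Stand-in for next-token prediction: condition on the prompt AND
    # everything drawn so far, then emit the next patch.
    random.seed(hash((prompt, tuple(so_far))) % 2**32)
    return random.choice(VISUAL_TOKENS)

def generate_image(prompt: str, n_tokens: int = 6) -> list:
    tokens = []
    for _ in range(n_tokens):
        tokens.append(next_token(prompt, tokens))  # sequential, not all-at-once
    return tokens

image = generate_image("a cat in a paper boat")
```

Because each token is predicted with the full prompt and the partial image in view, the same understanding that parses your words steers every patch – which is why "add a hat" mid-conversation can actually work.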
This shift from delegation to direct creation is huge for anyone working visually. It means we can now "converse" with the AI about images, iterating and refining just like we do with text prompts. They can help turn rough sketches into polished prototypes, generate website mockups, or visualize ad concepts almost instantly. They're becoming less like a vending machine spitting out unpredictable images and more like a tireless, incredibly fast (if occasionally still quirky) creative assistant that remembers the context of your conversation. Of course, it's not all perfect pixels just yet. These models can still make mistakes, introduce weird artifacts, or struggle with highly complex scenes. And the rapid advance intensifies the already tricky questions around copyright, artistic style, the ease of creating deepfakes, and potential biases baked into the models. (Insurance fraud, anyone? It's suddenly easy to create a picture of your car with major accident damage – or to manufacture receipts for expense fraud.)
What's undeniable, though, is that the relationship between text, understanding, and image creation has fundamentally changed. By bringing image generation directly into the LLM's domain, multimodal AI offers unprecedented levels of control and opens up exciting creative avenues.
Tools for Thought
Chat and Create: Inside OpenAI’s New Integrated Image Generation
What it is: (Sorry if we sound like a broken record.) OpenAI recently rolled out a significant update, embedding image generation directly within the conversational fabric of GPT-4o. This isn't merely slapping a chat interface onto a tool like DALL-E; it represents a more fundamental integration where the AI that understands your words also directly crafts the visuals. The key difference is the seamless flow: you can now brainstorm, create, and refine images through natural conversation within the same interface, nudging the AI ("add a hat," "make the background moodier") and seeing changes in real-time. Early examples suggest impressive capabilities in managing objects within a scene and, crucially, a marked improvement in rendering readable text – moving AI visuals from often-cryptic novelties towards genuinely practical tools for communication and design.
How we use it: In practice, GPT-4o's integrated image generation shines when precision and adherence to complex instructions are paramount. While tools like Midjourney might still be the go-to for nailing a specific aesthetic vibe or discovering serendipitous creative accidents, GPT-4o excels at translating detailed prompts into coherent images, even handling multiple specific objects consistently across iterations. We're finding that "chain-of-thought" prompting—breaking down the vision step-by-step in the conversation—yields better results than packing everything into one command. Users are already experimenting widely, generating everything from photorealistic product mockups to intricate character designs. It's quickly becoming a powerful asset for rapid prototyping and visualizing specific concepts directly within a chat workflow, though be prepared for potentially longer generation times compared to more specialized diffusion models.
Sketch, Prompt, Transform: Meet Gemini Co-Drawing
What it is: Gemini Co-Drawing is a free, interactive AI designed for collaborative creation, letting you turn simple doodles into more detailed or even photorealistic images. Developed by Trudy Painter and Alexander Chen, it runs right in your browser, leveraging Google's Gemini 2.0 model's native image generation capabilities. You start by drawing directly on a digital canvas; then, using text prompts (currently in English), you instruct the model on how to refine, modify, or completely transform your sketch. It can enhance your drawing while keeping the original hand-drawn style or shift towards a different aesthetic entirely, all based on your instructions. Accessible via Hugging Face, it doesn't require any login, making it easy to jump in and experiment.
How we use it: We find Gemini Co-Drawing particularly useful for quickly visualizing ideas and adding an AI-assisted touch to creative projects. We use it as an AI alternative to Procreate, since our sketching skills still need some work. It's great for artists looking to enhance sketches, educators creating visual aids, or anyone curious about AI's creative potential. We've also seen it used to create simple animations by generating sequential frames and saving them as screenshots. While it currently only supports English prompts, it's a fun and practical tool that bridges the gap between manual drawing and advanced AI image generation.
Intriguing Stories
The Devil Wears Pixels: H&M’s AI Model Experiment
Fashion giant H&M is giving AI its runway moment, creating digital "twins" of 30 actual, human models using tech from a Swedish company called Uncut. They snap a bunch of photos, feed them to the AI gods, and create photorealistic avatars ready for their social media close-ups and ad campaigns. It's less Blade Runner dystopia, more digital dress-up. Before you panic about sentient mannequins, H&M says the human models keep the rights to their pixelated selves, get paid per use, and can even license their digital likeness to rivals. Model Mathilda Gvarliani even joked her twin is "like me, without the jet-lag". Plus, they promise watermarks, so we know we're looking at algorithms, not cheekbones IRL.
H&M chose the digital makeover mostly for efficiency and cash. Imagine skipping the logistical nightmare of photoshoots – the travel, the lights, the frantic search for the right shade of lipstick. H&M sees a way to create content faster and potentially explore wilder creative ideas. But the backlash was faster than a fast-fashion cycle. Critics and industry folks immediately cried foul, worrying about the photographers, stylists, makeup artists, and countless others whose jobs might just get Ctrl+Alt+Deleted. Sara Ziff from the Model Alliance flagged the potential to replace a whole ecosystem of creatives. Beyond jobs, there's the whole vibe check – can pixels truly replace human presence and emotion? Concerns about authenticity, even more unrealistic beauty standards, and the ethics of digital likenesses are swirling. Some online commentators have already dubbed it "soulless" and "shameful". H&M isn't the first label down this digital rabbit hole – Levi's and Mango have dabbled too, facing similar side-eye.
Ultimately, H&M's dive into digital doppelgängers feels significant. It’s a bold bet on a tech-infused future for fashion marketing, forcing uncomfortable questions about human creativity, labor, and what "real" even means anymore.
The Catwalk Gets Automated
We were just discussing H&M’s digital model twins and the ensuing existential angst about AI taking over fashion shoots. Well, hold onto your hats, because Shanghai Fashion Week just served up a new course in tech disruption. If you thought pixel-perfect avatars replacing human models were peak 2025, you might want to recalibrate. It seems the actual, physical robots are now strutting their metallic stuff on the catwalk, potentially eyeing the same jobs. Forget digital doubles; we're talking full-on C-3PO meets Project Runway.
At the recent Shanghai Fashion Week, avant-garde label NMTG decided tradition was so last season and invited a couple of special guests: Unitree Robotics' G1 humanoid robot and its four-legged pal, Go2. They weren't just carrying handbags (though, honestly, we'd pay to see that). The G1, a rather sleek humanoid standing about 127 cm tall, actually walked the catwalk. Meanwhile, its canine-inspired counterpart, Go2, wasn't just fetching slippers; it performed a nifty mid-stage flip, showing off its own custom outfit. It’s less about clothing the robots (for now) and more about showcasing what the designer called a "symbiosis of nature, humanity, and technology".
Now, before we declare the Robot Revolution officially runway-ready, let's acknowledge this isn't entirely out of the blue. Fashion has flirted with robotics before – think Alexander McQueen’s iconic 1999 show where robotic arms spray-painted a dress, or Boston Dynamics' Spot bots making a cameo at Paris Fashion Week more recently. But Shanghai’s show felt different, pushing the envelope with a humanoid robot capable of complex, AI-driven movements interacting directly and somewhat gracefully with its human counterpart. It’s moving beyond novelty into a statement about integration.
— Lauren Eve Cantor
thanks for reading!
if someone sent this to you or you haven’t done so yet, please sign up so you never miss an issue.
we’ve also started publishing more frequently on LinkedIn, and you can follow us here
if you’d like to chat further about opportunities or interest in AI, please feel free to reply.
if you have any feedback or want to engage with any of the topics discussed in Verses Over Variables, please feel free to reply to this email.
banner images created with Midjourney.