4.8 thoughts on Opus 4.8 and Codex Sites
If you haven't used the Codex desktop app to control your computer yet, you're missing out... plus, other updates and thoughts on the past week in AI!
Hey, y’all -- Sherveen here. Grab bag of AI takes and updates for you today!
1: OpenAI updates + upgrades Codex
OpenAI released a slew of updates to their Codex app today, making it even more relevant to non-engineers for general work and productivity. That includes new role-specific plugins (e.g., “creative production” for marketing + creative teams) and annotations for files and sites in the Codex canvas.
The most interesting, though — hosted sites! Beginning in research preview for business & enterprise, Codex will now be capable of publishing anything you build to the web. That includes dashboards, apps, planners, etc. And just like sharing a Google Doc, you’ll get to make any ‘site’ publicly available (via the URL) or share with specific collaborators (eventually).
You’ll no longer have to figure out an export or push your project to a separate hosting provider.
I think this is a huge deal.
Not only does it compete with other site-builders that made “one-click deployment” a huge part of their value proposition (Replit, Lovable, Bolt, etc.), but it also lets a non-technical user take a quick dashboard and share it with their team, or a kid build a video game and send it to their friends.
This sort of “quick share” feature has, until now, been limited — like in Claude’s Artifact system — or required prior understanding of deployment.
I expect we’ll look back on this (and Anthropic’s inevitable version of it) as the beginning of an explosion of micro-apps and experiences in our personal and professional lives.
2: Thoughts on a week with Opus 4.8
It’s been almost a week since Anthropic released their latest model, Opus 4.8, to mixed reviews from power users online. I’m here to add to that feeling — it’s a great model, but it’s got some surprising quirks. Here’s my take:
It’s the least-Claude Claude model ever. There’s always been a subtle but pervasive tone to Sonnet and Opus that I’d call conversationally casual. It’s missing here, and that’s probably fine if it continues to exist in the Sonnet line of models, but I do miss it.
It’s very good at agentic tasks, but it doesn’t seem better than GPT-5.5. I can’t tell if that’s because Codex, the agentic harness for GPT-5.5, is just so much better than Claude Cowork + Claude Code. It feels like OpenAI continues to just understand something about making a model work through tasks that Anthropic and Google are struggling to figure out.
It’s the first model where I’m not always on max-thinking mode. Power users have said this before about other models — that they’re actually better on lower-thinking modes than on their max setting — but I have never agreed. Until now. I find Opus 4.8 on Extra effort to be more effective than on Max when doing coding work. I have some theories on why, but I want to experiment for a while longer before making my claims. Stay tuned.
3: “Claws” versus Codex & Claude
Everyone’s building or upgrading a “Claw” — Hermes Agent got a desktop app, Microsoft introduced Microsoft Scout, and there are dozens of other personal agents trying to recreate the hype of OpenClaw.
But… the Codex and Claude desktop apps from OpenAI and Anthropic are improving fast, and they bring a reliability and security posture that the “always-on, always-doing-something” agents just can’t achieve.
You can build scheduled or triggered tasks in both apps, have them control your computer, and use them remotely from your phone. The reality: the best version of OpenClaw might not be a “claw” at all, but instead these desktop productivity agents that can do all of the same things without starting off with a heartbeat.
(heartbeat: the ‘cron job’ that wakes OpenClaw up every 5 minutes and encourages it to do something productive for the user, which can lead to unintended outcomes)
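To make the heartbeat distinction concrete, here's a rough Python sketch. It's my own illustration, not how OpenClaw, Codex, or Claude actually implement anything: the names `HeartbeatAgent`, `TriggeredAgent`, and `simulate` are made up, and the only number borrowed from above is the 5-minute interval. It runs a fake clock for an hour and counts how often each style of agent wakes up.

```python
from dataclasses import dataclass, field

HEARTBEAT_SECONDS = 5 * 60  # the 5-minute interval described above


@dataclass
class HeartbeatAgent:
    """Hypothetical always-on agent: wakes on every heartbeat, work or no work."""
    wakeups: list = field(default_factory=list)

    def on_tick(self, now: int) -> None:
        if now % HEARTBEAT_SECONDS == 0:
            self.wakeups.append(now)


@dataclass
class TriggeredAgent:
    """Hypothetical desktop-style agent: wakes only at scheduled or triggered times."""
    schedule: set = field(default_factory=set)
    wakeups: list = field(default_factory=list)

    def on_tick(self, now: int) -> None:
        if now in self.schedule:
            self.wakeups.append(now)


def simulate(agent, duration_seconds: int) -> int:
    """Advance a fake clock one second at a time; return the number of wake-ups."""
    for now in range(1, duration_seconds + 1):
        agent.on_tick(now)
    return len(agent.wakeups)


if __name__ == "__main__":
    hour = 60 * 60
    print(simulate(HeartbeatAgent(), hour))             # 12
    print(simulate(TriggeredAgent({900, 2700}), hour))  # 2
```

Every one of those 12 wake-ups is a chance for the agent to decide something needs doing. The triggered agent only acts on the two moments you asked for.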
4: Dynamic harnesses as a next step
In an interesting blog post on X, Anthropic’s Thariq Shihipar outlines the philosophy behind their new dynamic workflows feature. We already knew how it worked: Claude would take your goal, orchestrate a plan, spin up many subagents, and check in on the goal as a supervisor as the subagents executed that plan.
Thariq’s re-framing of it got my attention, though:
Workflows allow you to dynamically create harnesses that enable Claude to solve all of those problems and more natively inside of Claude Code. You can also share and re-use these workflows with others.
In other words, when you ask for a workflow to do, say, accounting work, the main orchestrating Claude agent is going to try to set up subagent instructions and a rubric for success that’s all about accounting — overriding default, irrelevant behavior.
We typically use the term “harness” to refer to the surrounding “application” around an agent. For example, the Codex desktop app and Claude Code are both harnesses around individual models, like GPT-5.5 or Opus 4.8.
These harnesses provide the model with a specific set of hard-coded tools and instructions, and harness development has turned out to be huge for giving us the sort of new agentic outcomes we’ve been seeing since 2025.
But when Thariq suggested this almost “mini-harness-on-the-fly” concept in his post, where we’re not building a whole new application but still attempting to “surround” an agent (or many subagents) with a fit-for-purpose set of instructions…
It got me thinking that this might be the start of a trend worth paying attention to.
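Here's how I picture the “mini-harness-on-the-fly” idea, as a small Python sketch. To be clear, this is an assumption-heavy toy, not Anthropic's implementation: `Harness`, `build_harness`, `passes`, `run_workflow`, and `stub_subagent` are all hypothetical names, the rubric is a crude keyword check, and no real model is called.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Harness:
    """Fit-for-purpose wrapper: subagent instructions plus a rubric for success."""
    instructions: str
    rubric: tuple  # phrases a result must contain to pass (deliberately crude)


def build_harness(goal: str, domain: str) -> Harness:
    """Hypothetical orchestrator step: turn a goal into domain-specific setup."""
    rubrics = {"accounting": ("debits", "credits")}
    return Harness(
        instructions=f"You are a {domain} specialist. Goal: {goal}",
        rubric=rubrics.get(domain, ()),
    )


def passes(harness: Harness, result: str) -> bool:
    """Supervisor check: does the result satisfy every rubric item?"""
    return all(item in result.lower() for item in harness.rubric)


def run_workflow(goal: str, domain: str, subtasks: list,
                 subagent: Callable[[str, str], str]) -> dict:
    """Build the harness once, fan subtasks out to subagents, then grade each result."""
    harness = build_harness(goal, domain)
    results = {task: subagent(harness.instructions, task) for task in subtasks}
    return {task: passes(harness, out) for task, out in results.items()}


def stub_subagent(instructions: str, task: str) -> str:
    """Stand-in for a real model call, which this sketch does not make."""
    if task == "reconcile ledger":
        return "Debits and credits match."
    return "Done."


if __name__ == "__main__":
    report = run_workflow(
        "close the books", "accounting",
        ["reconcile ledger", "draft summary"], stub_subagent,
    )
    print(report)  # {'reconcile ledger': True, 'draft summary': False}
```

The point of the toy: the instructions and the rubric are generated per goal, while the application around them stays the same.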
.8: ChatGPT adds “job search”?
I’ve only poked at it so far, but OpenAI has added more job search support to ChatGPT through deeper integrations with platforms like Indeed and Upwork.
It’s interesting both because they seem to think of it as an important enough use case to pay attention to, and because it aligns with something I’ve said throughout the year: 2026 will be the year of the use case (“throughout mid-2026, I expect these companies to keep competing for specific use cases and workflows that they think might benefit from more targeted user experiences”).
Told ya so, and worth paying attention to.
OK, that’s all for now — happy Tuesday for those who celebrate!
Working in flow, dynamically,
Sherveen





