Gemini 3.5 Flash is out and you can't afford it!

Google's latest model might be a sign of things to come re: AI pricing.

May 20, 2026

Hey, y’all -- Sherveen here.

Yesterday, Google revealed Gemini 3.5 Flash, the first release in their latest 3.5 family of models.

It’s fast, like prior Flash versions of Gemini models, and it’s supposedly more agentic -- but in my own testing so far, and going by the sentiment of other AI power users online, it’s quite disappointing in practice.

There’s a lot to analyze there, and we’re still waiting on Gemini 3.5 Pro to see Google’s frontier-level intelligence.

In the meantime, another detail about 3.5 Flash is worth paying attention to:

It is three times more expensive than the last version, Gemini 3 Flash. But that’s just the list price for tokens. It’s even worse than that.

We no longer live with the basic prompt-response chatbots of 2022. Now, we use agentic products that allow the models to plan, take many steps, use tools, etc. So, the price isn’t just the list price for tokens -- these newer models can spend a lot of tokens to achieve just one result, and depending on how efficiently they plan and execute those steps, a seemingly cheaper model (per token) can be more expensive (on the whole).

In other words, ‘sloppier’ models have to think a lot, correct errors, etc., and every one of those extra steps is billed in tokens.
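
To make that arithmetic concrete, here's a minimal sketch. Every number in it is made up for illustration (the model names, the prices, and the token counts are all hypothetical, not real figures for any model); the only point is that cost per finished task is tokens spent times price per token.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    usd_per_million_tokens: float  # blended list price (hypothetical)
    tokens_per_task: int           # total tokens burned to finish one task (hypothetical)


def cost_per_task(model: Model) -> float:
    """Effective cost of one finished task: tokens spent times price per token."""
    return model.tokens_per_task * model.usd_per_million_tokens / 1_000_000


# Two invented models: one pricier per token but efficient,
# one cheaper per token but needing 4x the tokens to get there.
tidy = Model("tidy-but-pricey", usd_per_million_tokens=10.0, tokens_per_task=50_000)
sloppy = Model("cheap-but-sloppy", usd_per_million_tokens=4.0, tokens_per_task=200_000)

for m in (tidy, sloppy):
    print(f"{m.name}: ${cost_per_task(m):.2f} per task")
```

With these invented inputs, the model that is 60% cheaper per token ends up costing more per task, because it spends four times as many tokens getting to the same result.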

Artificial Analysis is a company that benchmarks AI models on intelligence, cost, speed, and other factors. They have a useful metric here: the cost to “run all evaluations in the Artificial Analysis Intelligence Index” -- in other words, how much does it cost to get each model through the identical suite of tests, given the models have to think, use tools, and achieve the same end results?

By this measure, 3.5 Flash is more expensive than Opus 4.7 at standard settings, GPT-5.5 Medium, Gemini 3.1 Pro, and Kimi K2.5. And to be clear, I’d rate it below all of these models (thus far).

If you’re using it within the Gemini app, you’re going to be fine, but if you’re using it for code, using it in a third-party app, or implementing it inside your own applications, this is a major cost + usage increase without clear benefit.

We’ll have to see where Gemini 3.5 Pro lands, but this doesn’t make me optimistic. This is also happening at the same time that Anthropic continues to crack down on token usage within their subscription plans.

I’ve been talking about this for a while now (read: The Price of (Artificial) Intelligence) -- as consumers, we haven’t reckoned with the unique pricing that comes with ever-larger models and more dynamic applications + usage.

As we reach new tiers in both model size and agentic capability, there’s going to be a bifurcation in who can pay for -- and access -- different levels of AI, from intelligence to speed to reasoning and beyond.

Something worth paying attention to.

Stay frosty,
Sherveen

AI Muscle
