Published 2 months ago

Inside Google’s AI Search Overhaul: Tokenization, Spelling Fails and Trust Problems

Google can’t spell its own name.

According to Google’s AI Overview, the word “Google” contains two P’s. There is “exactly 1 ‘r’ in the word ‘poop.’” The word “journalism” apparently has two D’s — spelled out as j-o-u-r-n-a-d-i-s-m. And the sitting U.S. president’s last name? t-r-p-u-m.

These aren’t edge cases buried in stress tests. These are live results from the most-used search engine on the planet, now powered by generative AI.

Key Highlights

Google’s AI Overviews expose how LLM tokenization breaks basic spelling tasks.
Visible AI search errors deepen the trust gap just as Google makes AI the default.
AI tool buyers must test specific capabilities and add verification layers by design.

Google’s AI-First Bet Is Already Stumbling

This isn’t Google’s first rodeo with AI Overview embarrassments. The first rollout saw the feature citing satirical Onion articles and Reddit posts — advising users to eat rocks and put glue on pizza. That was bad. This is different.

Google is doubling down. Generative AI is now the centerpiece of a 29-year-old flagship product that billions of people rely on daily. The stakes are higher, the scrutiny is sharper, and the failures are more visible.

Last week, searching the word “disregard” returned what looked like a dictionary definition — except the definition read: “Understood. Let me know whenever you have a new prompt or question!” Google patched that one quickly. The spelling errors, however, have proven far more stubborn.

“Counting within words has been a known challenge for LLMs, and we’re working to fix this particular issue,” Google told TechCrunch.

That’s a careful way of saying: we don’t have a clean solution yet.

Why AI Can’t Spell — And It’s Not a Bug You Can Just Patch

Here’s the thing most people don’t realize: LLMs were never designed to spell.

The running joke in AI circles is that whenever a new model drops, you ask it how many R’s are in “strawberry.” Models that can write production-ready code, solve advanced mathematics, and synthesize research papers will fumble that question like a kindergartener.

The reason goes deeper than a software glitch.

Tokens, Not Letters

Most large language models are built on transformer architectures. These models don’t read text the way humans do — letter by letter, word by word. Instead, they break language into tokens, which can be full words, partial words, syllables, or individual characters depending on the model.

When a prompt enters the system, the text is split into tokens and each token is converted into a numerical encoding, a vector that represents meaning and context. The model then predicts the statistically most likely next token based on those encodings.

“When it sees the word ‘the,’ it has this one encoding of what ‘the’ means, but it does not know about ‘T,’ ‘H,’ ‘E,’” explained Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta.

That’s the core issue. The model doesn’t experience letters as discrete units. It experiences compressed semantic chunks. Asking it to count letters is like asking someone to describe a painting they’ve only ever seen described in a book — the information is approximate, not precise.
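To make the token idea concrete, here is a minimal sketch of how a word turns into IDs. It assumes a tiny hand-written vocabulary (`TOY_VOCAB`) and a greedy longest-match rule, both hypothetical; production tokenizers learn far larger vocabularies from data and use different splitting algorithms. The point is only that the model receives a short list of numbers, not a sequence of letters.

```python
# Hypothetical toy vocabulary for illustration only. Real tokenizers learn
# tens of thousands of entries from data and split text differently.
TOY_VOCAB = {
    "straw": 0, "berry": 1, "the": 2,
    "s": 3, "t": 4, "r": 5, "a": 6, "w": 7,
    "b": 8, "e": 9, "y": 10, "h": 11,
}


def tokenize(text, vocab=TOY_VOCAB):
    """Split text into vocabulary pieces using greedy longest-match."""
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until an entry matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                pieces.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return pieces


def encode(text, vocab=TOY_VOCAB):
    """Map text to the list of token IDs a model would actually receive."""
    return [vocab[piece] for piece in tokenize(text, vocab)]


pieces = tokenize("strawberry")  # two chunks: "straw" and "berry"
ids = encode("strawberry")       # two integers, one per chunk
print(pieces, ids)

# Counting letters is trivial on the raw string, but the IDs carry no
# direct record of which characters each chunk contained.
print("strawberry".count("r"))
```

In this toy setup, "strawberry" reaches the model as two IDs. Nothing in those two numbers says that three R's are spread across the chunks, which is the gap Guzdial describes.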

The Tokenizer Problem Has No Clean Fix

Even if researchers wanted to redesign tokenization from scratch, the problem doesn’t disappear.

“My guess would be that there’s no such thing as a perfect tokenizer due to this kind of fuzziness,” said Sheridan Feucht, a PhD student studying LLM interpretability at Northeastern University. “Even if we got human experts to agree on a perfect token vocabulary, models would probably still find it useful to ‘chunk’ things even further.”

This isn’t a bug waiting for a patch. It’s an architectural constraint baked into how these systems process language.

What This Actually Means for AI Tool Buyers and Builders

Spelling errors are easy to laugh at. But they point to something more important for anyone evaluating AI tools right now.

AI systems have hard ceilings — and those ceilings aren’t always obvious.

A tool can perform brilliantly on complex reasoning tasks and fail completely on something a ten-year-old handles effortlessly. That asymmetry is disorienting. It makes AI feel unpredictable, which erodes trust — especially in high-stakes workflows.

For founders and marketers integrating AI into their products or pipelines, this matters practically:

Don’t assume capability generalizes. A model that excels at summarization may hallucinate on character-level tasks. Test specifically for what you need.
Build verification layers. AI outputs in customer-facing contexts need human review or automated validation — not as a precaution, but as a baseline requirement.
Understand the architecture before you trust the output. Tokenization, training data, and model design all shape what a tool can and cannot do reliably.
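As one illustration of an automated validation layer, the sketch below checks character-level claims with ordinary string operations before they reach a user. The function names are hypothetical, and it assumes the word, letter, and claimed count (or spelled-out form) have already been extracted from the model's answer by some other step. The sample inputs are the errors reported at the top of this article.

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def check_count_claim(word: str, letter: str, claimed: int) -> bool:
    """Return True only if the claimed letter count matches the real one."""
    return count_letter(word, letter) == claimed


def check_spelling_claim(word: str, spelled_out: str, sep: str = "-") -> bool:
    """Return True only if a spelled-out form like 'c-a-t' rebuilds the word."""
    letters = spelled_out.lower().split(sep)
    return "".join(letters) == word.lower()


# Claims taken from the examples reported in this article.
claims_ok = [
    check_count_claim("Google", "p", 2),
    check_count_claim("poop", "r", 1),
    check_spelling_claim("journalism", "j-o-u-r-n-a-d-i-s-m"),
]

# Any failed check means the answer should be blocked or sent for review.
if not all(claims_ok):
    print("At least one character-level claim failed validation")
```

A check like this is cheap because the task is deterministic. The same pattern applies to any capability where a conventional program can compute the right answer and the model cannot be trusted to.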

The Trust Gap Is the Real Problem

Google’s spelling failures aren’t an existential crisis for AI. Researchers themselves acknowledge that spelling accuracy isn’t where LLM utility lives.

But these visible, public failures do something damaging: they remind users that AI is not an all-knowing oracle. And right now, that reminder is arriving at exactly the moment Google is asking billions of people to trust AI Overviews as their primary source of information.

That’s a difficult ask when the system can’t correctly spell the name of the company that built it.

The Takeaway

AI tools are genuinely powerful. They’re also genuinely limited — and the limitations aren’t always where you’d expect them.

The smarter move isn’t to dismiss generative AI because it misspells “journalism.” It’s to understand why it misspells “journalism,” so you can deploy these tools where they actually deliver value and build safeguards where they don’t.

Observe the tool. Understand the architecture. Choose smarter.

Lukas_M

