Google AI Overview keeps stumbling on letter counting

AI | May 28, 2026 at 06:00 PM

Source: PC Gamer

TLDR: Google’s upgraded AI Overview can still miscount letters in simple word queries, for example answering that there are “2 Ps in Google” (the word contains none) and then wavering or changing its answer later. The case highlights LLM limits that now surface at the top of everyday search results.

Key Takeaways:

Google expanded AI Overview to deliver more conversational, LLM-generated answers inside search results, pushing model output into users' most common queries.
A query like “How many Ps are in Google” can trigger an incorrect letter count and even unrelated number guesses, with similar spelling failures reported for “enigmatic.”
Because LLMs encode text as tokens instead of reading individual letters, counting characters within words remains hard, and Google may limit AI Overview on specific prompts while it fixes the behavior.

It is funny until it is not: the AI sounds confident, but that confidence is really a token prediction party happening on the user's screen. When the top box gets it wrong, it turns trust into a beta feature.

Q&A

If LLMs do not truly read letters, why does “counting letters” fail so consistently even when the spelling seems mostly correct?

Letter-level counting depends on reliably mapping a word to its exact sequence of characters, but transformer models operate on token patterns learned from text. They can reproduce the look of correctness while losing exact character-level accounting.
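To make the contrast concrete, here is a minimal Python sketch of what exact character-level accounting looks like when code sees the raw string. The function name `count_letter` and the token split shown in the comment are illustrative assumptions; real tokenizers split words differently, and nothing here reflects how Google's models work internally.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.casefold().count(letter.casefold())


# A model does not receive "G", "o", "o", "g", "l", "e" as separate items.
# It receives token IDs for chunks of text, for example something like
# ["Goo", "gle"] (a hypothetical split, for illustration only), so the
# individual characters are never directly visible to it.

print(count_letter("Google", "p"))     # 0
print(count_letter("Google", "g"))     # 2
print(count_letter("enigmatic", "i"))  # 2
```

With the characters in hand the task is a one-line lookup, which is why the failure looks so odd to users: the difficulty lies in the model's view of the text, not in the counting itself.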

What does Google likely change when it says it is working to fix a “known challenge” for AI Overview?

Google can adjust prompt handling, add guardrails for specific question types, retrain the model or route such requests to a different model or handler, and temporarily throttle AI Overview for letter-counting queries.
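One way to picture the routing idea is a guardrail that recognizes letter-counting questions and answers them with ordinary string code instead of a model. The sketch below is purely hypothetical: the pattern `LETTER_COUNT_QUERY` and the function `route_letter_count` are invented for illustration, cover only one phrasing of the question, and are not a description of what Google actually does.

```python
import re
from typing import Optional

# Hypothetical pattern for queries such as "How many Ps are in Google".
LETTER_COUNT_QUERY = re.compile(
    r"^\s*how many (?P<letter>[a-z])'?s? (?:are|is) (?:there )?"
    r"in (?:the word )?(?P<word>[a-z]+)\s*\??\s*$",
    re.IGNORECASE,
)


def route_letter_count(query: str) -> Optional[int]:
    """Return an exact count for letter-counting queries, else None.

    None means "not handled here", so the caller would fall back to
    whatever normally answers the query.
    """
    match = LETTER_COUNT_QUERY.match(query)
    if match is None:
        return None
    word = match.group("word").casefold()
    letter = match.group("letter").casefold()
    return word.count(letter)


print(route_letter_count("How many Ps are in Google"))  # 0
print(route_letter_count("best pizza near me"))         # None
```

A real guardrail would need far broader query coverage; the point is only that an exact-string question can be sent to an exact-string answerer.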

Why might the wrong answer disappear in one browser but persist in another?

AI Overview behavior can vary by model version, caching, feature flags, and experiment rollout. Different clients can receive different versions of ranking, safety filters, or response templates.
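A common way experiment rollouts produce this kind of split is deterministic bucketing: each client is hashed into a bucket, and only some buckets get the new behavior. The following is a generic sketch under that assumption. The names `rollout_bucket` and `is_enabled`, the experiment label, and the client identifiers are all hypothetical, and nothing here is taken from Google's systems.

```python
import hashlib


def rollout_bucket(client_id: str, experiment: str, buckets: int = 100) -> int:
    """Map a client to a stable bucket in [0, buckets) for one experiment."""
    digest = hashlib.sha256(f"{experiment}:{client_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def is_enabled(client_id: str, experiment: str, rollout_percent: int) -> bool:
    """True if this client falls inside the rolled-out share of buckets."""
    return rollout_bucket(client_id, experiment) < rollout_percent


# Two browsers with different identifiers can land in different buckets,
# so one may see the fixed behavior while the other still sees the old one.
for client in ("browser-a", "browser-b"):
    print(client, is_enabled(client, "letter-count-fix", rollout_percent=50))
```

Because the hash is stable, the same client keeps getting the same answer until the rollout percentage changes, which matches the experience of an error persisting in one browser and vanishing in another.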

What happens to user behavior when AI summaries sit above the actual links and facts?

Users may accept the AI box as a first authority, reducing clickthrough and increasing the impact of small factual slips. That can also slow feedback loops that would otherwise correct errors quickly.

How could Google verify fixes for letter counting without relying on subjective judgments of “confidence”?

It can build automated test suites for exact string tasks, add evaluation benchmarks for character-level tasks, and compare outputs against ground truth for controlled prompts before and during rollouts.
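A ground-truth comparison of this kind can be sketched in a few lines of Python. The harness below is an assumption-laden illustration: `Case`, `evaluate`, and the stand-in `flawed_model` are hypothetical names, and the stand-in simply always answers 2 to mimic the reported “2 Ps in Google” slip rather than calling any real model.

```python
from typing import Callable, Iterable, List, NamedTuple


class Case(NamedTuple):
    word: str
    letter: str


def ground_truth(case: Case) -> int:
    """Exact, case-insensitive letter count computed from the raw string."""
    return case.word.casefold().count(case.letter.casefold())


def evaluate(model_count: Callable[[str, str], int],
             cases: Iterable[Case]) -> List[Case]:
    """Return the cases where the model's count differs from ground truth."""
    return [c for c in cases if model_count(c.word, c.letter) != ground_truth(c)]


CASES = [Case("Google", "p"), Case("Google", "g"), Case("enigmatic", "i")]


def flawed_model(word: str, letter: str) -> int:
    # Stand-in for a model under test: always answers 2.
    return 2


failures = evaluate(flawed_model, CASES)
print(failures)  # [Case(word='Google', letter='p')]
```

Because the expected answers are computed rather than judged, the pass rate is objective and can be tracked across model versions without any appeal to how confident the output sounds.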
