AI is getting timid.

Your customers will notice before your safety or regulations team does.

May 24, 2026

I was testing a chatbot on a regulated B2C telco service the other day. Normal question, the kind a real customer asks several times a day: “Am I on the cheapest plan that is suitable to me?”

What I got back wasn’t an answer. It was a deflection in the polite shape of one. “I can’t provide personalised guidance on pricing. For the best results, please contact our customer service team or visit our pricing page [insert link here].” Translation: I have the data. Do the job yourself.

That refusal got signed off by someone. At some stage while building the solution, someone asked “should our AI compare plans for a logged-in customer with full account data?” and decided no. Probably safer that way.

The timid AI

There was a piece on Woshipm last week that I keep thinking about. The author called it the 怂 (“song” - soft, timid) problem, and used some good examples: an AI helper that wouldn’t tell you where the nearest car wash was because it couldn’t verify your location, and a model that hedged on a basic sum rather than commit to a number it might get wrong.

The original article called it the soul question for AI PMs: do you make users happy, or do you give them correct answers? In a regulated industry that question gets bigger, with a second side people skip past.

Because in regulated B2C, whether energy, telco, banking, or insurance, the AI can fail in two directions, and they look completely different to the customer.

Observations from interactions with different models

Google ships AI to billions of people every day. Alphabet, the parent company, is publicly listed with a history of regulatory backlash. Its AI product - Gemini - answers questions like there’s a lawyer sitting in the room vetting every word. Anthropic’s Claude, on the other hand, has a completely different “feel” to it, maybe because the company doesn’t have to ask “how will our stock react to this?” xAI is filling the void for more human or outrageous responses to capture its own niche.

I don’t think any of these approaches is wrong. But the pattern I see is roughly this: the bigger the company and the more public its exposure, the more cautious, and as a result the blander, the AI it ships. If you’re inside a regulated retailer trying to ship an AI feature, your AI is likely to be the most timid one in the room before anyone has even written a requirement.

The loud and silent failures

If your AI is too cautious you might fail the customer who asked a normal question, got a non-answer, and walked away without a solution to their issue. Next time they need something, they ring the contact centre - which is the exact thing the AI was meant to deflect from.

If it’s too confident, you might have failed the regulator. The AI quotes an outdated tariff, or uses “the wrong word” that someone finds offensive, and you have a complaint on your hands. The regulators don’t care that the model usually gets it right - they care about one wrong answer to a vulnerable customer.

One of those failures is loud, the other is completely silent. Confident-but-wrong triggers a complaint. Overcautious produces a customer who decides the tool doesn’t work and usually leaves no signal. So companies chase the loud failure because they can see it, while the silent one just costs them customers.

Why regulated industries get this wrong by default

The people building AI features in regulated B2C - in my experience, anyway - usually didn’t come from AI. They calibrate the AI against the worst thing the company has ever been fined for, not against the worst thing the customer has ever experienced.

I can see that from their perspective. It’s a perfectly rational response to incentives. The career risk of “we shipped an AI that gave bad advice” is enormous. The career risk of “we shipped an AI nobody finds useful” is roughly zero (well, except for time and resources spent).

So the risk is that you end up with an AI product so cautious that it basically turns into a search box that talks, one that customers don’t use (there are already plenty of search boxes on the market).

The handover is where the product lives

I think the job is to be genuinely helpful to the customer, inside a clear set of rules that don’t move. Which sounds obvious until you realise most regulated AI features get built without anyone asking what “genuinely helpful” actually looks like in the moment the AI can’t directly answer.

Designing what happens when AI cannot be helpful because “the regulators said you can’t” is product work, which very often gets skipped. At least for now. The Woshipm piece called this the soul question, but I think it’s the “empathetic human touch”: if your AI is refusing things a competent customer service agent would happily answer, you don’t have an AI safety problem, you have a product problem.

Overcautious AI in a chat window is a broken experience, the equivalent of saying your account number into an automated phone system that mishears half the digits, asks you to repeat, mishears them again, then comes back with “sorry I didn’t get that.” The customer rang because they needed help, and instead they got friction stacked on more friction.

Going back to the chatbot I mentioned at the start of this article. The right reply to “am I on the cheapest plan” wasn’t a silent refusal wrapped in “please contact our customer service team.” It could be something closer to: “I can see you’re trying to compare tariffs. I’m not the right tool to advise on a switch, but I’ve pre-filled the comparison form and an advisor will call you back within the hour.” What happens behind the scenes is the FUN part. That reply catches the customer’s intent and quietly automates the time-consuming and tedious bits: it analyses the customer’s usage patterns, searches for genuinely useful options, and hands the lot over to an agent who handles the callback. That is the product.
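Here is a minimal sketch, in Python, of what that behind-the-scenes step might look like. Everything in it is an assumption for illustration: the `Plan` and `Handover` types, the keyword-based `detect_intent`, and the simple fee-plus-overage cost formula are hypothetical and not taken from any real telco system. The point is only the shape: the customer gets a short non-advisory reply, while the advisor gets the ranked comparison already prepared.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    monthly_fee: float    # fixed charge per month
    included_gb: float    # data allowance per month
    per_extra_gb: float   # charge for each GB above the allowance


@dataclass
class Handover:
    customer_id: str
    intent: str
    current_plan: str
    ranked_options: list  # (plan name, estimated monthly cost), cheapest first
    customer_reply: str


def detect_intent(message: str) -> str:
    # Keyword matching stands in for a real intent classifier (assumption).
    text = message.lower()
    if "cheapest" in text or "cheaper" in text:
        return "plan_comparison"
    return "unknown"


def estimated_cost(plan: Plan, avg_gb: float) -> float:
    # Fixed fee plus overage on the customer's average monthly usage.
    extra_gb = max(0.0, avg_gb - plan.included_gb)
    return round(plan.monthly_fee + extra_gb * plan.per_extra_gb, 2)


def build_handover(customer_id, message, current_plan, monthly_gb, plans):
    """Return a Handover for the advisor, or None if there is nothing to prepare."""
    intent = detect_intent(message)
    if intent != "plan_comparison" or not monthly_gb:
        return None
    avg_gb = sum(monthly_gb) / len(monthly_gb)
    ranked = sorted(
        ((p.name, estimated_cost(p, avg_gb)) for p in plans),
        key=lambda option: option[1],
    )
    # The customer sees no recommendation; the ranking goes to the human advisor.
    reply = (
        "I'm not the right tool to advise on a switch, but I've pre-filled "
        "the comparison form and an advisor will call you back within the hour."
    )
    return Handover(customer_id, intent, current_plan, ranked, reply)
```

The design choice worth noticing is that the ranking never appears in `customer_reply`. The rule that doesn’t move (no personalised advice from the AI) stays intact, and the agent starts the callback with the tedious work already done.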

Have you ever been stuck on a call with someone who couldn’t help you, and then the “let me transfer you to someone who can help” genuinely solved your issue within 3 minutes? And you didn’t have to explain everything from scratch. That transfer is what a perfect handover looks like in practice.

The thing I keep coming back to

Your customer-facing AI is what your customers will interact with. They won’t be thinking “is this ticking regulatory boxes?” They grade it on whether it answered the question, or got them closer to the answer. If it cannot solve the customer’s problem, they ring up - and your AI investment turns into a cost line with nothing on the other side, while the contact centre quietly carries the load it was meant to relieve.

The regulator isn’t grading you on customer satisfaction, yet that is what ultimately brings in more customers for the business.

I keep thinking about that chatbot that wouldn’t tell me whether I was on the cheapest plan. Someone built that and I’d quite like to read their requirements document and the challenges they were facing to understand this better.

Artur’s Substack
