DeepSeek slashes V4-Pro pricing, igniting pressure on AI costs

AI | May 24, 2026 at 05:30 PM

Source: Digital Trends

TLDR: BEIJING - DeepSeek has permanently cut the price of its V4-Pro AI model by 75%, to between 0.025 and 6 yuan per million tokens. Developers may see far lower inference bills, and the market is now watching Huawei Ascend chip supply and chip access constraints.

Key Takeaways:

DeepSeek faced high inference costs on V4-Pro, tied to limited access to top-tier compute, especially compared with its cheaper Flash model.
V4-Pro now costs 0.025 to 6 yuan per million tokens, down from 0.1 to 24 yuan, depending on workload type.
If Huawei Ascend 950 supply improves under U.S. export limits, global providers could face a faster, harsher AI price war.
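
To make the arithmetic behind the headline figure concrete, here is a minimal sketch that checks the quoted price ranges against the 75% cut and estimates a bill before and after. The prices are the ones quoted above; the 500-million-token workload and the helper names `discount` and `bill_yuan` are illustrative assumptions, not figures or an API from DeepSeek.

```python
# Prices in yuan per million tokens, as quoted in the takeaways above.
OLD_RANGE = (0.1, 24.0)    # before the cut (low end, high end)
NEW_RANGE = (0.025, 6.0)   # after the cut (low end, high end)


def discount(old_price: float, new_price: float) -> float:
    """Fractional price cut when going from old_price to new_price."""
    return 1.0 - new_price / old_price


def bill_yuan(tokens: int, price_per_million: float) -> float:
    """Cost in yuan for a given token count at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Both ends of the range should work out to the same 75% reduction.
low_cut = discount(OLD_RANGE[0], NEW_RANGE[0])
high_cut = discount(OLD_RANGE[1], NEW_RANGE[1])

# Hypothetical workload (an assumption for illustration only):
# 500 million tokens billed at the top of each price range.
tokens = 500_000_000
old_bill = bill_yuan(tokens, OLD_RANGE[1])
new_bill = bill_yuan(tokens, NEW_RANGE[1])

print(f"cut at low end:  {low_cut:.0%}")
print(f"cut at high end: {high_cut:.0%}")
print(f"bill before: {old_bill:.0f} yuan, after: {new_bill:.0f} yuan")
```

Which end of the range applies depends on workload type, so a real estimate would need the per-workload rates rather than the two extremes used here.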

DeepSeek’s 75% cut reads like a quiet victory over compute bottlenecks, not just a promotion. If Huawei Ascend delivers enough compute at scale, low pricing could stop being a premium feature and turn into a race to the bottom.

Q&A

If inference suddenly gets cheaper, what happens to who can afford to build AI products?

Lower per-token costs can shift the advantage toward smaller teams that run more experiments, but only if reliability and latency stay stable at higher usage volumes.

Will DeepSeek’s pricing cut force others to match it, or will they protect margins with different model tiers?

They may split offerings into capability tiers, but the most direct way to answer the competitive pressure is to match per-token pricing for comparable quality.

Why does chip supply matter even after DeepSeek changes prices?

Pricing reflects both unit hardware cost and capacity planning, so demand spikes can expose bottlenecks if compute supply does not scale alongside usage.

How might Huawei Ascend 950 capacity change the balance between Chinese AI firms and global leaders?

More dependable local compute can let Chinese providers scale inference and iterate faster, raising competitive expectations for cost and throughput.

What is the next likely move after a permanent price slash?

Expect stronger competition on sustained availability, context length, and tool use, since cheaper inference pushes buyers to benchmark total system performance, not just headline model pricing.