Gate News message, April 25 — DeepSeek released preview versions of V4-Pro and V4-Flash on April 24, both open-weight models with one million token context windows. V4-Pro features 1.6 trillion total parameters but activates only 49 billion per inference pass using a Mixture-of-Experts architecture. V4-Flash has 284 billion total parameters with 13 billion active.
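For a sense of how sparse the Mixture-of-Experts activation is, the share of parameters in use can be worked out from the figures above. The short sketch below uses only the parameter counts quoted in this report; the helper name `active_fraction` is made up for illustration.

```python
# Parameter counts in billions, as quoted in the report: (total, active).
MODELS = {
    "V4-Pro": (1_600, 49),
    "V4-Flash": (284, 13),
}

def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters that are active for a given pass."""
    return active_b / total_b

for name, (total_b, active_b) in MODELS.items():
    print(f"{name}: {active_fraction(total_b, active_b):.1%} of parameters active")
```

On these numbers, V4-Pro uses roughly 3% of its weights per pass and V4-Flash roughly 4.6%.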
Pricing is far below that of competitors: V4-Pro costs $1.74 per million input tokens and $3.48 per million output tokens. That is about 94% less than OpenAI's GPT-5.5 Pro on input ($30) and about 98% less on output ($180), and roughly one-twentieth the cost of Claude Opus 4.7. V4-Flash is priced at $0.14 per million input tokens and $0.28 per million output tokens. Both models are open-source under the MIT license and can be run locally without paying API fees.
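The price gap can be checked with a few lines of arithmetic. The sketch below uses only the per-million-token prices quoted in this report; the function names and the example workload (200,000 input tokens and 20,000 output tokens) are assumptions chosen for illustration, not figures from DeepSeek or OpenAI.

```python
# USD per million tokens, as quoted in the report: (input, output).
PRICES = {
    "V4-Pro": (1.74, 3.48),
    "V4-Flash": (0.14, 0.28),
    "GPT-5.5 Pro": (30.00, 180.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted list prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

def percent_cheaper(price: float, reference: float) -> float:
    """How much lower `price` is than `reference`, in percent."""
    return (1 - price / reference) * 100

# Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 200_000, 20_000):.4f}")

pro_in, pro_out = PRICES["V4-Pro"]
gpt_in, gpt_out = PRICES["GPT-5.5 Pro"]
print(f"Input discount:  {percent_cheaper(pro_in, gpt_in):.1f}%")
print(f"Output discount: {percent_cheaper(pro_out, gpt_out):.1f}%")
```

The discount differs between input and output tokens, so the overall saving on a real workload depends on the mix of the two.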
DeepSeek achieved efficiency gains through two new attention mechanisms, Compressed Sparse Attention and Heavily Compressed Attention, which cut compute costs to 27% of those of predecessor V3.2 for V4-Pro and to 10% for V4-Flash. The company trained V4 partly on Huawei Ascend chips, circumventing U.S. export restrictions on advanced Nvidia processors. DeepSeek stated that pricing will drop further once 950 new supernodes come online later in 2026.
On performance benchmarks, V4-Pro-Max ranks first among models on Codeforces competitive programming with a score of 3,206, which would place it around 23rd among human contestants, and it scores 90.2% on Apex Shortlist math problems versus Claude Opus 4.6's 85.9%. However, it trails on broad knowledge and reasoning benchmarks: MMLU-Pro (87.5% vs Gemini-3.1-Pro's 91.0%) and Humanity's Last Exam (37.7% vs 44.4%). On long-context tasks, V4-Pro leads open-source models but loses to Claude Opus 4.6 on MRCR retrieval tests.
V4-Pro introduces "interleaved thinking," which lets agent workflows retain reasoning context across multiple tool calls instead of flushing it between steps. Both models support coding integrations with Claude Code and OpenCode. According to DeepSeek's developer survey of 85 users, 52% said V4-Pro was ready to be their default coding agent, and another 39% said they were leaning toward adoption. The old deepseek-chat and deepseek-reasoner endpoints will be retired on July 24, 2026.
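The idea of keeping reasoning across tool calls can be pictured with a toy model of an agent's context. The sketch below is purely conceptual and is not DeepSeek's API or implementation: the `Step` class, the `build_context` function, and the sample steps are all hypothetical, chosen only to contrast a context that keeps earlier reasoning with one that discards it.

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One agent step: what the model reasoned, what it called, what came back."""
    reasoning: str
    tool_call: str
    tool_result: str

def build_context(steps: list[Step], keep_reasoning: bool) -> list[tuple[str, str]]:
    """Assemble what the model would see before deciding on its next step.

    keep_reasoning=True models interleaved thinking (earlier reasoning stays).
    keep_reasoning=False models flushing reasoning after every tool call.
    """
    context: list[tuple[str, str]] = []
    for step in steps:
        if keep_reasoning:
            context.append(("reasoning", step.reasoning))
        context.append(("tool_call", step.tool_call))
        context.append(("tool_result", step.tool_result))
    return context

# Hypothetical two-step coding task.
steps = [
    Step("The failing test points at parse_date.", "read_file('dates.py')", "<file contents>"),
    Step("The format string lacks a timezone.", "run_tests()", "1 failed"),
]

interleaved = build_context(steps, keep_reasoning=True)
flushed = build_context(steps, keep_reasoning=False)
print(len(interleaved), "entries with reasoning retained")
print(len(flushed), "entries with reasoning flushed")
```

In the flushed variant, the model has to re-derive why it made each call from the tool results alone; in the retained variant, that rationale is still in front of it.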