Meta had a comeback - arguably not opensource, but still - but Deepseek just seems to have vanished from the scene. What happened? Will we ever see Deepseek V4?
Source: https://x.com/i/status/2041458478569689589
Tested Gemma 4 (31B) on our benchmark. Genuinely did not expect this. 100% survival, 5 out of 5 runs profitable, +1,144% median ROI. At $0.20 per run. It outperforms GPT-5.2 ($4.43/run), Gemini 3 Pro ($2.95/run), Sonnet 4.6 ($7.90/run), and absolutely destroys every Chinese open-source model we've tested — Qwen 3.5 397B, Qwen 3.5 9B, DeepSeek V3.2, GLM-5. None of them even survive consistently. The only model that beats Gemma 4 is Opus 4.6 at $36 per run. That's 180× more expensive. 31 billion parameters. Twenty cents. We double-checked the config, the prompt, the model ID — everything is identical to every other model on the leaderboard. Same seed, same tools, same simulation. It's just this good. Strongly recommend trying it for your agentic workflows. We've tested 22 models so far and this is by far the best cost-to-performance ratio we've ever seen. Full breakdown with charts and day-by-day analysis: [foodtruckbench.com/blog/gemma-4-31b](https://foodtruckbench.com/blog/gemma-4-31b) *FoodTruck Bench is an AI business simulation benchmark — the agent runs a food truck for 30 days, making decisions about location, menu, pricing, staff, and inventory. Leaderboard at* [*foodtruckbench.com*](https://foodtruckbench.com) **EDIT — Gemma 4 26B A4B results are in.** Lots of you asked about the 26B A4B variant. Ran 5 simulations, here's the honest picture: **60% survival** (3/5 completed, 2 bankrupt). Median ROI: +119%, Net Worth: $4,386. Cost: $0.31/run. Placed #7 on the leaderboard — above every Chinese model and Sonnet 4.5, below everything else. Both bankruptcies were loan defaults — same pattern we see across models. The 3 surviving runs were solid, especially the best one at +296% ROI. **But here's the catch.** The 26B A4B is the only model out of 23 tested that required custom output sanitization to function. It produces valid tool-call intent, but the JSON formatting is consistently broken — malformed quotes, trailing garbage tokens, invalid escapes. I had to build a 3-stage sanitizer specifically for this model. No other model needed anything like this. The business decisions themselves are unmodified — the sanitizer only fixes JSON formatting, not strategy. But if you're planning to use this model in agentic workflows, be prepared to handle its output format. It does not produce clean function calls out of the box. **TL;DR:** 31B dense → 100% survival, $0.20/run, #3 overall. 26B A4B → 60% survival, $0.31/run, #7 overall, but requires custom output parsing. The 31B is the clear winner. Updated leaderboard: foodtruckbench.com
I'm mind blown by the fact that about a year ago DeepSeek R1 came out with a MoE architecture at 671B parameters and today Gemma 4 MoE is only 26B and is genuinely impressive. It's 25 times smaller, but is it 25 times worse? I'm exited about the future of local LLMs.
[Translated by Nano Banana ](https://preview.redd.it/cgcrj6z2n6rg1.png?width=1138&format=png&auto=webp&s=9062bd60f8870f53efae287e94d9d3d198e452e9) https://preview.redd.it/8bfh5zk1q6rg1.png?width=1158&format=png&auto=webp&s=9d8e6c2f285ba04527f0e9578f9ca7b75124c11f https://preview.redd.it/jpa7aikcr6rg1.png?width=688&format=png&auto=webp&s=2a35594f8ff5eb5f2cd18ad2f4de6662f2898b1d **Note: The employee just deleted his reply; it seems he said something he shouldn't have.** **Original post:** [**http://xhslink.com/o/3ct3YOygvNN**](http://xhslink.com/o/3ct3YOygvNN)
Recently, heavy-hitting news regarding a major personnel change has emerged in the field of Large Language Models (LLMs): **Daya Guo**, a core researcher at DeepSeek and one of the primary authors of the DeepSeek-R1 paper, has reportedly resigned. Public records show that Daya Guo possesses an exceptionally distinguished academic background. He obtained his PhD from Sun Yat-sen University in 2023, where he was mentored by Professor Jian Yin and co-trained by Ming Zhou, the former Deputy Dean of Microsoft Research Asia (MSRA). Daya Guo officially joined DeepSeek in July 2024, focusing his research on Code Intelligence and the reasoning capabilities of Large Language Models. During his tenure at DeepSeek, Guo demonstrated remarkable scientific talent and was deeply involved in several of the company’s milestone projects, including **DeepSeekMath**, **DeepSeek-V3**, and the globally acclaimed **DeepSeek-R1**. Notably, the research findings related to DeepSeek-R1 successfully graced the cover of the top international scientific journal **Nature** in 2025, with Daya Guo serving as one of the core authors of the paper. Regarding his next destination, several versions are currently circulating within the industry. Some reports suggest he has joined Baidu, while other rumors indicate he has chosen ByteDance. As of now, neither the relevant companies nor Daya Guo himself have issued an official response. External observers generally speculate that the loss of such core talent may be related to the intense "talent war" and competitive compensation packages within the LLM sector. As the global AI race reaches a fever pitch, leading internet giants are offering highly lucrative salaries and resource packages to secure top-tier talent with proven practical experience. Insiders point to two primary factors driving Guo’s departure: 1. **Computing Resources**: Despite DeepSeek's efficiency, the sheer volume of computing power available at the largest tech giants remains a significant draw for researchers pushing the boundaries of LLM reasoning. 2. **Compensation Issues**: Reports indicate a "salary inversion" within the company, where newer hires were reportedly receiving higher compensation packages than established core members. The departure may not be an isolated incident. Rumors are circulating that other "important figures" within DeepSeek are currently in talks with major tech firms, seeking roles with larger "scope" and better resources. As the global AI race reaches a fever pitch, the ability of "AI unicorns" to retain top-tier talent against the massive resources of established internet giants is facing its toughest test yet. Source from some Chinese news: [https://www.zhihu.com/pin/2018475381884200731](https://www.zhihu.com/pin/2018475381884200731) [https://news.futunn.com/hk/post/70411035?level=1&data\_ticket=1771727651415532](https://news.futunn.com/hk/post/70411035?level=1&data_ticket=1771727651415532) [https://www.jiqizhixin.com/articles/2026-03-21-2](https://www.jiqizhixin.com/articles/2026-03-21-2) [https://www.xiaohongshu.com/discovery/item/69bd211c00000000230111fb?source=webshare&xhsshare=pc\_web&xsec\_token=CBbUil7jGmHR\_sMr3sM56dYn9utmWYYN11mYMfe6FL0Cw=&xsec\_source=pc\_share](https://www.xiaohongshu.com/discovery/item/69bd211c00000000230111fb?source=webshare&xhsshare=pc_web&xsec_token=CBbUil7jGmHR_sMr3sM56dYn9utmWYYN11mYMfe6FL0Cw=&xsec_source=pc_share)