Meta genai org in panic mode
Short: Deepseek is a Chinese company that has an AI model, according to one Hacker News commenter:
The DeepSeek folks just showed the world how to do the same thing those teams do, but at ~99% lower cost -- and published all code and weights as free open-source.
Post: https://www.teamblind.com/post/Meta-genai-org-in-panic-mode-KccnF41n
It started with deepseek v3, which rendered the Llama 4 already behind in benchmarks. Adding insult to injury was the "unknown Chinese company with 5..5 million training budget"
Engineers are moving frantically to dissect deepseek and copy anything and everything we can from it. I'm not even exaggerating.
More about deepseek
https://en.wikipedia.org/wiki/DeepSeek
Github (open source)
https://github.com/deepseek-ai/DeepSeek-V3
DeepSeek claims its reasoning model beats OpenAIs o1 on certain benchmarks
https://techcrunch.com/2025/01/20/deepseek-claims-its-reasoning-model-beats-openais-o1-on-certain-benchmarks/
DeepSeeks new AI model appears to be one of the best open challengers yet
https://techcrunch.com/2024/12/26/deepseeks-new-ai-model-appears-to-be-one-of-the-best-open-challengers-yet/
The model, DeepSeek V3, was developed by the AI firm DeepSeek and was released on Wednesday under a permissive license that allows developers to download and modify it for most applications, including commercial ones.
DeepSeek V3 can handle a range of text-based workloads and tasks, like coding, translating, and writing essays and emails from a descriptive prompt.
According to DeepSeeks internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models and closed AI models that can only be accessed through an API. In a subset of coding competitions hosted on Codeforces, a platform for programming contests, DeepSeek outperforms other models, including Metas Llama 3.1 405B, OpenAIs GPT-4o, and Alibabas Qwen 2.5 72B.
DeepSeek V3 also crushes the competition on Aider Polyglot, a test designed to measure, among other things, whether a model can successfully write new code that integrates into existing code.