DeepSeek's open-source AI model DeepSeek-R1 rivals OpenAI's o1

Hey everyone, just wanted to share some exciting news! DeepSeek, a Chinese AI lab, has released its latest open-source reasoning models, DeepSeek-R1 and DeepSeek-R1-Zero. The goal is to make advanced reasoning more accessible, and DeepSeek claims that DeepSeek-R1 matches, and in places beats, OpenAI’s o1.

Built on their earlier DeepSeek-V3 model, R1 reportedly surpasses o1 on some pretty tough benchmarks like AIME, MATH-500, and SWE-bench Verified.

Here’s a breakdown of the results DeepSeek reports:

  • 79.8% on the 2024 American Invitational Mathematics Examination (AIME), slightly ahead of o1’s 79.2%.

  • 97.3% success rate on MATH-500, beating o1’s 96.4%.

  • A Codeforces rating of 2,029, outperforming 96.3% of human programmers, just a bit behind o1’s 96.6%.

In general knowledge, R1 scored 90.8% on the MMLU benchmark, which is very close to o1’s 91.8%.

The model weights are available on Hugging Face under an MIT license, so they’re free to use, even commercially.
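If you want to grab the weights yourself, something like the sketch below should work. The repo id matches DeepSeek’s Hugging Face organization, but treat it as an assumption worth double-checking, and be warned that the full checkpoint is enormous.

```python
# Minimal sketch using the huggingface_hub client
# (`pip install huggingface_hub`). Warning: the full R1 checkpoint
# runs to hundreds of gigabytes; the distilled models covered below
# are far more practical for local use.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1",  # assumed repo id; verify on Hugging Face
    local_dir="DeepSeek-R1",
)
```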

R1 can check its own reasoning as it works, which is awesome because it cuts down on the errors you see in models that don’t reason step by step. It’s also enormous: 671 billion parameters in total, though as a mixture-of-experts model only about 37 billion of them are active for any given token.

DeepSeek has also released smaller, “distilled” versions of R1 based on Llama and Qwen. These range from 1.5 billion to 70 billion parameters, meaning even the smallest one can run on your laptop. You can find these on Hugging Face too. The full R1 needs some serious hardware, but you can also access it via DeepSeek’s API at a cost the company says is 90% to 95% lower than o1’s; see the sketches below for both routes.
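Here’s a minimal sketch of running the smallest distilled model locally with the transformers library. The checkpoint name is taken from DeepSeek’s Hugging Face page, and the prompt and token budget are just illustrative.

```python
# A minimal local sketch (`pip install transformers torch`); the 1.5B
# distilled checkpoint is small enough to run on an ordinary laptop.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
)

# R1-style models emit a long chain of thought before the final answer,
# so leave plenty of room for new tokens.
chat = [{"role": "user", "content": "What is 17 * 24? Think it through."}]
out = generator(chat, max_new_tokens=512)
print(out[0]["generated_text"][-1]["content"])  # the model's reply
```

And if you’d rather take the cheap hosted route, DeepSeek’s API is OpenAI-compatible. The endpoint and model name below come from DeepSeek’s public docs, but double-check them before relying on this:

```python
# Hosted sketch (`pip install openai`); swap in your own API key.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder
    base_url="https://api.deepseek.com",  # DeepSeek's OpenAI-compatible endpoint
)
resp = client.chat.completions.create(
    model="deepseek-reasoner",  # the hosted R1 model
    messages=[{"role": "user", "content": "How many primes are below 50?"}],
)
print(resp.choices[0].message.content)
```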

Here’s something cool about how it was trained: DeepSeek-R1-Zero learned to reason through pure reinforcement learning, with no supervised fine-tuning at all, while R1 adds a small supervised “cold start” stage before the RL to improve readability. Both R1 and o1 generate a long chain of thought before answering; OpenAI hasn’t published o1’s training recipe, but DeepSeek credits its RL-heavy approach for much of the lower price.
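The RL algorithm DeepSeek describes in the R1 paper is GRPO (Group Relative Policy Optimization), which scores a group of sampled answers against each other instead of training a separate value network. Here’s a toy sketch of just the group-relative advantage computation; the reward scheme (1 for a correct final answer, 0 otherwise) mirrors the rule-based rewards the paper describes, and everything else is simplified for illustration.

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: each sampled answer is scored
    relative to the mean reward of its own group, so no separate
    value (critic) network is needed."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Four sampled answers to one math prompt; reward is rule-based:
# 1 if the final answer is correct, 0 otherwise.
print(grpo_advantages([1, 0, 0, 1]))  # [ 1. -1. -1.  1.]
```

Correct answers get a positive advantage and incorrect ones a negative advantage, so the policy update pushes the model toward the reasoning traces that actually reached the right answer.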

Keep in mind, though, that since R1 was developed in China, it’s regulated by China’s internet authorities to keep its answers in line with state-approved values. So it may decline to answer questions on some topics, like Tiananmen Square or Taiwan’s autonomy.

Despite these limitations, DeepSeek-R1 brings us a step closer to artificial general intelligence (AGI) by narrowing the gap between closed and open-source models.

DeepSeek-R1 isn’t just a technical achievement; it highlights the power of open-source AI at a time when most frontier models are proprietary. By offering a high-performance model at a range of sizes and price points, DeepSeek makes it easier for developers everywhere to get involved in the AI revolution.

As we continue down the road to AGI, models like DeepSeek-R1 show just how important collaboration and open access are to the future of the technology.