Name: DeepSeek-R1
Author: DeepSeek

DeepSeek-R1 is DeepSeek's flagship reasoning model, released in January 2025. It has 671B total parameters in a Mixture-of-Experts architecture, with 37B activated per token. Its precursor, DeepSeek-R1-Zero, was trained via large-scale reinforcement learning without supervised fine-tuning as a preliminary step; DeepSeek-R1 adds a small amount of cold-start data before reinforcement learning. The model demonstrates self-verification, reflection, and long chain-of-thought reasoning, with performance comparable to OpenAI o1 across math, code, and reasoning tasks. It achieves 79.8% on AIME 2024 and 97.3% on MATH-500 (pass@1). The release includes six distilled models from 1.5B to 70B parameters, based on Qwen2.5 and Llama3. It is released under the MIT license, which makes its reasoning capabilities widely accessible. Maximum output length is 20K tokens. It represents a major step forward for open-source reasoning models.
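
To make the Mixture-of-Experts figures concrete, the short sketch below works out what share of the model's parameters is active for each token. It uses only the two parameter counts quoted above; the function name `active_fraction` is a hypothetical helper introduced for illustration, and the result is a rough ratio, not a measure of compute or memory cost.

```python
# Parameter counts quoted in the description above, in billions.
TOTAL_PARAMS_B = 671   # total parameters
ACTIVE_PARAMS_B = 37   # parameters activated per token


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters activated per token (hypothetical helper)."""
    if total_b <= 0:
        raise ValueError("total parameter count must be positive")
    return active_b / total_b


frac = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
print(f"Active per token: {frac:.1%} of all parameters")
```

Roughly one parameter in eighteen takes part in any single forward step, which is why the per-token cost is closer to that of a 37B dense model even though all 671B parameters must still be stored.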