Exploring Deepseek’s Smaller AI Models and Their Capabilities


The Deepseek R1 AI model has created quite a stir, not only because it works more efficiently than comparable models but also because the Chinese startup released it under an open-source license. In theory, anyone can use the model; in practice, however, few individuals have hardware that can run it at a reasonable speed.

Deepseek also released a series of smaller models alongside the R1 model. The company aims to demonstrate how “reasoning,” the model’s ability to draw conclusions from given information, can be transferred to smaller models. These smaller models are not variants of R1. Instead, Deepseek used a process called distillation, where the capabilities of a larger, complex model are transferred to a simpler model.
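The general idea can be sketched in code. In the classic formulation of distillation, the student model is trained to imitate the teacher's softened output distribution. (DeepSeek's actual recipe, fine-tuning the smaller models on samples generated by R1, differs in detail; this is only an illustrative sketch of the underlying principle.)

```python
import math

def softmax(logits, temperature=1.0):
    # Softened distribution: a higher temperature spreads probability mass,
    # exposing more of the teacher's "dark knowledge" about wrong answers.
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the teacher's softened distribution and the
    # student's: minimizing it trains the student to imitate the teacher.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Identical outputs -> zero loss; diverging outputs -> positive loss.
print(round(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]), 6))  # 0.0
```

The loss is zero only when the student reproduces the teacher's distribution exactly; in training, it is minimized alongside the usual task loss.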

In the case of the smaller Deepseek models, R1 was used to transfer its capabilities to different-sized variants of Meta’s Llama 3.1 and of Alibaba’s Qwen2.5-Math, a model optimized for math tasks. This method has its limits, however: a model with seven billion parameters will not deliver the same quality as one with over 600 billion. Despite efficiency improvements, the rule “more is better” still applies.

With tools like LM Studio, you can easily run the small Deepseek models locally on your computer. There are various ways to execute language models on a home machine. A popular option is Ollama, a command-line tool without a graphical interface. Ollama is ideal for developers who want to process AI-generated output programmatically, but for anyone without terminal experience we suggest a simpler method.

LM Studio is available for free on Macs with Apple Silicon, Windows, and Linux. After installation, LM Studio will ask which language model you want to download. Deepseek has released six distilled model variants. Your hardware might not be strong enough for the larger ones, but LM Studio will warn you if that’s the case.

We focus on the medium model variants DeepSeek-R1-Distill-Qwen-7B and DeepSeek-R1-Distill-Llama-8B. You can find them through the model search in LM Studio, accessible via the magnifying glass icon in the right navigation bar.
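Beyond the chat window, LM Studio can also expose a local server with an OpenAI-compatible API, which lets you query a downloaded model from your own scripts. A minimal sketch of such a request, assuming the default port 1234 and a model identifier like the one below (check the server tab in your LM Studio installation for the exact name):

```python
import json

def build_chat_request(prompt, model="deepseek-r1-distill-qwen-7b"):
    # OpenAI-style chat-completion payload; the model name is an assumption
    # and must match the identifier shown in LM Studio's server view.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
    }

payload = build_chat_request("What is 17 * 24?")
print(json.dumps(payload, indent=2))

# To actually send it, POST the JSON to the running local server, e.g.:
#   http://localhost:1234/v1/chat/completions
# with the header "Content-Type: application/json".
```

The response follows the familiar OpenAI schema, so existing client code usually works against the local endpoint with only the base URL changed.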

It is impressive how these models demonstrate their reasoning abilities with moderate CPU and memory usage, even on a MacBook Air with an M1 chip. In LM Studio, the model’s chain of thought is hidden by default. To view it, click on the “Thoughts” field or go to “Appearance” and activate the “Expand reasoning blocks by default” slider.
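If you process the model's raw output yourself, you can separate the reasoning from the final answer with a few lines of code. This sketch assumes the R1-style convention of wrapping the chain of thought in `<think>…</think>` tags, which is how such reasoning blocks are commonly delimited:

```python
import re

def split_reasoning(text):
    # Split an R1-style response into (reasoning, answer). If no <think>
    # block is present, the whole text is treated as the answer.
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>2 + 2 is basic addition.</think>The answer is 4."
reasoning, answer = split_reasoning(sample)
print(answer)  # The answer is 4.
```

This mirrors what LM Studio's collapsible "Thoughts" view does in the interface: the tagged block is the model's reasoning, and everything after it is the answer meant for the user.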

While Deepseek’s website and app comply with Chinese censorship rules, you don’t have to worry about this with self-hosted Deepseek models. Both DeepSeek-R1-Distill-Qwen-7B and DeepSeek-R1-Distill-Llama-8B correctly answer questions about the Tian’anmen Square incident.

Both models reliably handle math and coding tasks, although their results don’t match the quality of large cloud models. This is also evident when dealing with different languages. You can write prompts in German, but sometimes English mixes into the responses.

Due to the models’ small size, quality compromises are still necessary, and this won’t change soon. However, the models show progress in this area. Notably, DeepSeek-R1-Distill-Qwen-7B surpasses OpenAI’s o1-mini in some math benchmarks.