Running Large Language Models Locally with LM Studio on Apple Silicon

Large language models don’t have to run on the servers of big companies like OpenAI (ChatGPT), Anthropic (Claude), or the recently introduced DeepSeek (R1). Smaller versions, created from the resource-hungry server models through a process called distillation, can also run locally. On Macs this is particularly easy with LM Studio, a free app designed for Apple Silicon machines.

Unlike more professionally oriented tools such as Ollama, LM Studio doesn’t require the command line. You can optionally run a local server, but it isn’t necessary: the app integrates model discovery, installation, and chat into a single interface. All the well-known open-source models are available, including Llama, DeepSeek, Qwen, and Mistral. Users can also choose variants optimized for Apple’s MLX format, which makes better use of the unified memory of Apple Silicon.
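If you do enable the local server, it speaks an OpenAI-compatible API, which makes it easy to script against from almost any language. The following is a minimal sketch in Python, assuming the server has been started in LM Studio and listens on its default port 1234 (the address is configurable in the app); the model identifier is taken from whatever the server reports:

```python
import requests

# LM Studio's optional local server exposes an OpenAI-compatible API.
# Default address assumed here: http://localhost:1234 (configurable in the app).
BASE_URL = "http://localhost:1234/v1"

# Ask the server which models it currently offers.
models = requests.get(f"{BASE_URL}/models").json()
model_id = models["data"][0]["id"]

# Send a simple chat request to the first available model.
response = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": model_id,
        "messages": [
            {"role": "user", "content": "Explain unified memory in one sentence."}
        ],
        "temperature": 0.7,
    },
)
print(response.json()["choices"][0]["message"]["content"])
```

Because the API mirrors OpenAI’s, existing client libraries can usually be pointed at the local server simply by changing the base URL.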

To run these models adequately, you need a computer with sufficient processing power, RAM, and storage. The downloads start at around 4 GB but can reach 40 GB or more. Output quality varies greatly, as a brief test showed: smaller models tend to hallucinate more than larger ones, and output speed differs as well. DeepSeek, even in its open-source version, includes censorship aligned with the Chinese government, so it may refuse to discuss events such as the Tiananmen Square massacre of 1989. However, there are modified models that bypass this restriction.

By far the best output in our test came from a large DeepSeek model, a distilled variant based on R1 with Llama 70B and quantization. It also shows its reasoning process (click on “Thinking”), revealing how the model arrives at its “thoughts.” The wait time was 20 to 40 seconds, during which the fan of our M3 machine often spun up. That isn’t necessarily the case with smaller models; this one alone was 40 GB.
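The 40 GB figure follows roughly from the parameter count and the quantization level. As a back-of-the-envelope sketch (the exact file size depends on the quantization scheme and format overhead, which this ignores):

```python
def approx_model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model, ignoring format overhead."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# A 70B-parameter model at different quantization levels:
for bits in (16, 8, 4.5, 4):
    print(f"{bits:>4} bits/weight -> ~{approx_model_size_gb(70, bits):.0f} GB")
```

At roughly 4.5 bits per weight, a 70B model lands near the 40 GB download mentioned above, while the full 16-bit weights would come to around 140 GB.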

When experimenting with models, ensure there is enough SSD space, as these models can quickly fill up storage. The models are stored in the “models” directory under “.lmstudio” in the user’s home directory but can be easily managed and deleted through LM Studio’s GUI. The app is also available for sufficiently fast x86 Windows computers, ARM Windows machines, and Linux.
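To check what is occupying that space without opening the app, you can walk the models directory yourself. A small sketch, assuming the default location ~/.lmstudio/models and a publisher/model folder layout (both may differ depending on version and settings):

```python
from pathlib import Path

# Default LM Studio model location; adjust if you changed it in the app.
models_dir = Path.home() / ".lmstudio" / "models"

# Sum file sizes per model folder (assumed publisher/model layout).
for entry in sorted(models_dir.glob("*/*")):
    if entry.is_dir():
        size_gb = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file()) / 1e9
        print(f"{entry.relative_to(models_dir)}: {size_gb:.1f} GB")
```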

Overall, LM Studio provides an accessible way to experiment with large language models locally, offering flexibility and control over which models run on your machine. With the right hardware, users can explore a wide variety of open-source models, making it a worthwhile entry point for anyone who wants to try language models without relying on cloud services.