Small model, big performance: The Large Language Model Mistral 7B

The large language model (LLM) Mistral 7B is a product of the French AI start-up Mistral AI. Despite its relatively small size of 7.3 billion parameters, it beat larger models in a series of tests. The model is fully open source under the Apache 2.0 license, and it drew more attention in the AI scene after the company received an additional $113 million in funding in June. Notable investors, including former Google CEO Eric Schmidt and the venture capital firm Innovation Endeavors, are backing the project.

The model’s architecture supports a variety of applications. Although Mistral 7B is a comparatively small model, it is fast, which makes it well suited to text summarization, classification, and text and code completion. In addition, Mistral 7B can be customized to a user’s specific requirements.
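As a rough illustration of the summarization use case, the following Python sketch frames summarization as plain text completion. It assumes the Hugging Face transformers library and the base model id mistralai/Mistral-7B-v0.1; the prompt template and the helper names (build_summary_prompt, load_generator, summarize) are hypothetical and not part of any Mistral API. Loading the model needs enough memory for the weights, so the import is deferred until the model is actually requested.

```python
from typing import Callable

# Assumption: the base model id as published on Hugging Face.
MODEL_ID = "mistralai/Mistral-7B-v0.1"


def build_summary_prompt(text: str) -> str:
    """Frame summarization as text completion (hypothetical template)."""
    return f"Text:\n{text.strip()}\n\nSummary in one sentence:"


def load_generator(model_id: str = MODEL_ID) -> Callable[[str], str]:
    """Return a function that completes a prompt with the model.

    Needs the `transformers` and `torch` packages; imported lazily so the
    helpers in this file can be used and tested without them.
    """
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_id)

    def generate(prompt: str) -> str:
        # Greedy decoding; return only the newly generated text.
        out = pipe(
            prompt,
            max_new_tokens=64,
            do_sample=False,
            return_full_text=False,
        )
        return out[0]["generated_text"].strip()

    return generate


def summarize(text: str, generate: Callable[[str], str]) -> str:
    """Summarize `text` with any prompt-to-completion function."""
    return generate(build_summary_prompt(text))
```

With the packages installed, summarize(article_text, load_generator()) would run the real model. Because summarize accepts any prompt-to-text function, a stub can stand in for the model while testing.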

Mistral AI's developers claim that Mistral 7B outperforms the Llama 2 13B model, which is nearly twice its size, on all benchmarks. They also report that it performs on a similar level to the much larger Llama 1 34B model and beats it on some metrics.
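To put the size comparison in numbers, here is a small back-of-the-envelope calculation in Python. The 7.3 billion figure comes from this article; the 13 and 34 billion figures are simply the nominal sizes in the model names, and 2 bytes per parameter assumes 16-bit weights. It covers the weights only, not activations or other runtime memory.

```python
GB = 1e9  # decimal gigabyte

# Parameter counts: Mistral 7B from the article, the others are the
# nominal sizes implied by the model names (an assumption).
PARAMS = {
    "Mistral 7B": 7.3e9,
    "Llama 2 13B": 13e9,
    "Llama 1 34B": 34e9,
}


def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed for the weights alone (16-bit by default)."""
    return n_params * bytes_per_param / GB


def size_ratio(larger: str, smaller: str) -> float:
    """How many times more parameters `larger` has than `smaller`."""
    return PARAMS[larger] / PARAMS[smaller]


if __name__ == "__main__":
    for name, n in PARAMS.items():
        print(f"{name}: about {weight_memory_gb(n):.1f} GB of 16-bit weights")
    ratio = size_ratio("Llama 2 13B", "Mistral 7B")
    print(f"Llama 2 13B has {ratio:.2f}x the parameters of Mistral 7B")
```

Under these assumptions, Mistral 7B needs roughly 14.6 GB for its weights versus about 26 GB for Llama 2 13B, and the 13B model has about 1.8 times as many parameters.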

Although the model is openly accessible, details of the training procedure and the dataset used have not been made clear. The model also has no built-in moderation, so it follows no guidelines about what output it generates for the user or which topics it avoids, in contrast to the ChatGPT approach.

The debut of Mistral 7B was a notable success for Mistral AI in the world of artificial intelligence. The model has attracted attention not only for its performance, but also for its openness and its use of innovative techniques. The developers have announced plans for larger models with enhanced capabilities and broader language support. X user abhi1thakur has published an example fine-tuning routine on Google Colab. Mistral models are available on Hugging Face, and more information is provided in the model blog post. The journey of Mistral AI and Mistral 7B has just begun, and it remains to be seen what further developments will follow in this field.