Aleph Alpha’s New AI Architecture Enhances Language Model Efficiency and Adaptability

Large language models (LLMs) can be fine-tuned for different needs, but fine-tuning often yields unsatisfactory results when the target is a new language or highly specialized industry knowledge. Aleph Alpha, a startup from Heidelberg, has developed a new AI architecture to address this issue and is working on it with AMD, SiloAI, and Schwarz Digits.

LLMs learn patterns during training from a tokenized version of the training texts: the texts are broken down into tokens, their structure is analyzed, and probabilities are derived from it. Once training is complete, an LLM can only be adapted further through fine-tuning, which builds upon the existing model. Problems arise when the text used for fine-tuning deviates significantly from the text the LLM was originally trained on. Aleph Alpha notes that such text “cannot be efficiently tokenized.”
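To make the tokenization problem concrete, here is a minimal sketch in Python. It is not Aleph Alpha's code or any production tokenizer: the greedy longest-match scheme, the tiny `VOCAB` set, and the function names are all assumptions chosen for illustration. It shows how a vocabulary built from one kind of text splits unfamiliar text into many more pieces per word.

```python
def greedy_tokenize(text, vocab):
    """Greedy longest-match subword tokenization (toy version).

    Characters not covered by any vocabulary entry fall back to
    single-character tokens.
    """
    tokens = []
    max_len = max(len(entry) for entry in vocab)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens


# Hypothetical vocabulary, as if it had been learned from English text only.
VOCAB = {"the", "model", "learn", "s", "pattern", "ing", "train", " "}


def tokens_per_word(text, vocab=VOCAB):
    """Average number of non-space tokens needed per whitespace-separated word."""
    words = text.split()
    tokens = [t for t in greedy_tokenize(text, vocab) if t != " "]
    return len(tokens) / len(words)


english = "the model learns patterns"   # close to the vocabulary's source text
finnish = "malli oppii kuvioita"        # illustrative phrase the vocabulary never saw

print(tokens_per_word(english))  # 1.5
print(tokens_per_word(finnish))  # 6.0
```

In this toy setup the unfamiliar phrase needs four times as many tokens per word, which means longer sequences and less meaningful units for the model to work with during fine-tuning.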

Aleph Alpha proposes a new tokenizer-free architecture to solve this issue. The architecture is hierarchical and combines processing at the character level and the word level. According to a published paper, it “uses a lightweight encoder at the character level to convert character sequences into word embeddings, which are then processed by a backbone model at the word level and decoded back into characters by a compact decoder at the character level.”
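The data flow described in the quote can be sketched as follows. This is a structural illustration only, not the published model. All three stages are untrained, lossless stand-ins, and the names `split_words`, `encode_word`, `backbone`, `decode_word`, and `MAX_WORD_CHARS` are hypothetical. A real system would use learned neural networks for each stage, and its backbone would predict the next word rather than pass the input through unchanged.

```python
import re

MAX_WORD_CHARS = 16  # hypothetical fixed width of a "word embedding" in this toy


def split_words(text):
    """Split text into word units, keeping trailing whitespace with each word."""
    return re.findall(r"\S+\s*|\s+", text)


def encode_word(word):
    """Character-level encoder stand-in: one word's characters -> one fixed-size vector."""
    if len(word) > MAX_WORD_CHARS:
        raise ValueError("word too long for this toy encoder")
    codes = [float(ord(ch)) for ch in word]
    return codes + [0.0] * (MAX_WORD_CHARS - len(codes))  # zero-pad to fixed width


def backbone(word_vectors):
    """Word-level backbone stand-in: sees one vector per word.

    Identity here (an assumption for the sketch); this is where the large
    trained model would sit.
    """
    return [list(vec) for vec in word_vectors]


def decode_word(vector):
    """Character-level decoder stand-in: one word vector -> its characters."""
    return "".join(chr(int(x)) for x in vector if x != 0.0)


def run_pipeline(text):
    """Characters -> word vectors -> backbone -> characters."""
    words = split_words(text)
    embeddings = [encode_word(w) for w in words]
    hidden = backbone(embeddings)
    decoded = "".join(decode_word(h) for h in hidden)
    return decoded, len(words)


text = "Hyvää huomenta Suomi"
decoded, backbone_positions = run_pipeline(text)
print(decoded == text)                 # True
print(len(text), backbone_positions)   # 20 3
```

The point of the sketch is the sequence length: the backbone handles 3 word-level positions instead of 20 characters, and no fixed subword vocabulary is involved, so characters outside the original training alphabet (such as "ä") pass through the same path as any others.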

According to Aleph Alpha, this approach allows the creation of “sovereign models for various alphabets, less common languages, and highly specific industry knowledge.” The company describes this as a breakthrough. Previously, successful fine-tuning required a large amount of data, and for many languages not enough data is available to achieve good results that way. The new architecture is said to be significantly more efficient, conserving computing power and resources.

Aleph Alpha is collaborating with AMD and with SiloAI, a Finnish startup that AMD acquired in the summer. According to a press release, “this new, innovative AI model architecture enables a reduction in training costs and CO₂ footprint by 70 percent for languages like Finnish compared to alternative options.” AMD states that the collaboration strengthens the European AI ecosystem.

The offering is initially aimed at European public authorities, a customer group Aleph Alpha has been targeting for some time. Its AI operating system for authorities is called Pharia. The initiative is also supported by data centers from Stackit, the cloud solution of Schwarz Digits. Schwarz Digits is the IT and digital division of the Schwarz Group, which includes Lidl and Kaufland.