“`html
Language Models and AI Development
Large language models (LLMs) and today’s generative AI models are built on an architecture that has been known for years. Most improvements have come from scaling, which chiefly means training larger models on more data. Some AI experts hoped that scaling alone could eventually lead to artificial general intelligence (AGI). However, even among proponents of scaling, doubts are growing that reaching AGI will be that simple.
Insiders have often been well informed about what is happening at OpenAI. According to recent reports, OpenAI’s next AI model, Orion, does not represent a significant improvement. Orion had often been seen as an important step toward AGI. Similarly, Anthropic has indefinitely postponed the release of what was planned to be its most powerful model: Claude 3.5 Opus was expected this year, but it went unmentioned in recent announcements of other models.
According to Reuters, Ilya Sutskever, a co-founder of OpenAI, believes a plateau has been reached. “The 2010s were the years of scaling,” he said, adding that we are now in an era of wonder and discovery, and that it is more important than ever to scale the right things. Sutskever left OpenAI this summer and founded his own AI lab, Safe Superintelligence.
What Comes After Scaling?
Some scientists believe the problem is a shortage of available training data, while others think scaling is the wrong approach altogether. There is also debate about whether synthetic data, which is generated by AI models themselves, is a useful supplement to training data. Some studies suggest that, unless such data is sufficiently processed, so-called model collapse can occur: models trained largely on the output of earlier models see increasingly uniform data and eventually stop producing meaningful output.
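The collapse effect can be illustrated with a toy simulation. The following Python sketch is not taken from any of the studies mentioned; it assumes a deliberately simple "model" (a Gaussian fitted to a handful of numbers), and the function name and parameters are hypothetical. Each generation is trained only on samples drawn from the previous generation's fit, and the spread of the data shrinks over time.

```python
import random
import statistics


def simulate_collapse(generations=300, sample_size=10, seed=0):
    """Toy illustration: repeatedly refit a Gaussian to its own samples.

    Returns the fitted standard deviation of each generation.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Generation 0: "real" data drawn from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(generations):
        # "Train" the model: estimate mean and spread from the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        spreads.append(sigma)
        # The next generation sees only synthetic samples from the fitted model.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return spreads


if __name__ == "__main__":
    spreads = simulate_collapse()
    print(f"spread in first generation: {spreads[0]:.4f}")
    print(f"spread in last generation:  {spreads[-1]:.3e}")
```

With small samples, each refit loses a little of the original variation, so the data grows more and more uniform. Real language models are far more complex, but the sketch shows why unfiltered synthetic data is considered risky.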
Other approaches include model distillation, in which the knowledge of a large model is transferred to a smaller one, with the aim of making AI more cost-effective and resource-efficient. At the same time, major AI providers are specializing their models for specific tasks, including the development of AI agents that can handle those tasks autonomously.
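One common way to express distillation is a loss that pushes the small model's output distribution toward the large model's. The Python sketch below assumes the classic soft-target formulation (a temperature-softened softmax compared via KL divergence); the article does not say which method any provider uses, and the function names and example logits are hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_sum for z in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the softened teacher to the softened student.

    The temperature**2 factor is the usual rescaling that keeps the
    loss magnitude comparable across temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (large model)
    log_q = log_softmax(student_logits, temperature)  # student (small model)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]        # hypothetical teacher logits
    student_near = [3.5, 1.2, 0.4]   # student that roughly agrees
    student_far = [0.5, 1.0, 4.0]    # student that disagrees
    print(distillation_loss(student_near, teacher))
    print(distillation_loss(student_far, teacher))
```

A student whose predictions resemble the teacher's gets a loss near zero, while a student that disagrees is penalized more heavily. In training, this loss would be minimized over many examples, typically combined with the ordinary loss on the true labels.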
“`