OpenAI is making strides in scientific research, particularly in protein engineering. The company has developed a language model aimed at designing new proteins that can turn ordinary cells into stem cells. The initiative is OpenAI’s first venture into biological data and, the company says, its first model to produce new scientific results.
The project began when Retro Biosciences, a company focused on longevity research, partnered with OpenAI. Sam Altman, OpenAI’s CEO, has invested $180 million in Retro, which aims to extend the human lifespan by at least ten years. Retro studies Yamanaka factors, proteins that can reprogram human skin cells into youthful stem cells capable of producing any tissue in the body.
Reprogramming cells in this way is seen as a potential starting point for rejuvenating livestock, building human organs, or supplying replacement cells. The process is inefficient, however: it takes several weeks, and fewer than one percent of treated cells complete the transformation.
OpenAI’s new model, GPT-4b micro, was trained to propose redesigns of protein factors that enhance their function. Researchers used the model’s suggestions to modify two of the Yamanaka factors, making them more than 50 times as effective, according to preliminary measurements. If those results hold up, they suggest the model can outperform human scientists at this kind of protein design.
John Hallman, an OpenAI researcher, noted that while the model is promising, its results still need to be verified by outside scientists. For now, the model is a custom-built demonstration and is not available for broader use.
The model differs from Google DeepMind’s AlphaFold, which predicts protein structures: Yamanaka factors are unstructured proteins, so they require a different approach. OpenAI’s model was trained on protein sequences from many species, along with information about which proteins interact with one another, making it a “small language model” built on a targeted dataset.
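OpenAI has not described how GPT-4b micro actually encodes this data, but the basic idea of treating a protein’s amino-acid sequence as text a language model can read is simple to sketch. The example below is purely illustrative, with an assumed one-letter vocabulary and a made-up sequence fragment, and does not reflect OpenAI’s actual tokenizer or training setup:

```python
# Illustrative only: OpenAI has not described how GPT-4b micro encodes its
# training data. This sketch just shows the general idea of treating a
# protein's amino-acid string like text a language model can consume.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS)}  # letter -> token id

def encode(sequence: str) -> list[int]:
    """Map each amino-acid letter in a sequence to an integer token id."""
    return [VOCAB[aa] for aa in sequence]

# A short made-up fragment stands in for a real protein sequence.
print(encode("MSKGEELFTG"))  # [10, 15, 8, 5, 3, 3, 9, 4, 16, 5]
```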
Retro scientists used the model to propose redesigns of the Yamanaka proteins, employing a prompting technique similar to the “few-shot” method used with chatbots, in which the model is shown a handful of examples before being asked to produce a new one. Genetic engineers can guide the evolution of molecules in the lab, but they can test only a limited number of possibilities, while the space of possible variants is vast: a protein is a chain of hundreds of amino acids, and each position can hold any of 20 of them, so even a chain of just 100 amino acids has far more possible sequences than there are atoms in the observable universe.
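Neither the model nor its prompting interface is public, so the following is only a rough sketch of what a few-shot prompt for protein redesign could look like; the prompt wording, sequences, and activity numbers are invented placeholders rather than anything from OpenAI or Retro:

```python
# Hypothetical sketch: GPT-4b micro and its prompting interface are not
# public, so this prompt format, the sequences, and the activity numbers
# are invented placeholders, not OpenAI's or Retro's actual data.

# (sequence fragment, measured activity) pairs a researcher might already have.
examples = [
    ("MAGHLASDFAFSPPPGGGG", 1.0),  # placeholder "wild-type" fragment
    ("MAGHLVSDFAFSPPPGGGG", 2.5),  # placeholder variant with higher activity
]

def build_few_shot_prompt(examples, target):
    """Assemble a text prompt that shows prior sequence/activity pairs,
    then asks the model to propose a more active redesign of the target."""
    lines = ["Each line pairs a protein sequence with its measured activity."]
    for seq, activity in examples:
        lines.append(f"sequence: {seq}  activity: {activity}")
    lines.append(f"Propose a variant of {target} with higher activity:")
    return "\n".join(lines)

print(build_few_shot_prompt(examples, "MAGHLASDFAFSPPPGGGG"))
```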
The model often suggests changes to a third of the amino acids in proteins. Joe Betts-Lacroix, CEO of Retro, stated that the model’s ideas led to improvements in Yamanaka factors. Vadim Gladyshev, a longevity researcher at Harvard University, emphasized the need for better stem cell production methods.
How GPT-4b micro arrives at its predictions remains unclear, a common challenge with AI models. Betts-Lacroix compared it to AlphaGo, which beat the best human Go players before researchers understood why its moves worked. OpenAI is still exploring the model’s full potential.
No money changed hands in the OpenAI-Retro collaboration, but the partnership could draw criticism because of Altman’s involvement. His investments in private tech startups have raised concerns about potential conflicts of interest, since some of those companies also collaborate with OpenAI.
Retro’s connection to Altman, OpenAI, and the pursuit of AGI could enhance its profile, aiding in recruitment and fundraising. Betts-Lacroix did not comment on Retro’s current fundraising status. OpenAI emphasizes that Altman is not directly involved in the work and that the company does not base decisions on his other investments.