AI Advancements: Emotion Recognition, Weather Forecasting, and Global Collaborations

Google’s PaliGemma 2 is a new AI model that can recognize emotions, an ability that can be improved through fine-tuning. The model processes both text and images and goes beyond simple object recognition to describe actions, emotions, and the narrative of a scene. Because PaliGemma 2 is freely accessible, experts are concerned about how easy it makes emotion recognition, which European AI regulation prohibits in many areas, with exceptions for specific use cases. Facial and emotion recognition is error-prone and suspected of bias, particularly against people with darker skin. Emotion recognition from voice is also possible but has limitations. It remains to be seen whether PaliGemma 2’s emotion recognition capability will lead to stricter scrutiny under AI regulations.

Google DeepMind has developed an AI-based weather forecasting model that outperforms the current leading technology. The high-resolution model, GenCast, can predict weather up to 15 days ahead more accurately than the forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF). GenCast was trained on four decades of weather data from the ECMWF archive, learning “global weather patterns.” Tested on 2019 data, GenCast’s forecasts were better than the best ECMWF model in over 97% of cases. A forecast requires only one of Google’s fifth-generation Tensor Processing Units (TPUs) and takes about 8 minutes, compared with hours on a supercomputer. DeepMind plans to release real-time forecasts soon, which could then be integrated into other predictions.

Meta has released Llama 3.3, a new version of its large language model. Llama 3.3 is designed to be simpler and more cost-efficient. In comparisons with Amazon’s Nova Pro, Google’s Gemini 1.5 Pro, and OpenAI’s GPT-4o, Llama 3.3 excelled in the “Instruction Following” and “Long Context” categories. It also performed well on a multilingual math dataset, solving 91.1% of tasks. In some areas it scored slightly below its predecessors, probably a trade-off for lower operating costs. Mark Zuckerberg expects the upcoming Llama 4, anticipated next year, to need ten times as much computing power.

OpenAI has announced twelve days of new releases, starting with a new, more expensive subscription, ChatGPT Pro, at $200 per month. It offers unlimited access to OpenAI’s latest AI model. OpenAI also introduced a new fine-tuning method called Reinforcement Fine-Tuning (RFT), which is meant to let models develop new “thinking patterns.” Unlike previous methods, RFT gives the model time to work through a problem; reasoning that leads to a correct answer is reinforced, and flawed reasoning is weakened. According to OpenAI, the method is well suited to fields like law, finance, engineering, and insurance. Organizations can apply to the RFT Research Program, which provides access to the RFT API in exchange for feedback before the public release. OpenAI plans to make RFT generally available in early 2025.
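The reinforce-and-weaken idea can be illustrated with a toy loop. This is only a sketch of the general principle, not OpenAI’s method or the RFT API: the strategy names, the grader, and the multiplicative update are all assumptions made for illustration. Each round, one strategy is sampled in proportion to its weight, its attempt is graded, and its weight is raised or lowered accordingly.

```python
import random


def reinforce(weights, grader, rounds=200, step=0.1, seed=0):
    """Toy loop: sample a strategy, grade its attempt, nudge its weight."""
    rng = random.Random(seed)      # seeded so the run is repeatable
    weights = dict(weights)        # work on a copy, leave the input untouched
    names = sorted(weights)
    for _ in range(rounds):
        # Sample one strategy in proportion to its current weight.
        choice = rng.choices(names, weights=[weights[n] for n in names])[0]
        if grader(choice):
            weights[choice] *= 1 + step   # reinforce a successful attempt
        else:
            weights[choice] *= 1 - step   # weaken a flawed attempt
    return weights


# Hypothetical strategies and grader, invented for this illustration.
STRATEGIES = {"check_units_first": 1.0, "guess_quickly": 1.0}


def toy_grader(strategy):
    """Pretend that only the careful strategy produces correct answers."""
    return strategy == "check_units_first"


final = reinforce(STRATEGIES, toy_grader)
print(final)
```

In this toy setting the careful strategy ends up with the larger weight, so it is sampled more and more often. A real system would adjust the parameters of a language model rather than a table of two numbers.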

OpenAI’s Chad Nelson has presented a new version of the Sora video generator, capable of creating videos up to one minute long. Sora 2 will offer three generation modes: text to video, text and image to video, and text and video to video. The release is reportedly imminent, and an API leak appears to confirm rumors of a faster, more efficient version. Announcements are expected soon as part of OpenAI’s Winter Promotion, with GPT-4.5 also mentioned in recent data leaks.

The Canadian government is investing two billion Canadian dollars in national AI infrastructure. Up to 700 million dollars will go towards new data centers, one billion towards public supercomputing infrastructure, and 300 million towards helping small and medium-sized enterprises access computing power. Canada employs over 140,000 AI professionals and hosts ten percent of the world’s leading AI researchers. In 2022, over 8.6 billion dollars in venture capital went into Canada’s AI sector, 30% of all venture capital invested in the country. The new program is set to launch in spring 2025.

RTL Germany and Perplexity AI have formed a partnership. Initially, brands like ntv and stern will be integrated into Perplexity’s search engine, with more RTL products to follow. AI applications will also appear on the brands’ websites to make content more accessible. RTL mentions “new business models” but provides no specifics. The partnership includes research and development projects, and RTL Germany staff will get access to Perplexity’s Enterprise Pro Program for improved AI search capabilities. Financial details of the deal are undisclosed, and it is unclear whether Perplexity will prioritize partner content for other users.

Police in the German state of Hesse plan to expand their use of AI in video surveillance. A draft law proposes using intelligent image analysis software to monitor public spaces. If movement patterns point to a serious crime or raise suspicion that someone is carrying a weapon, police may mark the suspicious individuals in the video footage, subject to review by a police officer. Under certain conditions, the law also allows searching for missing persons and for victims of kidnapping or trafficking. The proposal is under discussion in the state parliament and is expected to pass.
