Nvidia has introduced Cosmos, a new platform for world foundation models. The platform is intended for developing AI applications that understand physics, for use in robotics and autonomous vehicles. Nvidia also announced Nemotron, a family of language models designed for building AI agents for businesses. These agents can be used in customer support, fraud detection, or supply chain and inventory management.
Developing AI models that understand physics requires a large volume of training data. With Cosmos, developers can input text, images, and videos, as well as sensor and motion data, and receive physically accurate training videos, which can replace real-world tests. 3D scenarios built in Nvidia Omniverse can also be converted into videos. The company claims that Cosmos, running on Nvidia Blackwell, can process 20 million hours of video within two weeks.
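To put the claimed processing rate in perspective, the two quoted figures can be turned into an hourly throughput. The following is only a back-of-the-envelope sketch in Python: it takes Nvidia's numbers at face value, treats "two weeks" as exactly 14 days, and uses a hypothetical helper name, throughput, that does not come from any Nvidia tool.

```python
# Figures quoted in the article (Nvidia's claim, not a measurement).
VIDEO_HOURS = 20_000_000
WALL_CLOCK_DAYS = 14  # assumption: "two weeks" means exactly 14 days


def throughput(video_hours: float, days: float) -> dict:
    """Convert a total volume and a duration into per-day and per-hour rates."""
    wall_clock_hours = days * 24
    return {
        "video_hours_per_day": video_hours / days,
        "video_hours_per_wall_clock_hour": video_hours / wall_clock_hours,
    }


rates = throughput(VIDEO_HOURS, WALL_CLOCK_DAYS)
print(f"{rates['video_hours_per_day']:,.0f} hours of video per day")
print(f"{rates['video_hours_per_wall_clock_hour']:,.0f} hours of video per wall-clock hour")
```

Under these assumptions the claim works out to roughly 1.4 million hours of video per day, or close to 60,000 hours of video for every hour of wall-clock time.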
In addition to the large language model Llama Nemotron, the Nemotron family includes Cosmos Nemotron, a vision language model that draws on the recognition and analysis capabilities of the world model. Together, these models can be used for business applications. For instance, a warehouse can use cameras to capture its current inventory. An AI application then analyzes the images and compares the detected goods with stored records.
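The comparison step in the warehouse example can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not Nvidia's implementation: the image analysis is replaced by a plain list of labels that a vision model might return, and the function name compare_inventory, the item names, and the record format are all hypothetical.

```python
from collections import Counter


def compare_inventory(detected_labels, stored_records):
    """Return {item: (recorded, counted)} for every item whose counts differ.

    detected_labels: one label per detected object (stand-in for model output).
    stored_records:  mapping of item name to the quantity on record.
    """
    counted = Counter(detected_labels)
    discrepancies = {}
    # Check every item that appears in the camera data or in the records.
    for item in sorted(set(counted) | set(stored_records)):
        recorded = stored_records.get(item, 0)
        seen = counted.get(item, 0)
        if recorded != seen:
            discrepancies[item] = (recorded, seen)
    return discrepancies


# Hypothetical data: labels from the image analysis and the stored records.
detected = ["pallet_a", "pallet_a", "crate_b", "drum_c"]
records = {"pallet_a": 2, "crate_b": 3, "drum_d": 1}

result = compare_inventory(detected, records)
for item, (recorded, seen) in result.items():
    print(f"{item}: recorded {recorded}, counted {seen}")
```

Items that match the records are omitted from the result, so only the discrepancies need to be reviewed.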
Nvidia also introduced new blueprints for AI agents, developed in collaboration with partners. These templates cover commonly needed functions, so developers can create business-specific AI applications without having to build the basic functionality themselves. With these blueprints, AI agents can be created to comment on code, structure repositories, or perform automated web searches.
One template developed by Nvidia converts PDF content into podcasts. An AI agent built with this template can summarize text, tables, and images from PDF files and present them to users as a monologue or a conversation. Nvidia promises that this lets users absorb information more efficiently and at their own pace. Developers can run these blueprints with preconfigured settings on devices, in data centers, or in the cloud.
Nvidia offers the AI models in three sizes ranging from 4 to 14 billion parameters. The smallest, Nano, is intended for PCs and other devices, while Ultra targets data centers. The Cosmos and Nemotron models are available under the Nvidia Open Model License, which allows commercial use, but they are not open source like NVLM. Companies will also be able to obtain the models through the Nvidia AI Enterprise platform and as part of the NIM microservices. Some of these are already available as previews.