Nvidia, the world’s most valuable company and the leader in chips used for artificial intelligence, has developed a new AI model called Fugatto. The model can create music and other audio from a text prompt. It can also modify existing audio recordings and generate entirely new sounds. This innovation is likely to attract significant commercial interest, although Nvidia has stated there are no immediate plans to release the technology.
Fugatto stands out from other audio AI models due to its ability to transform audio in unique ways. For example, it can convert a melody played on a piano into one sung by a human voice. It can also modify a voice recording to change the accent or the expressed emotion. This capability highlights the potential for new creative possibilities in music, video games, and other areas where people want to create something unique.
The model was trained using open-source data, and Nvidia is still considering how best to make it accessible. According to Bryan Catanzaro, Nvidia’s Vice President for Applied Deep Learning Research, every generative technology carries risks, as it could be used to produce undesirable content. Nvidia is therefore proceeding cautiously and has no immediate plans to release Fugatto.
Fugatto’s ability to alter voices raises concerns about potential misuse, such as creating deepfakes, and debate over the issue is ongoing. For instance, Scarlett Johansson accused OpenAI of imitating her voice with AI. OpenAI is currently negotiating with Hollywood studios about the use of AI in the entertainment industry, but artists have expressed concerns.
Neither OpenAI nor Meta has announced when it plans to make its audio- or video-generating models available to the public. The potential for misuse, particularly in creating deepfakes, is a significant concern that these companies are likely considering.
While the technology holds exciting possibilities, the ethical implications and potential for misuse are important factors that companies like Nvidia, OpenAI, and Meta must address before releasing such powerful tools to the public.