Google's Whisk: Simplifying AI Image Creation with Combined Elements

Google has introduced a new AI tool called Whisk, which lets users create images by combining multiple pictures. It simplifies image generation by eliminating the need for lengthy prompts. True to its name (to whisk means to mix or whip together), Whisk lets users upload images or provide prompts to define three main elements: the subject, the scene, and the style of the image.

To use Google Whisk, users outside certain countries may need to use a VPN. Initially, the tool offers a tutorial where users can upload an image to create a plush toy. For example, using the t3n logo, Whisk generated a bright red plush toy with button eyes. After this initial step, users can access the editor to explore all of Whisk’s features.

Whisk allows users to customize images by adjusting the subject, scene, and style. These elements can be modified, downloaded, or regenerated if the results are unsatisfactory. It’s important to note that all inputs must be in English, even if the interface appears in another language.

In one instance, we decided to place the plush toy in an office setting. The AI initially generated an image of a businessman at a desk, but by replacing the businessman with the plush toy, we achieved the desired result. In that attempt, the style was left to Whisk’s discretion. For a seasonal touch, we then described a scene with Santa Claus in front of a Christmas tree, surrounded by cookies, milk, and gifts, and chose an 80s comic style with bright colors and bold outlines.

While experimenting with Whisk, we noted some issues. Because the tool is still in its alpha phase, such problems are to be expected. One major issue is consistency between images: the plush toy’s appearance changed significantly within a short time. Whisk also struggled with text generation after multiple prompts, likely due to the increasing complexity of the inputs. However, restarting the process or reminding the AI of the correct spelling improved results.

Other typical AI issues include extra or merged limbs in human images and misplaced objects. For instance, a slice of pizza appeared in a Christmas scene. Using the optimization feature, we added a Santa hat and corrected the t3n text.

Despite these challenges, Whisk offers an intriguing and enjoyable concept. Users can achieve good results quickly, even without prompt expertise, and those who experiment with Whisk’s prompts can obtain more precise outcomes. We hope Whisk will become accessible to users worldwide without requiring workarounds like VPNs.

Overall, Google Whisk is an exciting development in AI image generation, offering a fun and creative way to produce unique images by blending different elements.
