Google’s Whisk: An AI Tool That Responds to Image Prompts

Google Whisk is a new artificial intelligence (AI) tool that generates images from uploaded photos, with no text prompts required. It is designed as a creative exploration tool rather than a professional-grade image editor, offering quick, imaginative visual outputs that combine elements from user-uploaded images.

Whisk is powered by two major AI platforms:

Google’s Gemini AI platform (launched Dec 2023).
DeepMind’s Imagen 3, the latest text-to-image generator.

These platforms work together in two steps:

Gemini writes captions for the images the user uploads.
Imagen 3 uses these captions to generate new visuals.

The results can differ noticeably from the inputs, for example in height, hairstyle, or skin tone, because Whisk focuses on capturing the essence of the input rather than duplicating it.

The tool's features enable users to:

Merge subjects, settings, and styles from multiple photos into a single AI-generated image.
Remix the final image by adjusting inputs and mixing visual categories.
Choose output styles such as plush toys, enamel pins, and stickers.
Optionally add text prompts for more control.

According to Thomas Iljic, director of product management at Google Labs, these features mean that SEO experts and content creators can use Whisk to bring new visual ideas to life and enrich their content marketing efforts.

The technology is still in its early stages of development and is currently accessible in the United States through Google Labs. Despite its potential, Google continues to address concerns about generative AI, such as:

Image accuracy.
Public trust.

Both were called into question after Gemini's earlier text-to-image capabilities drew criticism in early 2024 for producing historically inaccurate images.

Despite these drawbacks and criticisms, competition keeps intensifying, fueled by OpenAI's recent introduction of its text-to-video generator, Sora.

Lastly, analysts see Whisk as a spur to innovation:

Analysts, including Dan Ives of Wedbush Securities, view Whisk AI as a calculated move to showcase Google’s strength in the AI race.
Ives called it another flex-the-muscles moment for Google.
Industry experts highlight it as proof of Google’s deep financial commitment to driving innovation in generative AI technologies.

Source: edition.cnn.com