OpenAI launched Advanced Voice Mode, Midjourney 6.1 is even more realistic, and open-source FLUX.1 is making a splash in media content generation — the top 3 AI news stories of the week
Our latest AI Digest covers the biggest AI news of the week. Anywhere Club community leader Viktar Shalenchanka comments on the key stories.
#1 — OpenAI finds its voice
OpenAI has begun rolling out GPT-4o’s Advanced Voice Mode to a limited group of ChatGPT Plus users. Access will expand to all Plus users in the fall. The launched version matches the demo closely: it can sigh, reproduce sounds, hold a conversation with ease, gauge the emotional state of its conversational partner, and take those emotions into account when formulating its responses. A realistic and easily accessible voice mode could increase interest in and engagement with this technology.
#2 — Midjourney 6.1 is even more realistic
Midjourney, a leader in AI image generation, has unexpectedly released version 6.1, one of the most photorealistic image models available. This version improves the rendering of hands, bodies, plants, and animals. It also speeds up image generation by 25% and improves the detail and accuracy of small background objects. Version 6.2 is expected in September.
#3 — New FLUX.1 competes with Midjourney 6 and Stable Diffusion XL
Newly launched Black Forest Labs, founded by former employees of Stability AI, introduced FLUX.1, its suite of open-source text-to-image models for generating media content. FLUX.1 can be deployed locally. According to benchmarks, it comes close to Midjourney 6 and may outperform Stable Diffusion XL. Sample generations are shown in the announcement.