First Open-Source AI Device, Apple's New Development, Voice Cloning — Top AI News of the Week
Our latest AI Digest covers the biggest AI news of the week. Anywhere Club community leader Aliaksei Kartynnik comments on the key stories.
#1 — New AI Device by Open Interpreter
— The creators of Open Interpreter have unveiled a new device, the 01 Light, which lets you control your computer through an AI assistant using voice and natural language. The company is known for its large open-source project for running various LLMs locally and interfacing with them. That project has grown into something like a full-scale OS running on top of the main OS: it can manage documents and apps and even write code. The 01 Light lets users interact with Open Interpreter through voice commands. The device is available for pre-order only in the U.S., for $100, but all the blueprints are open-source, so electronics hobbyists might fancy trying to assemble it themselves. This may be the first open-source AI device on the market.
#2 — Apple Joins the Race Against OpenAI
— In a new research paper, Apple researchers introduced ReALM, an artificial intelligence system capable of understanding what is shown on screen, grasping the context of the conversation, and detecting background processes. According to the paper, ReALM demonstrates high accuracy across all the datasets tested, outperforming the GPT-3.5 model, especially on newly introduced tasks. Some of the reported results suggest that ReALM even outperforms GPT-4. In essence, this model could become a cornerstone of how users interact with future versions of iOS. You simply say: "Siri, call the number from the business card photographed on this website," and then ReALM recognizes the number and handles the rest. It seems that Siri finally stands a fair chance of becoming a smart assistant.
#3 — OpenAI's New Model for Voice Cloning
— OpenAI previewed a preliminary version of Voice Engine, a model that can clone a human voice from a 15-second audio sample and generate natural-sounding speech. Yes, voice cloning isn't new or surprising these days. We saw similar efforts from Meta and other companies in 2023. Until now, however, there have been no publicly accessible tools that allow high-quality voice cloning from such a short sample. According to OpenAI, Voice Engine can retain the accent and emotions of the original speaker in the generated speech. OpenAI identifies several potentially beneficial applications of the technology, but for now, access to the model is limited to a select few. Among them are representatives of HeyGen, which specializes in the commercial production of video avatars and voice clones. We’re eagerly awaiting new offerings from HeyGen!