OpenAI has announced that it is giving third-party developers access to the speech-to-speech engine that powers ChatGPT's advanced voice mode. The move is expected to lead to AI apps with more advanced conversational voice interfaces. Early testers include Healthify, a nutrition and fitness app, and Speak, a language learning app. Alongside the speech-to-speech engine, developers will also be able to fine-tune models using images. OpenAI demonstrated the new audio capabilities by having an AI assistant call a fictional candy shop and place an order through Twilio's API. Developers will get the same voices offered within ChatGPT and will not be able to use custom voices. OpenAI's terms of service prohibit using its systems to spam or mislead people, but the company does not require developers to watermark the voice or to identify the AI system as such.
developers.slashdot.org
