Microsoft Azure Speech introduces Dragon HD Omni, a new generation of text-to-speech featuring over 700 expressive, multilingual voices. This unified model simplifies developer integration by addressing common pain points such as unnatural prosody and the need for extensive SSML tuning. Dragon HD Omni offers enhanced contextual adaptation while preserving the unique character of each voice, producing more lifelike speech. It adds nearly 300 new AI-generated voices with diverse options for gender, age, and tone, allowing for greater personalization and brand identity. The technology also enables automatic style prediction through natural language descriptions, offering advanced customization and broader style support. All Dragon HD Omni voices are multilingual and can automatically predict and generate output in different languages and accents. The service supports word boundary events, which are crucial for applications that require precise word-level synchronization. Developers can fine-tune voice output with the parameters temperature, top_p, top_k, and cfg_scale, which control expressiveness, stability, speed, and contextual relevance. Microsoft's broader offering includes over 600 neural voices across more than 150 languages and locales, with custom neural voice capabilities for unique brand voices. Users can explore these voices and features through Speech Playground and direct SSML calls.
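As a rough illustration of what a direct SSML call with these tuning parameters might look like, the sketch below only builds the SSML string in Python and does not contact any service. The parameter names (temperature, top_p, top_k, cfg_scale) come from the summary above; the voice name, the `parameters` attribute on `<voice>`, and the `key=value;key=value` format are assumptions made for illustration and should be checked against the official documentation before use.

```python
from xml.sax.saxutils import escape, quoteattr

# Placeholder voice name, assumed for illustration only.
VOICE = "en-US-Ava:DragonHDOmniLatestNeural"

# Tuning parameters named in the summary above.
ALLOWED = {"temperature", "top_p", "top_k", "cfg_scale"}


def build_ssml(text, voice=VOICE, lang="en-US", **params):
    """Return an SSML string with optional tuning parameters on <voice>.

    Assumption: parameters are passed as a 'parameters' attribute in the
    form 'key=value;key=value'. This is not confirmed by the summary.
    """
    unknown = set(params) - ALLOWED
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")

    # Sort keys so the generated string is deterministic.
    param_str = ";".join(f"{k}={v}" for k, v in sorted(params.items()))

    voice_attrs = f"name={quoteattr(voice)}"
    if param_str:
        voice_attrs += f" parameters={quoteattr(param_str)}"

    # escape() protects characters such as '&' and '<' in the spoken text.
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice {voice_attrs}>{escape(text)}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Hello & welcome.", temperature=0.7, top_p=0.8))
```

The resulting string could then be passed to whichever SSML-accepting synthesis call an application already uses. Rejecting unknown keys up front is a design choice that catches typos locally rather than relying on the service to report them.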
techcommunity.microsoft.com
