
Amazon Polly vs Microsoft Azure Speech Service comparison

 

Comparison Buyer's Guide

Executive Summary
Updated on Oct 8, 2024
 

Categories and Ranking

Amazon Polly
Ranking in Text-To-Speech Services: 1st
Average Rating: 7.0
Number of Reviews: 2
Ranking in other categories: No ranking in other categories

Microsoft Azure Speech Service
Ranking in Text-To-Speech Services: 3rd
Average Rating: 8.6
Reviews Sentiment: 7.8
Number of Reviews: 2
Ranking in other categories: Speech-To-Text Services (2nd)
 

Mindshare comparison

As of November 2024, in the Text-To-Speech Services category, Amazon Polly's mindshare is 40.2%, up from 38.5% a year earlier. Microsoft Azure Speech Service's mindshare is 23.2%, down from 23.7% a year earlier. Mindshare is calculated from PeerSpot user engagement data.
 

Featured Reviews

SB
Helps modify text-to-speech by controlling speech patterns, but putting more tags gives an error
We can play text-to-speech when we need to play some prompts inside Amazon Connect. Any number we provide will be played as digits when we play text-to-speech. By using SSML tags in a conversational manner in Amazon Polly, we can convert those digits into numbers. We can use the SSML tags in Amazon Polly to modify text-to-speech by controlling speech patterns and behaviour.
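
The digits-versus-numbers behaviour described in this review is commonly handled with the SSML `say-as` tag, and pauses with the `break` tag. The following is a minimal Python sketch under stated assumptions, not the reviewer's actual setup: the helper names `build_prompt_ssml` and `is_well_formed` and the sample prompt text are hypothetical, and the local check only catches malformed XML (for example an unclosed tag), not every error the service might return.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_prompt_ssml(prefix: str, number: str, pause_ms: int = 300) -> str:
    """Build SSML that reads `number` as a whole number instead of digit by digit.

    Hypothetical helper for illustration; `prefix` and `number` are escaped
    so characters such as '&' do not break the markup.
    """
    return (
        "<speak>"
        f"{escape(prefix)}"
        f'<break time="{pause_ms}ms"/>'  # short pause before the number
        f'<say-as interpret-as="cardinal">{escape(number)}</say-as>'
        "</speak>"
    )


def is_well_formed(ssml: str) -> bool:
    """Return True if the SSML parses as XML (a local sanity check only)."""
    try:
        ET.fromstring(ssml)
        return True
    except ET.ParseError:
        return False


ssml = build_prompt_ssml("Your queue position is", "42")
print(ssml)
print(is_well_formed(ssml))
# An unclosed <break> tag is the kind of mistake that is easy to make
# when many tags are stacked into one prompt.
print(is_well_formed("<speak><break time='300ms'></speak>"))
```

A string built this way could then be sent to Polly, for example through boto3's `synthesize_speech` with `TextType="ssml"`. Checking well-formedness locally first is one way to narrow down tag-related errors before the prompt reaches the service.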
Abhishek-Rana - PeerSpot reviewer
Offers ease of use and the availability of documentation is great
The simplicity impressed me the most. We just needed a single API key. The documentation was also great. I developed the AI application using Unity, a game engine that uses C#. Then, I searched online for instructions on how to use it. I found Microsoft's GitHub repository, which provided the necessary code for integrating the Speech Service into Unity with C#. The ease of use and the availability of documentation made the process smooth and impressed me the most. The documentation and boilerplate code [a template of code] was available, which I incorporated into my application with modifications. Initially, the code functioned so that when a button was clicked, the microphone would activate and recognize my speech. One of the benefits was the ability to see my spoken words visually on the screen as I spoke. For example, if I said "I am Abhishek Rana," I could see the sentence appear in real-time. When I stopped speaking, it automatically recognized the silence and ceased, sending the text for further processing. So, the real-time translation feature has helped me a lot.
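
The flow this reviewer describes (partial text appearing while speaking, then a final result once silence is detected) can be sketched as follows. The reviewer worked in C# inside Unity; this sketch swaps in Python and the `azure-cognitiveservices-speech` package instead. The class `LiveTranscript` and the function `listen_once` are hypothetical names, the key and region are placeholders the caller must supply, and the SDK import is deferred so the transcript logic can be exercised without a microphone or an Azure account.

```python
from types import SimpleNamespace


class LiveTranscript:
    """Holds the in-progress text and the finished utterances (hypothetical helper)."""

    def __init__(self):
        self.partial = ""
        self.final = []

    def on_recognizing(self, evt):
        # Called repeatedly while the user is still speaking.
        self.partial = evt.result.text

    def finish(self, text):
        # Called once the utterance has ended.
        if text:
            self.final.append(text)
        self.partial = ""


def listen_once(key: str, region: str) -> LiveTranscript:
    """Recognize one utterance from the default microphone.

    Assumes `pip install azure-cognitiveservices-speech` and a valid
    Speech resource key and region.
    """
    import azure.cognitiveservices.speech as speechsdk  # imported lazily

    transcript = LiveTranscript()
    config = speechsdk.SpeechConfig(subscription=key, region=region)
    recognizer = speechsdk.SpeechRecognizer(speech_config=config)
    recognizer.recognizing.connect(transcript.on_recognizing)
    result = recognizer.recognize_once()  # returns after the utterance ends
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        transcript.finish(result.text)
    return transcript


# Offline demonstration with stand-in events instead of a live microphone.
demo = LiveTranscript()
for text in ("I am", "I am Abhishek", "I am Abhishek Rana"):
    demo.on_recognizing(SimpleNamespace(result=SimpleNamespace(text=text)))
print(demo.partial)
demo.finish("I am Abhishek Rana.")
print(demo.final)
```

Keeping the transcript state separate from the SDK wiring is a design choice for this sketch only: it lets the display logic be tested with stand-in events, while `listen_once` stays a thin wrapper around the recognizer.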

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

Amazon Polly:
"Amazon Polly is useful because it's helpful to hear the words on top of it when I can't take in information in a general way. Sometimes, it's very taxing if I'm trying to read cases. They have the neural voices, and they're so realistic. You don't even know that a person is not reading to you, making things much better. I know that they do have the ability to provide you with your own lexicon that's personal to you. I like that you can adjust the pitch and the speed of the voice because some people talk way too fast. Or if you're reading, I read slowly, so that's always helpful. One of the functions that I find helpful is that when reading material on the web, it's like it has its own browser. You go to the URL, and you don't have to read the whole thing, and you can stick the cursor on the place where you want it to start. Then if you want it to skip over something, you put it somewhere else, and that's ideal for reading case law because you skip around a lot. You don't really read it from start to finish. It helps if someone's going to read all those citations because they definitely want to be able to skip that."
"We can use the SSML tags in Amazon Polly to modify text-to-speech by controlling speech patterns and behaviour."

Microsoft Azure Speech Service:
"The documentation and boilerplate code [a template of code] was available."
"Useful text-to-speech and speech-to-text features."
 

Cons

Amazon Polly:
"When you put more tags inside Amazon Polly to define break time and instruct the speech to be conversational, sometimes it gives you an error."
"The price could be better. I wish it weren't so expensive to do because it's really cool. I would love to see them have lexicon packages of them like, this is for lawyers, this is for accountants, and it's going to have a lot of things in it. I also think they could do a better job at showing use cases other than telemarketing or contact center stuff like bots that are very commercial. I know that's where the money is, but it's such a huge hole that's missing for people with disabilities that are even worse than mine. Some people cannot see or hear at all, but they're not just cognitively impaired."

Microsoft Azure Speech Service:
"It can improve based on the native language."
"Lacks a voice recording option."
 

Pricing and Cost Advice

Amazon Polly:
"The solution has a pay-as-you-go pricing model, where you must pay according to your usage."
"The price could be better. Neural voices are so realistic, and I want to say that they have it so that you can try to tell where the voice is coming from or something like that. But if I have more than one, it's so expensive to have to listen to a bunch of cases on my phone and have the neural voice read to me. It really wouldn't be worth it. It'd be paying probably more than what I make in the case. Right now, I'm on the free tier, and I think the number of minutes that you get is reasonable as long as you're not doing this all the time and you're using it judiciously. I have some credits that I think I can use, but I don't know how fast they'll go through."

Microsoft Azure Speech Service:
Information not available
816,406 professionals have used our research since 2012.
 

Top Industries

By visitors reading reviews

Amazon Polly
Computer Software Company: 16%
Financial Services Firm: 10%
Manufacturing Company: 8%
University: 7%

Microsoft Azure Speech Service
Computer Software Company: 18%
Financial Services Firm: 10%
University: 8%
Manufacturing Company: 7%
 

Company Size

By reviewers (Large Enterprise, Midsize Enterprise, Small Business)
Amazon Polly: No data available
Microsoft Azure Speech Service: No data available
 

Questions from the Community

What is your experience regarding pricing and costs for Amazon Polly?
The solution has a pay-as-you-go pricing model, where you must pay according to your usage.
What needs improvement with Amazon Polly?
When you put more tags inside Amazon Polly to define break time and instruct the speech to be conversational, sometimes it gives you an error.
What is your primary use case for Amazon Polly?
We use Amazon Polly to check the SSML tags to see how they play the prompt.
What do you like most about Microsoft Azure Speech Service?
Useful text-to-speech and speech-to-text features.
What is your experience regarding pricing and costs for Microsoft Azure Speech Service?
I'm a college student. I signed up for the Microsoft Azure portal using my college account, so I got a $100 credit. I've used it for various services from the portal. I have used different services...
What needs improvement with Microsoft Azure Speech Service?
For general use cases and vocabulary used in normal, everyday language, it was able to recognize those. However, it can improve based on the native language. Apart from English, other languages and...
 

Also Known As

Amazon Polly: No data available
Microsoft Azure Speech Service: Azure Speech Service, MS Azure Speech Service
 

Sample Customers

Amazon Polly: GoAnimate, Duolingo, Bandwidth
Microsoft Azure Speech Service: KPMG
Find out what your peers are saying about Amazon Polly vs. Microsoft Azure Speech Service and other solutions. Updated: October 2024.