What is our primary use case?
We [my company] use it to run a large workload. We have a set of security scans we want to perform, and we distribute them over a full 24-hour day. We use it to orchestrate all the steps necessary to perform those scans.
What is most valuable?
It’s essentially an orchestrator. So, we get all those properties we want from an orchestrator. Particularly, we like the fact that the whole process is durable, which is very useful to us. The fact that you can split it up into multiple smaller steps, called activities, and store state at every single activity is something we have made a lot of use of. For example, this allows us, in case of any failures down the road, to stop the process midway and resume it later. That’s another feature that’s been really useful.
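The stop-and-resume behaviour described above can be illustrated with a small toy model. This is plain Python, not the Temporal SDK, and it is not how Temporal stores state internally; the function `run_pipeline`, the JSON state file, and the step names used with it are all hypothetical. It only shows the idea of saving the result of every step so a later run can pick up where the previous one stopped.

```python
import json
import os


def run_pipeline(steps, state_path):
    """Run named steps in order, saving each result so a rerun can resume.

    steps is a list of (name, func) pairs; each func receives the dict of
    results produced so far. state_path is a hypothetical JSON checkpoint file.
    """
    # Load the results of steps that completed in an earlier run, if any.
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = json.load(f)
    else:
        done = {}

    for name, func in steps:
        if name in done:
            continue  # finished in a previous run; do not redo the work
        done[name] = func(done)
        # Persist after every step so a failure loses at most the current step.
        with open(state_path, "w") as f:
            json.dump(done, f)
    return done
```

If a step raises, the steps before it stay recorded in the state file, so calling `run_pipeline` again with the same path skips them and retries only from the failed step onward.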
We like the fact that it integrates very well with the programming language. It’s not completely transparent; you know you’re using Temporal because you have to import the SDK into the programming language itself. But it’s done in such a way that it’s really easy to write and fits well within the language. Personally, I like that the main abstraction, workflows, allows you to follow domain-driven designs super easily. In your workflow, you can essentially speak your business language and not have to worry too much about Temporal because it’s abstracted away so nicely.
One last feature that’s super useful is that retry policies are built into the Temporal system. For example, if one of your activities fails for multiple reasons, you can configure how you want to handle your failure cases in the activity itself with a retry policy. You can say, “Okay, I want to retry this later,” and configure the cadence. This step is really configurable, it’s built into the system, and it’s something we have made a lot of use of. So, that’s pretty much a big picture summary.
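To make the idea of a configurable retry cadence concrete, here is a minimal sketch in plain Python. It is not the Temporal SDK: the `RetryPolicy` class and `run_with_retries` helper below are hypothetical stand-ins, with fields named after the knobs a retry policy typically exposes (initial interval, backoff coefficient, maximum interval, maximum attempts).

```python
import time
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    """Hypothetical stand-in for a retry policy; not the Temporal SDK class."""
    initial_interval: float = 1.0     # seconds to wait before the first retry
    backoff_coefficient: float = 2.0  # multiplier applied after each failure
    maximum_interval: float = 60.0    # cap on the wait between attempts
    maximum_attempts: int = 5         # total attempts, including the first

    def delays(self):
        """Return the waits, in seconds, between consecutive attempts."""
        return [
            min(self.initial_interval * self.backoff_coefficient ** i,
                self.maximum_interval)
            for i in range(self.maximum_attempts - 1)
        ]


def run_with_retries(func, policy, sleep=time.sleep):
    """Call func until it succeeds or the policy runs out of attempts."""
    last_error = None
    # The first attempt runs immediately; later attempts wait per the policy.
    for delay in [0.0] + policy.delays():
        if delay:
            sleep(delay)
        try:
            return func()
        except Exception as exc:
            last_error = exc
    raise last_error
```

With `initial_interval=1.0`, `backoff_coefficient=2.0`, `maximum_interval=5.0`, and `maximum_attempts=5`, the waits between attempts are 1, 2, 4, and 5 seconds. Note that this sketch, like the behaviour described in the review, surfaces only the last error once all attempts are exhausted.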
What needs improvement?
The actual user interface is still in its early stages. It’s very basic. There isn’t really a complex permission model yet, and there’s no way to automate things like provisioning a Temporal namespace via Terraform. You still have to do it manually.
Another thing is certificate rotation. You can’t configure it to happen automatically; you have to do it manually in each Temporal namespace, which is super annoying for us. We’ve asked them whether they were going to add a feature to support that, and they said it’s in the backlog and they’re working on it, but they don’t have a timeline yet.
So, the operations side of the interface is still in very early stages. You can’t do that many things when it comes to administration. The essentials are there, but some of the more convenient features are lacking. So that’s a downside.
When an activity runs, it would be nice to see more than just the latest error. Activities are retried, and retries happen for various reasons, which means the same activity can execute multiple times and fail with different errors. Right now, Temporal only shows the latest one. It would be nice if it showed all of them. I’ve been reading a bit about it, and as far as I understand, it’s a limitation of how the system is built. But from what I understood from my colleagues, they talked to Temporal, who said there is possibly going to be a way to do that. I’m not sure whether that’s true or not, but it would be really nice. So that’s something that came to mind.
Another aspect I don’t like is that Temporal Cloud is not friendly for smaller users. I wanted to include Temporal in some of my own projects, and Temporal Cloud would have been a nice addition because I did not find managing a self-hosted cluster easy. There’s a lot of setup involved. It would have been nice for me to use Temporal Cloud, but the pricing model doesn’t really allow for that. If I remember correctly, there’s a $200 customer support fee you have to pay if you register for Temporal Cloud, which is obviously way out of my budget as a self-hosted user who just wants to run a few workloads. It would be nice if they made some changes to their pricing model so that not just companies have an incentive to use Temporal Cloud. At the moment, there’s no way for me to do it without having to pay a lot of money.
For how long have I used the solution?
I have been using it for the last nine months.
What do I think about the scalability of the solution?
We run quite a lot of volume, so the product is pretty good on that side. Mainly, your scalability will be in your infrastructure because Temporal doesn’t run any workloads. Temporal only stores your state. The processes that actually run those workloads are your workers, and those run on your infrastructure and poll a task queue.
So, basically, if there are bottlenecks in your workload and they need scaling, it’s a change you would make on your workers. Now, obviously, there’s the question of whether the Temporal cluster itself would scale, because if you run a lot of things at the same time, you’re going to have a lot of writes to the Temporal cluster and a lot of changes to their cloud databases. But we haven’t really had problems with that. It scaled pretty well according to our needs.
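The split between the service that holds the work and the workers that execute it can be sketched in a few lines. This is a toy illustration in plain Python, not Temporal code: an in-process `queue.Queue` stands in for the task queue, threads stand in for worker processes, and `run_workers` is a hypothetical helper. The point is only that throughput is changed by adjusting the number of workers, not the queue.

```python
import queue
import threading


def run_workers(tasks, handler, worker_count):
    """Process tasks with worker_count threads that each poll a shared queue."""
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = task_queue.get_nowait()  # poll for the next task
            except queue.Empty:
                return  # nothing left to do; this worker exits
            outcome = handler(task)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Results come back in completion order, not submission order, so callers that care about ordering should sort or key them.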
The self-hosted cluster was much slower, but the cloud one has met our demands in terms of scalability.
It’s multiple users and multiple locations now. It’s quite a large volume in our case.
How are customer service and support?
We interact with tech support via Slack. We have a Slack channel in which we talk with them. They’re pretty quick.
They were helpful. There’s nothing really to complain about there. They’ve always helped us whenever we asked them, and they regularly check in to see how we’re doing. If they see issues or weird traffic patterns on their side that they consider not necessarily best practices, they reach out to us. I’ve only had good experiences with tech support.
Which solution did I use previously and why did I switch?
Some of my colleagues might have used Amazon Workflows, but I haven’t. From what I can gather, they’re fairly similar products, but Temporal turned out cheaper, which was one of the criteria they used to select it.
How was the initial setup?
There have definitely been some problems we’ve had with it. For us, it didn’t just work right out of the box. We had to do a lot of configuration to get it working in the state we wanted. It was a lot of back and forth with Temporal.
First, we used the self-hosted version, but that didn’t scale too well for us, so we migrated to the cloud at some point.
We mostly worked with it on the cloud, but it wasn’t all good right out of the box, even after we migrated. We had certain issues, and some of them were because we didn’t know all the features it had to offer. Others neither we nor Temporal knew how to fix, because we think they were due to the very large volume of work we were scheduling in a very short amount of time.
In our use case, because we do a lot of work with it every single day, we had to spend quite some time configuring it and getting it to work how we wanted. But that’s not necessarily the case for other user flows that use Temporal and have a very different traffic pattern. Those were much easier to get working well with minimal configuration.
I was not involved in the initial step. I was involved in subsequent steps, which did the transitioning between the self-hosted and the cloud setup, but the cloud setup was already there when I came. I took part in provisioning some new namespaces, but it was mostly ClickOps. From my perspective, getting started with Temporal Cloud was probably more work on the business and product side rather than on the engineering side. I think that on the engineering side, the amount of work you have to do to get your cloud account running is really minimal.
Three resources were involved in the migration process and the setup of the cloud environment.
From a maintenance point of view, we have to maintain the workers, but otherwise it’s alright. It’s a SaaS solution, so Temporal does most of the hard work. The one maintenance aspect that’s really annoying is the certificate rotation.
The maintenance for the self-hosted cluster is much more complicated, which is one of the reasons we migrated. But that’s also due to the fact that, being a security company, we have to respect strict security guidelines, which means we had to modify some of the images they provided because they didn’t respect those guidelines yet. For example, Temporal images aren’t FIPS compliant, and we have to be FIPS compliant.
What was our ROI?
For us, it ends up being quite costly, but it’s still probably more cost-effective for us to do it using Temporal. It would be nice if the cost didn’t scale linearly. At the moment, they charge something like $25 per million actions, and that rate keeps decreasing as the number of actions grows, which is okay. But in the end, it’s still linear.
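A rough sketch of what such tiered, usage-based pricing looks like in practice follows. Only the $25-per-million starting rate comes from the review; the tier boundaries and the lower rates in `TIERS` are made-up placeholders, not Temporal’s actual price list, and `cost_for_actions` is a hypothetical helper.

```python
# HYPOTHETICAL tiers: (number of actions in the tier, dollars per million).
# Only the $25 starting rate is taken from the review; the rest is invented
# purely to illustrate the shape of the curve.
TIERS = [
    (5_000_000, 25.0),   # first 5M actions
    (20_000_000, 20.0),  # next 20M actions
    (None, 15.0),        # everything beyond that
]


def cost_for_actions(actions):
    """Return the total cost in dollars for a given number of actions."""
    remaining = actions
    total = 0.0
    for size, rate_per_million in TIERS:
        # An open-ended tier (size None) absorbs whatever is left.
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier / 1_000_000 * rate_per_million
        remaining -= in_tier
        if remaining <= 0:
            break
    return total
```

Under these assumed tiers, 50 million actions cost $900 and 100 million cost $1,650. The per-action rate drops, but within each tier the bill still grows in direct proportion to usage, which is the linearity the review refers to.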
If you build a solution yourself, you will have a lot of maintenance costs and a lot of costs for the engineers involved, who are specialized resources. Orchestration is the product Temporal is building, and they’re investing a lot of time in it, so you might not end up with something as reliable.
Now, that only matters if you need an orchestrator and you need durability. For example, if you’re a payment service provider, then you absolutely need that durability. But if you’re an early startup doing some basic CRUD operations, you probably don’t. So you definitely have to take into account whether or not you actually need that durability when you decide on the solution.
If your business logic can fail and that’s something you can tolerate, then you probably don’t have to go and purchase Temporal. But if you have a business operation that absolutely cannot fail and has to run to completion, then you should definitely consider an orchestrator like Temporal.
What other advice do I have?
I might be biased because I really like this technology. I’m not going to go for the ten because of the downsides, but Temporal would be a strong eight, if not a nine, out of ten. If they improve the weaker points, I would definitely see this as one of the strongest stacks we have currently.
I’d recommend it, but be cautious before going for the tech just because it sounds nice. Users definitely have to lay out their use cases and figure out whether they need an orchestrator in the first place because it’s not a Swiss army knife. It’s not a tool that fits every use case. But for our use cases, for example, I definitely recommend it. It’s a really good product.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.