What is our primary use case?
We use it for disaster recovery of our day-one applications, the ones that need to be up first after a failover.
How has it helped my organization?
We previously had our Microsoft SQL Servers set up as clustered pairs, with the primary in one data center and the secondary in the other, and they stayed in sync via SQL Server log shipping. That was not a very efficient way to fail over SQL Servers. There were also some things that weren't replicated through log shipping, such as the SQL Server Agent jobs that are defined on the server, or the custom permissions that are set up for the different roles. Zerto was able to replicate the entire server, including the jobs and the permissions, and eliminate the need for us to have that secondary server. We were able to break all of our SQL clusters and just have standalone SQL Servers. It helped to increase our efficiency with failover and reduced our overall compute and storage footprint around SQL by about 40 percent.
When failing back or moving workloads, the solution saves time and reduces the number of people involved. The time from the initiation of a failback to its completion is about five minutes for us. We've also tweaked DNS so that records update and replicate quickly, so we're not left waiting on DNS when the resource itself is already available. As for the number of people involved, for SQL especially, it used to require getting the SQL team involved and they would do everything manually. Now, anybody can just click through the recovery wizard and perform the failover.
Our savings from Zerto are around licensing and how we structure our current environment. We were able to save money with our on-prem deployment, but we don't use it for cloud.
And in terms of downtime, every time we test a failover it's non-impactful to operations, because we're able to do the testing in an isolated environment. Before, if we wanted to test our failover processes, it was going to create a production outage. That is no longer the case. When we were doing regular DR tests, I would estimate the cost of the downtime at about one weekend per quarter, which is the time we had to take to run them. Now, only if we were to do a live failover as a test, which would probably not be done more than once a year, would we really have to worry about impacting any operations.
What is most valuable?
The most valuable features are:
- granular configuration of your SLAs
- built-in WAN compression as part of the replication
- easy wizard-based failover.
The granularity enables us to fail over specific workloads instead of facing an all-or-nothing scenario, where you have to move your entire IP block and your data center, or you have to move large chunks of VMs. Those situations also make it prohibitive to test effectively.
The replication piece with the built-in WAN compression is important because the network circuit that we send our replication traffic across isn't actually behind our normal WAN accelerators. We were able to use Zerto's built-in WAN acceleration to help those workloads compress.
The wizard-based failover is important because it lets me delegate initiating a failover to other people without their having to be experts in this particular product. It's easy enough to cross-train people.
Continuous data protection is Zerto's bread and butter. They do all of their protection through journaling, and that continuous protection gives you countless restore-point opportunities. That's extremely important for me because if one restore point doesn't work, because it is a crash-consistent restore point, you have so many others to choose from that you really don't have to worry about having an app-consistent backup to recover from.
Zerto is also extremely easy to use, extremely easy to deploy, and extremely easy to update and maintain. In everyday use the interface is very easy to navigate, and the way in which you perform testing and failover is very controlled and easy to understand.
What needs improvement?
The replication appliances tend to have issues when they recover from being powered off when a host is in maintenance mode. Sometimes you have to do a manual task where you go in and detach hard disks that are no longer in use, to get the replication appliances to power back on. There are some improvements to be made around the way those recover.
My other main inconvenience is fixed in version 8.5. That issue was having to move virtual protection groups to other hosts by hand whenever a host went into maintenance mode. That's actually automated in the newer version and I am looking forward to not having to do that any longer.
For how long have I used the solution?
I've been using Zerto for almost four years.
What do I think about the stability of the solution?
My impression of its stability is very positive. It doesn't seem to have any issues recovering after you shut down any of the particular components of the application. It seems everything comes back up and comes back online well.
Sometimes the replication appliances will stop functioning, for one reason or another, and most of the time a power cycle will resolve that. But anytime that I do have a sync issue, support will generally be back in touch with me within the first half hour after opening a ticket. They're very responsive.
What do I think about the scalability of the solution?
It can scale to take on an environment of any size. We don't have a huge environment here. We only use it across 20 hosts, 10 at each site, although they're very large hosts. If you have more than a certain number of virtual disks protected on a single replication appliance, the replication appliance will automatically make a clone of itself on that host to accommodate the additional virtual disks. It seems to be built to scale in any way that you need it to.
While our hosts are very large hosts, we don't have any current plans to extend that deployment because we have capacity to grow within our current infrastructure footprint, without having to add on resources.
How are customer service and support?
I rate their technical support very highly. They're very responsive. Usually within the first 30 minutes of opening the case, someone has tried to reach out to me. I will just get a screen share, or a reply to my call with an answer, or a KB article. I have a very positive impression of their support.
Which solution did I use previously and why did I switch?
We were using Site Recovery Manager for several years, and I always struggled with keeping that functioning and reliable. Every time something changed within the vCenter environment, Site Recovery Manager would tend to break. I wanted to switch to a DR product that I could rely on.
In addition to Site Recovery Manager, we were also using NetApp SnapMirror. We are still using that for our flat file data, which is non-VM-based. We also have Rubrik as our backup solution; while we replicate our backups, there's no automation behind bringing them online at the other site, so disaster recovery with it is a manual process.
We were having to utilize those solutions to do the failovers for our day-one application in SQL and they were inefficient and ineffective for that. Zerto was able to come in and target those workloads that we needed better recovery time for, or where we needed a more aggressive replication schedule. Zerto is supplementing those other solutions.
Zerto is easier to use than the other solutions. There's definitely more automation and there are more seamless failover activities.
How was the initial setup?
When I deployed the solution, it took less than a day to get it up and running. The upgrade process has been fairly seamless and painless as we have gone from one version to the next. That includes some of the features they've enhanced, where it automatically updates the replication appliances as well as the management pieces.
We have two data centers and they're both Active-Active for one another. Our deployment strategy for Zerto was to stand up a site server at each one, pair them together, and then start identifying the first workloads to add into Zerto protection. We started with our SQL environment.
I was the only one involved in the deployment. If I had questions I would ask my account team. My sales engineer and the account rep are both very knowledgeable. But I actually didn't need to open a support ticket as part of the deployment. It was very easy and straightforward.
About five of us utilize Zerto. I am the infrastructure engineer, focusing on the compute side of the house. We've got a storage engineer. My manager is an applications delivery manager who uses it. We've got another senior network engineer who focuses more on the runbook side of things, and he uses it. And my backup, who is our Citrix guy, is starting to use it.
Zerto doesn't really require any particular care and feeding. Whenever a new version comes out with new feature sets, I'll decide when I'm going to update it and do that myself. It doesn't really even require a support call. It's pretty straightforward. For each management appliance, updates have taken 10 to 15 minutes. And it's just a couple of minutes for each replication appliance.
What was our ROI?
Our ROI is quite significant. The SQL cost savings alone would be in the hundreds of thousands of dollars per year. That's due to the fact that we don't need to have our SQL clustering set up as an always-on cluster, which would need a higher tier of Microsoft licensing. We're able to use SQL Server Standard for everything, and that wouldn't be possible without a third-party product like Zerto to do the replication and failover.
What's my experience with pricing, setup cost, and licensing?
Get the Enterprise Cloud license because it's the most flexible, and the pricing should come in around $1,000 per VM.
Support is an additional cost. We are currently doing three years of support. There's an additional 15 or 20 percent of overhead during each year of additional support for each license.
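To put those numbers together, here is a rough back-of-the-envelope sketch in Python. It assumes the 15 to 20 percent support overhead is charged per year as a percentage of the per-VM license price, which is my reading of the figures above rather than a quoted price list. The function name and the VM count are hypothetical and for illustration only.

```python
def three_year_cost_per_vm(license_price, support_pct, support_years=3):
    """Per-VM license price plus yearly support.

    Assumption: support is billed each year as a flat percentage
    (support_pct) of the license price.
    """
    support_total = license_price * support_pct * support_years / 100
    return license_price + support_total


# Figures from the review: about $1,000 per VM, 15 or 20 percent support, three years.
low = three_year_cost_per_vm(1000, 15)
high = three_year_cost_per_vm(1000, 20)
print(f"Per VM over three years: ${low:,.0f} to ${high:,.0f}")

# Hypothetical environment size, only to show how the estimate scales.
vm_count = 50
print(f"For {vm_count} VMs: ${low * vm_count:,.0f} to ${high * vm_count:,.0f}")
```

Under that assumption, three years of support adds roughly half again to the license price, so it is worth budgeting for up front.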
What other advice do I have?
Definitely take the free trial and put it through its paces, because you really can't break anything with it, given the way that you can do the testing. It gives you a good opportunity to play with the tools without having to worry about causing any problems in the environment.
We have plans to evaluate the solution for long-term retention. I'm going to start testing some of their features once we upgrade to version 8.5, and then we'll evaluate whether it makes sense to do that or not. We do have other backup products that we're evaluating alongside that, though.
The solution has not reduced the number of staff involved in overall backup and DR management. We already run a very lean engineering team.
I got what I expected. I'd actually been trying to bring the product in since 2014 but I kept not getting budget funding for it. I feel satisfied with what I ended up with and I'm glad that we were able to move forward with the project.
Which deployment model are you using for this solution?
On-premises
*Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.