What is our primary use case?
We use Turbonomic to evaluate all of our virtualized clusters. Initially, we were only using Turbonomic for our long-term VMware stacks. Now we are monitoring VMware ESXi 7 and Nutanix AHV stacks. On the server side, we have 400 VMs. We don't evaluate the VDI side because we have 1,100 seats, so it's too expensive. We made a special contract on ELA for vRealize Ops for VDI on that side. It wasn't horrible. It's just the bare minimum to show us if there's a problem in the stack.
We mainly use Turbonomic as a heat map. We aren't drilling down into the performance of individual applications or into containers such as Kubernetes, Docker, or Swarm. We use other tools to monitor transaction levels, etc., instead of Turbonomic.
Turbonomic is our overall heat map in our NOC. We fire it up, and when we see a red flag, we dig into it, and off we go, but our Docker containers are not linked to the basic application components. It's mainly working on the surface of the virtualization stack itself.
Our infrastructure is solid enough that I get a VDI call about every three weeks. My server farms are built like tanks, so it takes a lot to take them down. We can sleep well at night. Everything is on-prem. The only cloud solutions we use at the college are SaaS systems. We don't put much in the cloud because cloud environments are too vulnerable to hacks and exploits.
We're going from a silo system to HCI, from 450 hard disks to hybrid flash. While we undergo a significant infrastructure change, we're using Turbonomic to watch VMware because it has aged, and our migration isn't happening the way we want. We will probably reevaluate Turbonomic when the next contract is up.
Once we switch to pure Nutanix, we will reevaluate Turbonomic. I will probably keep it because management is used to Turbonomic's reporting. That saves me a lot of OPEX time that I would otherwise spend building those reports out of Nutanix by hand. I've been here for 16 years, and my CIO has been here for 17 years. They're used to the reports we've been developing over the last decade. We developed them using VMTurbo. We set the standard with that first tool for reporting.
We use Nutanix Prism Central to manage everything on the Nutanix side, but Turbonomic provides ancillary information that gives me a holistic view of reporting and more features that Prism Central doesn't cover. Turbonomic provides linkages, visual aids, graphs, charts, etc.
I'm the one who uses it. It's up on my NOC screen. We log and monitor it pretty much every day. Then, once a month, it generates reports on its own. In that sense, it's used daily or monitored daily. We watch what it's reporting every day on the heat map. Regarding issues and such, maybe every couple of weeks we have something pop up that we look at.
How has it helped my organization?
Turbonomic helped us with cluster projections. We have different-sized hosts in a single cluster. I have two-socket and four-socket hosts sitting in a cluster, so the impacts aren't easy to understand in aggregate. Turbonomic helps to evaluate what will happen in hypothetical configurations. I can forecast the effect of dropping one server and adding another. If I drop a pair of 48 cores and add a single 96 at a different gigahertz, will that be adequate? It can tell you if you need to add more cores to manage your server hardware purchasing.
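The host-swap question above comes down to aggregate arithmetic. As a rough sketch only (the clock speeds below are hypothetical, since I'm not giving ours here, and a real planning tool also weighs memory, storage, and peak demand), raw CPU capacity can be compared as cores times GHz:

```python
# Rough what-if: compare aggregate CPU capacity (cores x GHz) before and
# after a host swap. Clock speeds are hypothetical placeholders.

def capacity_ghz(hosts):
    """Sum of cores * clock speed (GHz) across a list of (cores, ghz) hosts."""
    return sum(cores * ghz for cores, ghz in hosts)

removed = [(48, 2.6), (48, 2.6)]   # pair of 48-core hosts (assumed 2.6 GHz)
added = [(96, 2.2)]                # single 96-core host (assumed 2.2 GHz)

before = capacity_ghz(removed)
after = capacity_ghz(added)
change_pct = (after - before) / before * 100

print(f"Removed: {before:.1f} GHz, added: {after:.1f} GHz ({change_pct:+.1f}%)")
```

With these made-up numbers, the same core count at a lower clock speed comes out as a net loss of raw capacity, which is the kind of thing that is hard to eyeball in a mixed cluster.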
It also assists us in evaluating performance risks. The dashboards show what current risks are happening, and we use the planning features to see the what-ifs. We check the heat map daily. If something pops up there, we check it out to prevent issues from happening down the chain. It's mainly on the VMware side with the older VMXs. We haven't found anything on the Nutanix side to be worried about.
Turbonomic has helped us address performance degradation under VMware. It identifies when there's a bottleneck in the storage line, so we can start moving some virtual disks around to different storage. It helps in the older silo structure. The performance degradation is on the VMware or the Fibre Channel SAN side. Some of the SANs are nine years old.
It identifies problem points before we even notice them. We're meeting all our SLAs because it never gets to the point where users catch something. They might say, "Oh, it seems a little slow," and then they'll return from lunch, saying, "Oh, it's okay again."
We log into it in the morning and let it sit up in the NOC. We take a peek when it shows something. We'll check it out if it's red, but it'll usually clear up if it's yellow. For example, all systems might run at 110% immediately before registration closes while students try to get their last class for their senior year registered before the other students. We'll return to our normal 20-30% usage in about an hour and a half.
They won't notice a thing because we'll be moving to more of a Kubernetes, Docker-style system with Nutanix Karbon. I will probably try to integrate that with Turbonomic. We will probably connect Turbonomic deeper into that stack because it will be able to pull and spin up new containers automatically, directly on the hardware rather than inside anything else, as long as the server has room to spin up another container. Theoretically, I've got room for about 600 more containers, and we currently use 15.
We're centralized IT. I use Turbonomic mainly as a showback because we don't charge our different departments. Technically, departments aren't charged for things like our current Red Hat licenses. We pick that up centrally, and requests come in to us. There is no self-service here.
The instructor says, "I need 400 cores and two terabytes of RAM to run my analysis." I'm like, "That's how they run it on a supercomputer. We don't have those here. Now, if your research grant wants to buy us one, sure, we'll set it up. Tell us where the half-million dollars is, and we'll set it up for you." There's no self-service here, but we use it for a showback.
We had Turbonomic load-balancing all our clusters, and we did not let VMware load-balance our clusters because of the algorithms. Turbonomic's market and share algorithms were much more precise than VMware's, which mattered because I had disparate-sized servers.
VMware liked to put a heavy load on my little boxes and leave my big boxes alone, or it stuffed the big boxes full and left the little boxes alone. Turbonomic keeps everything about even. Until ESXi 7, Turbonomic's load-balancing algorithm was much cleaner than VMware's. That made the hardware more cost-effective because I didn't have little guys sleeping in a corner someplace sucking up hardware, power, and cooling while not doing any work all day.
Resource starvation has never been an issue for us. We run different resource pools, and we've never had any service hit 100%. I have the redundancies and reserve capacities needed to weather any storm. We use Turbonomic primarily to monitor and maintain equal resources on all servers.
We've never had a server hit 100%. I might have one hit 80% periodically before something gets moved around. We've been in the VMware game since 2.X, back when a monster server had four cores and 32 gigs of RAM. We've been more than 80% virtualized for the last 12 years. We knew virtualization was the way it was going and went for it.
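To show why mixed host sizes matter, here is a small sketch. It is not Turbonomic's or VMware's actual algorithm, and the host sizes and load figure are made up. It just compares splitting load evenly per host with splitting it in proportion to each host's capacity:

```python
# Illustration only: even-per-host vs capacity-proportional load splitting.
# Host sizes and total demand are hypothetical.

hosts = {"small-1": 32, "small-2": 32, "big-1": 96}  # cores per host
total_demand = 80  # cores' worth of load to place

def utilization(split):
    """Return per-host utilization (demand / capacity) for a given split."""
    return {h: split[h] / hosts[h] for h in hosts}

# Even split: every host gets the same load regardless of size.
even = {h: total_demand / len(hosts) for h in hosts}

# Proportional split: each host gets load in proportion to its capacity.
total_capacity = sum(hosts.values())
proportional = {h: total_demand * c / total_capacity for h, c in hosts.items()}

for name, split in (("even", even), ("proportional", proportional)):
    print(name, {h: f"{u:.0%}" for h, u in utilization(split).items()})
```

In this toy case the even split leaves the small hosts above 80% while the big host idles below 30%, whereas the proportional split puts every host at the same utilization.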
It reduced our operational expenditures because I have reclaimed some of the time typically spent generating reports. It's part of our system, and we just use it. Turbonomic is part of our network operations center. The dashboard is on my screen, so I can see if the indicators turn yellow or red. I can address the issue before it gets to the point where I'm getting calls from the service desk.
What is most valuable?
I like Turbonomic's built-in reporting. It provides a ton of information out of the box, so I don't have to build panels for the monthly summaries and other reports I need to present to management. We get better performance and bottleneck reporting from this than we do from our older EMC software.
We look at the resizing recommendations in the reports, but we don't follow all of them because we know the ebb and flow of our systems. It shows us the storage array evaluations for our storage devices. We have some older Fibre Channel VNX systems.
What needs improvement?
The management interface seems to be designed for high-resolution screens. Somebody with a smaller-resolution screen might not like the web interface. I run a 4K monitor on it, so everything fits on the screen. With a lower resolution like 1080, you need to scroll a lot. Everything is in smaller windows. It doesn't seem to be designed for smaller screens.
When I change the resolution to 1080, I only see half of what I would on my big 4K monitor. It would be annoying to have to scroll to see the flow chart. They have a flow chart that goes top to bottom like a tree. On a lower resolution, it might be nice if that scrolls horizontally because it's long, narrow, and tall. It's only three icons wide, but it's 15 icons tall. I think it would be helpful to have the ability to change that for a smaller screen and customize the widget.
For how long have I used the solution?
We've been on Turbonomic since version 2.
What do I think about the stability of the solution?
I haven't had it fail, especially with the 8.X line. Other users report bugs, but we've never encountered any of those issues in our own implementation. Once, I didn't patch it for a year, and it kept running. No Windows system can be up for 365 days and not need a reboot. This solution is rock solid.
What do I think about the scalability of the solution?
We scaled Turbonomic up but not out. We didn't try to add more nodes of Turbonomic, but we increased the size of the current VM.
How are customer service and support?
I rate IBM support a nine out of ten. I haven't talked to the head engineer for Turbonomic in nine years because he moved up, and they got a real support tier going. I contact their support probably once or twice a year. They typically solve the issue within an hour.
Nobody is ever perfect. Perfect is when they call me to let me know there is a problem before I realize it. That's my old life at Hewlett-Packard, but I don't want to talk about what that support contract costs.
How was the initial setup?
Setting up Turbonomic is pretty straightforward if you take it slowly. If you assume you know the answer and click "next" automatically, you will run into an issue here or there. You must properly set up your permissions in the vCenter and Nutanix before you install. I always make a Turbonomic-specific local ID so it can't be used anywhere else.
It's a nice security practice using a local account for Nutanix and one for vCenter to connect proper permissions. After the prerequisites are in place, it is relatively straightforward. It used to be pretty cumbersome for the licensing. I haven't done a fresh installation since 8.1. It wasn't as simple before 8.1, and it seems like they've streamlined it even more in 8.5. I don't think you'll face any challenges if you follow the instructions and set up the accounts it needs to connect to from the appliance ahead of time.
It took about 15 minutes to configure Turbonomic for Nutanix because I connected through Prism Central. That includes verification and making sure the data is coming in. For VMware, it took no more than 10 minutes per vCenter that I added. I spent probably half an hour reading instructions to ensure everything was right before I started.
What about the implementation team?
I've been doing this for so long. We've been doing VMware since 2.X. It was all done on-premises. Back in the day, if there was an implementation problem, we had support on the phone with us until it was resolved. We had more problems back when it was still called VMTurbo. When I called support, I would get a senior engineer who actually wrote code for VMTurbo instead of a support desk.
What was our ROI?
The ROI for us is a reduction in operating expenditures. We fix problems before they become issues that our clients see. I don't need to spend an hour a month putting together reports for upper management. We've identified the canned reports they'd like to see and Turbonomic builds them in PDF files. I'm still working 50-60 hours a week, but that's not 51 and 62.
What's my experience with pricing, setup cost, and licensing?
I don't know the current prices, but I like how the licensing is based on the number of instances instead of sockets, clusters, or cores. We have some VMs that are so heavy I can only fit four on one server. It's not cost-effective if we have to pay more for those. When I move around a VM SQL box with 30 cores and a half-terabyte of RAM, I'm not paying for an entire socket and cores where people assume I have at least 10 or 20 VMs on that socket for that pricing.
We're not in the cloud, so we pay per VM. I have the license set at 500 VMs. We have around 400 VMs, so I don't go over the 500-VM price break that we got. We're a public institution, so I can't talk about the exact cost, but I can talk about strategy. We discovered that a single 500-VM license was better than getting 400 plus a couple of 10-packs. It was better for us to go ahead and call it 500 and use that as our leverage to bring the cost down.
Which other solutions did I evaluate?
We evaluated VMware vRealize and Gotti. At the time, Turbonomic was called VMTurbo. We chose VMTurbo because it was the best fit. Gotti didn't do enough, and vRealize was way out of our price range. VMware later changed its price structure, but we stayed with Turbonomic.
What other advice do I have?
I rate IBM Turbonomic a nine out of ten. Before implementing Turbonomic, you should do your research. Check the documentation to see if Turbonomic's processes make sense for what you're doing and if your current setup will handle all the different aspects of Turbonomic. If your current solution does 85% of what you need, and Turbonomic does 87%, I don't think that's enough to switch products. If your solution meets 80% of your need, and Turbonomic does 98%, why wouldn't you change?
What do you need to do? I can flip quarters, or I can flip half-dollars instead of quarters. If all you're doing is flipping dimes, do you need something that flips half-dollars? I'm frugal. I worked for a 501(c)(3) nonprofit most of my life before I came to this college.
I look for bang for my buck and stability. I used to describe to other system admins that I may not have the flashiest new Volvo turbo diesel truck that goes up the hills. My Peterbilt with a CAT diesel may smoke and have a little rust on it, but in the middle of a subzero blizzard, it's making that hill while yours is gelled up at the bottom.
Take a long, hard look at the pre-generated reports and how it holistically checks your system from top to bottom through the tree to see if that's a good fit for you. You can't change that tree. If the homepage tree doesn't work for you, then Turbonomic won't work.
There's not much that I know of that you can do to change that. If that tree makes sense to you, then look at the reporting. Look at how it evaluates load balancing based on shares instead of just the overall weight, which is what VMware does. Turbonomic uses market shares.
They turn it into a cost market share to help adjust. It will tell you if a CPU load is heavy. It will give you recommendations to adjust the size. We're not going to move it right away because what is mission-critical is over here, and we don't want to impact that. VMware looks at how heavily the CPU is being utilized in the VM and says, "Well, I'm going to slap that over here," arbitrarily.
Turbonomic has a share you set up. Think of it like stock market shares. That share went up, but it's not a blue-chip stock. We try to move it to where the blue-chip stocks won't take a hit. We move it someplace of lesser value, so to speak, onto machines of lesser value.
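The stock-market analogy can be made concrete with a toy model. This is only a sketch of the general idea of market-based placement, not Turbonomic's actual implementation: the price formula, host names, and numbers are all assumptions chosen for illustration. Each host quotes a price that rises steeply as it fills up, and a VM is placed wherever it is cheapest to run:

```python
# Toy market-based placement. The price formula and all figures are
# assumptions for illustration, not Turbonomic's real pricing model.

def price(used, capacity):
    """Price rises sharply as utilization approaches 100%."""
    u = used / capacity
    if u >= 1:
        return float("inf")
    return 1.0 / (1.0 - u) ** 2

def cheapest_host(hosts, vm_cores):
    """Pick the host that would quote the lowest price after taking the VM."""
    quotes = {
        name: price(used + vm_cores, cap)
        for name, (used, cap) in hosts.items()
    }
    return min(quotes, key=quotes.get), quotes

# name: (cores in use, total cores) -- hypothetical hosts
hosts = {"mission-critical": (40, 48), "general-a": (30, 96), "general-b": (60, 96)}

target, quotes = cheapest_host(hosts, vm_cores=8)
print(target, {n: round(q, 2) for n, q in quotes.items()})
```

In this example the busy mission-critical host prices itself out, so the VM lands on the lightly loaded general-purpose host instead of being slapped onto whatever box happens to be nearby.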
Which deployment model are you using for this solution?
On-premises