Right now, we've been creating a lot of custom dashboards for the application teams, so they can see all their application and server performance. We've been trying to do a lot of the integration with their management packs, so you can basically try and see everything from end to end.
It improves the resolution time for troubleshooting, but we can also see issues as they start to happen, so we can jump on them before they really become a problem for the end users.
I'm not sure how often they're finding issues, because I'm not on the operations side and don't really deal with that much. I do know that one guy has been using it heavily for a VDI environment, and they've been able to track down a lot of problems as they start.
They haven't started using many of the new features for version six, but it's one of the things they're looking at, trying to mess with.
We do not have any use cases where we avoided outages or reduced outage time. We're not using it for any actual alerting; it's just the dashboard and troubleshooting really.
We do use it for capacity management. At least, I did when I was doing that job. There were a lot of cases where we could have saved on storage but, for political reasons, we weren't allowed to reclaim a lot of the space that was being wasted. It was a good tool to show that waste was happening. We weren't doing any thin provisioning on the array side, but because of vROps, we were able to prove that we had a lot of waste and needed to start thin provisioning somewhere. They got that implemented on the array side.
We'll see improvement as we get more people to use the tool. As a VMware admin, I find it useful for capacity planning. That's the big one for me. I've been transitioning more to the engineering side, so I don't really use it as much now. We're trying to get the operations team to embrace it a little bit more.
This is a difficult area to address because I'm not using it much anymore. They've already addressed a lot of the big areas for improvement with six, such as the ability to integrate with vRealize Orchestrator, which adds some automation to it.
Some of the thresholds and whatnot are a little tricky to set up, and that's where we're struggling right now; our operations team isn't really managing those properly. Right now, I don't even know if they have a process to set up the thresholds anymore. Basically, they are just relying on the out-of-the-box setup. Every time they come to me and say, "We've got these alerts that are red," I say, "Did you actually validate that it's a problem?" Nine times out of ten, it's not. It's just out of the norm, and they don't really understand that.
With version six, stability is really good. We're really enjoying six. Five was stable. Six is a lot easier to use. That's the big one.
Scalability is really good, especially with the new model in six. Five was okay. It wasn't too bad, but you were limited to a couple of VMs. Now, you can just add new VMs.
I actually haven't had to use technical support. A couple of the other guys have, and it seems to be really good.
We were using Foglight a long time ago. I barely touched it, but I remember it being just a giant pain to manage. It was hard to configure. To me, it seemed kind of convoluted.
Actually, both five and six were pretty easy to set up initially.
You have to play with the thresholds and make sure they meet your needs. If you see something red, don't freak out because it could just be an abnormal spike from 10% to 20%.
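To make that advice concrete, here is a rough sketch of the kind of check I mean when I ask whether a red alert is actually a problem. It is not how vROps works internally and does not use any vROps API. The function name, the 80% hard limit, and the three-standard-deviation rule are all assumptions made up for illustration.

```python
from statistics import mean, stdev

def triage_alert(history, current, hard_limit=80.0, sigmas=3.0):
    """Classify a utilization reading (in percent).

    history    -- recent readings used as the "normal" baseline (needs 2+ values)
    current    -- the reading that triggered the alert
    hard_limit -- hypothetical absolute level at which users would feel pain
    sigmas     -- hypothetical width of the "normal" band, in standard deviations
    """
    baseline = mean(history)
    spread = stdev(history)

    # Out of the norm: far from the usual baseline, regardless of absolute level.
    abnormal = abs(current - baseline) > sigmas * spread

    # An actual problem: the absolute level is high enough to matter.
    if current >= hard_limit:
        return "problem"
    if abnormal:
        return "abnormal but not critical"
    return "normal"

cpu_history = [9.0, 10.0, 11.0, 10.0, 9.5, 10.5]
print(triage_alert(cpu_history, 20.0))  # spike from ~10% to 20%
print(triage_alert(cpu_history, 92.0))  # genuinely high utilization
print(triage_alert(cpu_history, 10.2))  # ordinary reading
```

With these made-up numbers, the jump to 20% is flagged as abnormal but not critical, which is the "don't freak out" case. Only the 92% reading counts as a problem. The limits themselves would need to be tuned per environment, which is the whole point about playing with the thresholds.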