What is our primary use case?
We use it when an app team or someone else tells us there is a problem with a server, such as slowness or latency. We like to take two IPs end-to-end. They give us the server IP and the client IP, and we plug those into nGeniusOne, which hopefully gives us error codes or a packet-level breakdown of the transaction and an idea of what's wrong.
How has it helped my organization?
The solution gives us increased visibility while conducting an IT deployment, depending on what the deployment is. It works as long as it is monitoring the places where we're deploying something. For example, if the deployment is in the DMZ and the traffic goes over a firewall, we have sniffers and TAPs with this product deployed there, so we should be able to use it.
Another example would be when we're in the process of doing a lot of backups to the cloud. The teams come to us and they want a certain amount of bandwidth and a certain amount of resources, and they constantly ask us whether it's too much or too little, or can they use more overnight or at certain times. I can go back to my NETSCOUT reports and find out whether they're in trouble or actually have more capacity so they can ramp up their operations. It provides a view into that.
When we can actually use the product, we see a measurable decrease in mean time to know and mean time to repair. It has definitely let us do things we wouldn't do otherwise, especially capacity planning. We will get there when we have more proactive alarming and monitoring in place. It can greatly cut overall troubleshooting time once you know how to use it and it's properly and fully implemented.
What is most valuable?
Its troubleshooting capabilities are the most effective because we have it deployed in and out of our data centers, with our servers on-prem. And even now, going off-prem with Azure, we want to have visibility. For example, if one of our network pipes is getting plugged up by somebody using too much bandwidth, we can use the NETSCOUT tool to examine and find out what is going on.
I like the Dependency Mapping the solution provides, as long as it works, which it will if you have it properly deployed. Seeing dependencies is critical in figuring out any path, and the more of that functionality we have, the better, because we can see whether something is talking to multiple devices and whether one of them is actually the cause, rather than just "seeing blindly."
What needs improvement?
In terms of the single pane of glass view, when we build it out in the nGeniusOne platform, there are multiple tiles and, depending on what we're trying to examine, it doesn't all fit in one single pane of glass. It would be nice to have that functionality, but you really do have to categorize things because there is so much data.
The biggest thing is being able to provide net path. One of the products we use is SolarWinds, and it provides a very cool mapping of an agent from end-to-end. It would help if NETSCOUT could somehow implement that in their design, whether sniffer-to-sniffer or something along those lines. I know they have some functionality like that, but if they could make it quicker and easier to get those net paths, it would be huge. I could quickly plug in problem IPs and get a full hosted view of where the traffic is going from end-to-end. That would be really useful.
Finally, the GUI has room for improvement. It's user-friendly to a degree, but with other products, such as those in the Cisco environment or SolarWinds, I found that I could fumble my way through the tools very easily without training. With NETSCOUT, I need training in order to set things up because I would never figure it out on my own.
What do I think about the stability of the solution?
The stability has been pretty good. I haven't had any issues with the hardware, for the most part. It's a little tricky to work with if you don't go through NETSCOUT for the packet flow switching. Right now, we use Gigamon for that, and we've had some issues with the older iterations we have. But as far as the hardware from NETSCOUT goes, we've had no issues.
What do I think about the scalability of the solution?
The scalability is huge because certain ISPs have hundreds of these things out there monitoring their deployments, versus our having a few. It's very scalable.
How are customer service and technical support?
Tech support started off poorly a few years ago, when we first implemented this, but I don't think we had the right resources on hand. In the last year, my company has worked directly with an OSC onsite, and the support has been much better.
Which solution did I use previously and why did I switch?
We've actually had NETSCOUT for a long time, but originally it was implemented as a security tool, pre- and post-firewall, to just monitor traffic that way, to see how effective it was.
Now that firewalls have improved, and we use Check Point for that, it's been transitioned to the network team - to where I am - and now we're just using it as an NPM-type solution. It didn't really come in as a replacement. It was more, "Here are some assets that we want to use for network performance," so we're learning how to use it and deploy it better.
I don't know how they came to the decision to use NETSCOUT five years ago, but we kept it because we've had an investment with them.
How was the initial setup?
The initial setup has been very complex. Just to understand our own environment, we definitely needed a dedicated resource, an OSC, to figure out where we needed to deploy these things, how much capacity we needed to build out, and what we needed to spend, comparing what we currently had with what we needed. It has definitely been complex.
What about the implementation team?
We've always gone straight through NETSCOUT in terms of the support and the hardware. We have never gone through a reseller.
What was our ROI?
We have seen some initial return on investment, on a small scale. We definitely hope to get more out of it once we implement it properly with the OSC. We're in the early stages.
Which other solutions did I evaluate?
We were looking at some of the Cisco stuff, and LiveAction, and SolarWinds, but NETSCOUT has its own little deep-dive, packet-triage part of the market that no one else, as far as I know, really touches. There is definitely still value there when considering the options.
What other advice do I have?
If you want deep-dive, triage, packet-capture-type data, rather than just using Wireshark, it's very effective for that. It's definitely good for complex troubleshooting. There are other solutions, going into the cloud with the thin clients, and the vSTREAMs and vSCOUTs are definitely good, as is the nGeniusPULSE - I really like the PULSE product. We're not currently using that.
I think nGenius is very useful. You have to know your own environment and see if it's good for you or not. My recommendation is mixed, to be honest. Whether I'd recommend it depends on what you're looking for, although I have actually recommended it to a colleague.
The solution can help us get to root cause more quickly, but not always. It is definitely a good stepping-stone, and when we have the visibility and the deployment properly implemented, it definitely can quickly get to a root cause.
We use the solution for proactive monitoring of remote sites to an extent. All of our sniffers, and everything that's TAP-ed, are in our central areas, and traffic from remote sites gets reported back there. As long as the traffic crosses one of those TAPs, it works. We're currently in the process of redefining and restructuring our build so that it gives us baselines and some proactive monitoring, but we're not there yet.
For responding to issues, it can help the network uptime, especially when it comes to capacity, but as far as actually helping the stability of the network, I don't think it's really done that.
nGeniusOne is a seven out of ten, but improving. Originally, about a year or two ago, it was like a four out of ten for us because we weren't using it properly. When it's implemented properly, and the training is there to use the interface and have it work in your company, and people understand it, it can be very effective. As we do more and get it properly implemented, I think that score can even go up.
Disclosure: I am a real user, and this review is based on my own experience and opinions.