We had issues for a number of years with our carrier. I had dropped our bonded T1, 3 Mbps, to go to 100 Mbps fibre, minimum, to every location. We have 26 branches. Because of the scale and the magnitude of the capacity, it's easy not to notice whether we're getting our subscribed bandwidth, and we're paying a lot of money annually for our MPLS network. My backbone and my data centers are now connected to 10 Gb lines, so my backbone is pretty much limitless. But when I figured out that players like AT&T, and others in the mix, can have an NNI (network-to-network interface) in the path, even though you think you have a 10 Gb connection, I wasn't sure what we were getting.
The scale of improvement from going to fibre, with this institution so used to having 3 Mbps, made everything look better. Except, I knew that when I needed to put hundreds of terabytes of data across data centers and other components, I would have very specific expectations, and my intuition was that it wasn't even close. I was pretty much correct. We had video that was having trouble, and that was the "tell." When we started to look deeper, we had massive amounts of packet loss at higher capacities and smaller packet sizes. I've got a year or two's worth of research at the highest levels. We needed a product such as Accedian that could knock it out, showing exactly where the issue is, in a matter of moments, and they were right. It's an extraordinary endeavor. I had an intuition that things needed correcting, and I was spot-on.
Accedian is a combination of SaaS and hardware devices. I have eight in my data centers and I have 26 branches, so the total we have is close to 85.
It has application and Layer 7 support, so that's going to be the next evolution. I had built an institution that I wanted to be number-one in Southern California, and I got bored and wanted it to be number-one in all of the US. When I talk to folks that used to work for, or who work for, Citigroup, they want to come work with me because we're doing things like NVMe capabilities of my sub-infrastructure that operates with 100 Gbps port speeds across state-of-the-art Aristas in a way that people can only dream of.
As we prepare for the next five to 10 years, everything is based on network performance, latency, and application support. We have it set up in a way where we test the capabilities of the telco, the main provider, the local exchange carrier, the firewalls, and the switches right into our virtualized infrastructure in every single one of our locations. We virtually eliminate any finger-pointing. We know that if anything goes into maintenance due to an outage, and the failover is supposed to occur, we can instantly test it.
We called out AT&T on a 1-Gbps NNI that was supposed to support 60 folks, which in and of itself was a problem, because we're supposed to have 700 Mbps, guaranteed. When they failed over to a 100 Mbps link, they completely saturated it and ran out of bandwidth. We were able to call them on all of this in real time. When we thought there should be 10 Gbps per 60 clients, we identified that they had improperly put all 60 through a 1-Gbps line.
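The arithmetic behind that complaint can be sketched in a few lines. This is a rough illustration, not something Skylight produces: the helper name `link_headroom` and the assumption that the other clients split the remaining capacity evenly are hypothetical. The figures (60 clients, a 1-Gbps NNI, a 700 Mbps guarantee, a 100 Mbps failover link) are the ones described above.

```python
def link_headroom(capacity_mbps, clients, our_commit_mbps):
    """Compare a shared link's capacity with one customer's guaranteed rate.

    Hypothetical helper: assumes the other clients split whatever is left
    evenly, which a real carrier would not necessarily do.
    """
    equal_share = capacity_mbps / clients
    left_for_others = max(capacity_mbps - our_commit_mbps, 0)
    return {
        # What each client would get if the link were divided evenly.
        "equal_share_mbps": round(equal_share, 1),
        # Whether the link can carry our guarantee at all.
        "commit_fits": our_commit_mbps <= capacity_mbps,
        # What remains per other client once our guarantee is honored.
        "left_per_other_client_mbps": round(left_for_others / (clients - 1), 1),
    }


primary = link_headroom(capacity_mbps=1000, clients=60, our_commit_mbps=700)
failover = link_headroom(capacity_mbps=100, clients=60, our_commit_mbps=700)
print("primary :", primary)
print("failover:", failover)
```

On the 1-Gbps link, honoring a 700 Mbps guarantee leaves only about 5 Mbps for each of the other 59 clients. On the 100 Mbps failover link, the guarantee cannot fit at all, which is consistent with the saturation we saw.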
The value is for networks on which hundreds and hundreds of thousands of dollars are being spent, where we have this level of intel. And we don't actually have to move around the network at a point in time, the way the world operated for the first 20 or 30 years of networking. We literally send traffic during the day and off-hours. We schedule these tests for repetitive performance, and the results are delivered in a digestible PDF report so that a non-technical person can see what passes and what fails.
It has saved hundreds and hundreds of thousands of dollars. You can pay that money, but there's no guarantee you're getting that capability.
We also can tell instantaneously whether or not we need to consider justifying more throughput or if we don't need what we have. There are probably several ways to look at what that is worth to an organization. However, if you have a system that's out in the cloud world, whether it's budgeting or mortgages, try to put a dollar figure on the capability to ensure the systems are working beautifully at all times.
In addition, Skylight has improved the interaction between my network, data center, and application teams, 100 percent. The Layer 7 side of it will be the icing on the cake. I will tell you the user experience before the user can tell you the user experience. We're fixing things for the applications team, and they don't know how or why. We're taking things that, in the mortgage world, historically had epic low rates in production, and doubling them. Just this week, I did something which changed a process that used to take between 30 seconds and a minute and a half and made it real-time, instantaneous. It's extraordinary.
Skylight brings the firewall and security teams to be very in-tune with the network teams, the telco teams, and the carriers that do the local. Any one of those could be finger-pointing at the other. We pinpoint exactly what the issue is, we tell them where it is, and it's pretty irrefutable. There is a time savings, and there is collaboration.
You start buzzing away with NetFlow out of SolarWinds, and they'll tell you 17 reasons why it's something else. It's like playing a good game of chess. You think about every move two years in advance and, when it comes down to it, you end up with $50,000 worth of free gear, including, but not limited to, removing traffic shaping and policies that you know are strangling the network. I liken it to having the very best medicine for bronchitis or asthma or any type of breathing condition. This would be the medicine you would want. It opens up the complete "breathing" of the network, right down to every component, whether you're streaming video or doing other things.
You can simulate VoIP issues. As we move the organization to enterprise SIP, my biggest concern in my data centers is that if we ever had one-direction audio, we would never be able to figure it out. It would just take an extraordinary amount of time. Now we can simulate it, understand it, and resolve it in really short order. In fact, I have an engineer who came from LA Fitness, with 720 locations, and he said it took him over a year and a half to figure out a problem, and he could have done it in under a month with this software.
When we moved small packets over certain routers, they would absolutely fail, and the world would tell me, "Oh, well, that's the way it is," and I said, "That's not possible." I proved that with higher-caliber gear. We were able to move packets flawlessly down to 0.1 and 0 percent packet loss, which is unheard of, and maintained it. I proved it and showed the result. And then I did that again to ensure that the memory and the processing capabilities of the next-gen firewalls that are in every single one of my locations are capable. And that was right through my primary firewall, to make sure they're capable of sending and adhering to certain traffic loads.
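To make the packet-size comparison concrete, here is a minimal sketch of a loss check per frame size. It is not Accedian's API or report format: the function names are hypothetical, the counters are made up for illustration, and the 0.1 percent threshold is simply borrowed from the figure quoted above.

```python
def loss_percent(sent, received):
    """Packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def evaluate(results, max_loss_percent=0.1):
    """Return {frame_size: (loss %, 'pass' or 'fail')} for each test run."""
    report = {}
    for frame_size, (sent, received) in sorted(results.items()):
        loss = loss_percent(sent, received)
        verdict = "pass" if loss <= max_loss_percent else "fail"
        report[frame_size] = (round(loss, 3), verdict)
    return report


# Made-up counters for illustration only; not measurements from this network.
sample = {
    64: (1_000_000, 962_000),       # small frames, heavy loss
    512: (1_000_000, 999_200),
    1518: (1_000_000, 1_000_000),   # full-size frames, no loss
}
for size, (loss, verdict) in evaluate(sample).items():
    print(f"{size:>5} bytes: {loss:.3f}% loss -> {verdict}")
```

Running the same check per frame size is what exposes gear that handles large frames cleanly but drops small ones.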
When I joined the company four-and-a-half years ago, I changed everything on a massive scale, to the point where if a vendor calls up and says something may not be running as expected, I politely explain to them that it's not even possible and that they need to take a look at their coding. I took apart every single piece of the puzzle, including four brand-new core switches that run the entire institution, where before we had three that were 10 years out of date; 120 switches, all Cisco, all brand-new. There was no stone unturned. When we run Accedian, if something isn't right, I can tell you exactly what it is and why.
In terms of reducing mean time to respond, it's done so by years. I don't even know how to estimate it. There are things that could never have been solved. When they tell you that they have put Quality of Service on your network, it could take you a lifetime to figure out and identify that the packet traffic and the tagging are converted because you have poor policy that converts everything into a particular tag. No one will believe what I'm saying in terms of saving "years to a lifetime," but you just have to see what this stuff can do. When used properly, there's no telling what you can do. I can say "from a year-and-a-half down to a month" and that's a reasonable metric.
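The Quality of Service problem described above, where a policy rewrites every tag into one value, can be illustrated with a small check. This is a sketch under stated assumptions, not how Skylight does it: it assumes you already have pairs of (DSCP sent, DSCP received) from test traffic, and the function names and sample values are hypothetical.

```python
from collections import Counter


def find_remarking(observations):
    """Count (sent, received) DSCP pairs where the marking changed in transit."""
    return Counter((sent, recv) for sent, recv in observations if sent != recv)


def collapsed_to_single_tag(observations):
    """True when several distinct sent markings all arrive as one value."""
    sent_values = {sent for sent, _ in observations}
    received_values = {recv for _, recv in observations}
    return len(sent_values) > 1 and len(received_values) == 1


# Illustrative values: EF (46) for voice, AF41 (34) for video, 0 for best effort.
observed = [(46, 0), (34, 0), (0, 0), (46, 0)]
print(find_remarking(observed))
print(collapsed_to_single_tag(observed))
```

If voice and video markings both arrive as best effort, the QoS policy you were told about is not doing what you are paying for.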