What is our primary use case?
We have everything in Kubernetes. We're basically moving everything from the cloud into Kubernetes - inverting the cloud. We have all of that built for the CI/CD pipeline and have our tools within the cluster.
This is to support application development. The application side is always within the cluster. We have a security cluster, so everything is there. We have a database within the cluster as well; we don't need a managed cloud database, since we run the database inside Kubernetes. Everything goes into the cluster.
It makes it easy for us to be consistent across different environments, including development environments or Oracle environments, as everything runs within the cluster.
What is most valuable?
The solution allows you to work on and from multiple clouds. You can use Google's cloud, or mix and match clouds across suppliers.
You can split into regions within your own cloud.
The deployment of the cluster is very easy. You just click a button and it's deployed, or run a simple command and it deploys itself. You don't have to go through the steps of installing the cluster yourself; it's already deployed and managed.
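As a rough sketch of how simple that command can be, here is what programmatic cluster creation looks like with Google's google-cloud-container Python client. The project, location, and cluster name are hypothetical placeholders:

```python
# A minimal sketch of programmatic GKE cluster creation using the
# google-cloud-container client library. The project ID, location, and
# cluster name are hypothetical placeholders.
from google.cloud import container_v1

client = container_v1.ClusterManagerClient()

cluster = container_v1.Cluster(
    name="demo-cluster",     # hypothetical cluster name
    initial_node_count=3,    # three worker nodes to start
)

# Google provisions and manages the control plane for you.
operation = client.create_cluster(
    parent="projects/my-project/locations/us-central1-a",  # hypothetical
    cluster=cluster,
)
print(operation.status)
```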
The master of the cluster is also managed by Google. If there are any updates, they are responsible for handling them, which takes a bit of the load off our plate. You don't have to manage the master or the version of the cluster yourself.
You don't have to think about the installation process. They take care of the underlying infrastructure deployment and manage the versioning of the cluster. When we need to update, it's simple; they help us update the cluster nodes smoothly. You don't have to deal with that either.
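Continuing the sketch above, a managed control-plane upgrade is a single call; the cluster path and target version here are hypothetical:

```python
# A sketch of triggering a managed control-plane upgrade; Google then
# handles the rollout. The cluster path and version are hypothetical.
from google.cloud import container_v1

client = container_v1.ClusterManagerClient()

client.update_master(
    name="projects/my-project/locations/us-central1-a/clusters/demo-cluster",
    master_version="1.20",
)
```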
When it comes to Google Cloud, the Kubernetes advantage for machine learning is that they have the TPU, a tensor processing unit, which is much faster than a GPU. If the clients are willing to pay for it, we'll run the machine learning jobs within the Kubernetes cluster and connect to Google TPUs, which gives us the ability to finish the job much, much faster.
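For illustration, here is a sketch of how such a job's pod can request an accelerator through the official kubernetes Python client. The image, namespace, and resource count are hypothetical; GKE exposes accelerators as extended resources such as nvidia.com/gpu:

```python
# A sketch of a training pod that requests an accelerator on GKE.
# The image and namespace are hypothetical; GKE exposes accelerators
# as extended resources (e.g. "nvidia.com/gpu").
from kubernetes import client, config

config.load_kube_config()  # reads your local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="ml-train"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="gcr.io/my-project/trainer:latest",  # hypothetical
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one GPU for the job
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```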
What needs improvement?
It's maybe a controversial topic, as Kubernetes itself is meant to be just your bottom layer. However, you expect the engine to do more over time. Since we're putting so much into the cluster, it would be nice if some of this stuff was already done, baked into the cluster.
Our critique is that we have to do too much work to get the cluster production-ready. Most people just start it and think that's production. That's not really production; that's just bootstrapping the cluster with all the tools that you need.
A lot of people rely on cloud tools, or a cloud-built system, to get going. We would like to have that baked into the cluster. Due to our usage pattern and how heavily we use the cluster, our expectation is to have more tools baked in. There should be more emphasis on tools developed directly for the cluster to support application development, versus relying on third-party vendors like Jenkins.
The third-party vendors have to adapt to Kubernetes, and that creates a problem, as there's always a delay. Third parties don't have much incentive to do anything right away, which means we have to wait for them to catch up. We don't have a big enough team to change every piece of open-source code ourselves, as there's so much of it.
For how long have I used the solution?
We started using the solution in 2015, around the time Google launched it. Before that, we used Kubernetes, however, we were deploying it ourselves.
What do I think about the stability of the solution?
The solution is stable.
The solution is cloud-native, and every cloud is using basically the same version. That's what makes it easy for us to move between clouds. However, Google wants users to integrate with its own cloud storage and security, which is where issues can arise.
It allows you to create private clusters. No cloud has a competitive advantage at the cluster level right now, which is good for us, as the uniform ecosystem allows us to move between clouds easily.
What do I think about the scalability of the solution?
Kubernetes is designed to scale both horizontally and vertically. It scales quite well.
We have unlimited scaling through horizontal scaling. We can add more Kubernetes nodes. When applications grow, we may need to horizontally scale the applications or our databases. We just kick off another node with however much memory and CPU we need and keep on scaling. Obviously, you pay for it; however, scaling is extremely easy.
A lot of the time, we automate the scaling as well. Based partly on AI and cloud automation processing, we detect the CPU or GPU usage, and if we exceed a certain threshold, the cluster automatically adds another node, so it's self-managing. Scaling is almost fully automated for our use cases.
The system knows by itself how to scale dynamically; dynamic, elastic scaling is baked into our systems as well. When people use, for example, the special machine-learning cluster, it goes automatically from zero to 23 nodes depending on the users. A lot of the time, we shut it down when there's no usage. When people kick off a job, it automatically spins up a new cluster node and deploys the job; then it gets another job and spins up another node. It grows dynamically. We just have to allocate the node pools for the cluster in Google Cloud, and it works well.
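As a sketch of the pod-level half of that behavior: a HorizontalPodAutoscaler watches CPU usage and adds replicas past a threshold, and GKE's cluster autoscaler then adds nodes when the new pods don't fit. The deployment name and thresholds here are hypothetical:

```python
# A sketch of CPU-threshold autoscaling via a HorizontalPodAutoscaler,
# created with the official kubernetes Python client. The target
# deployment name and thresholds are hypothetical.
from kubernetes import client, config

config.load_kube_config()

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="worker-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="worker"
        ),
        min_replicas=1,
        max_replicas=23,                       # cap matching the node pool size
        target_cpu_utilization_percentage=80,  # scale out past 80% CPU
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```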
How are customer service and technical support?
We are basically able to solve our own problems on our end. We don't need the assistance of technical support. We might have used it once in the past six or seven years, and that is it.
How was the initial setup?
The initial setup is very straightforward. Everything is basically done for you. It's nice and easy.
We handle the cluster management ourselves. There are cluster administrators who take on a more serious role; they are responsible for the disks backing some of the applications, the databases, the deployment of the infrastructure tools, et cetera.
What about the implementation team?
We have application developers who work with the administrators to actually deploy the applications, since you still have to add business logic to make the cluster meaningful. They work with our administrators to get that done. We also support a lot of different tools, including monitoring tools for visibility. That's important to us because we follow a microservice architecture across applications; the services are like little black boxes, and we have to be able to see inside them.
What's my experience with pricing, setup cost, and licensing?
Multi-cloud is a somewhat expensive endeavor, as the tools are overpriced. We're looking at options that aren't based on Anthos, which is Google's multi-cloud solution.
While you pay money to Google, they also take a piece of the action.
CPU is very cheap; however, GPU is very expensive. If you want to iterate on data science tasks within a Kubernetes cluster, it will cost you.
There is no licensing cost. You pay for the cloud, and you pay for what you use based on the CPU and RAM usage of the VMs, the virtual machines. The cluster is still made up of computers, so you pay for the computers backing the cluster. If you kick off a Kubernetes cluster that has three nodes, you have to pay for each of those nodes - the computers, the virtual machines that you get bootstrapped with. You just use the machine time, as with any cloud, and Google gives you a price for the machine type; your machine type is defined based on your CPU and RAM usage. If you want to have 60 GB of RAM, you pay for that RAM, or for the CPUs.
The same thing is true if you ask for a GPU computer, as most of the virtual machines don't come with a video card unless you ask for it. Then you have to pay for both the computer and the video card.
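As a back-of-the-envelope sketch of that pricing model (the hourly rates below are made-up placeholders, not Google's actual prices):

```python
# Back-of-the-envelope node cost estimate. The hourly rates are
# hypothetical placeholders, not actual Google Cloud prices.
VCPU_PER_HOUR = 0.03      # hypothetical $/vCPU-hour
RAM_GB_PER_HOUR = 0.004   # hypothetical $/GB-hour

def monthly_node_cost(vcpus: int, ram_gb: int, hours: int = 730) -> float:
    """Cost of one VM backing a cluster node for a month."""
    return hours * (vcpus * VCPU_PER_HOUR + ram_gb * RAM_GB_PER_HOUR)

# A three-node cluster of 4-vCPU / 16 GB machines:
print(round(3 * monthly_node_cost(4, 16), 2))
```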
What other advice do I have?
We're actually a certified Kubernetes vendor as well.
We're using version 1.19. The most up-to-date is 1.20. We're never on the latest version; we're always a version behind, or even two versions behind, to give them time to sort through their issues. We're using 1.19 on Azure, Google Cloud, and EKS, although EKS might be two versions behind.
Most of the time, we're deploying as a private cluster within the cloud, isolated from public infrastructure. That's for security reasons; we don't want our cluster to be exposed to the public internet.
We also have a hybrid deployment with Azure and on-premises. This is just to make things easier for integration purposes. The on-premises side is connected to the cloud, and we can use the same Kube-native tools. We develop the same tools for Kubernetes, and then we can deploy them on-premises or in the cloud; it doesn't matter.
We are also doing multi-cloud, deploying from Google Cloud into AWS.
With Azure, we have one giant cloud right now. That way, we can partition a cluster and span multiple clouds and multiple regions. If Google Cloud goes down for whatever reason - as happened two years ago, due to bad configurations and too many clusters in a cloud - we're covered. We do multi-cloud because the solution is critical and we can't afford to have it go down.
We are basically a full-service company. We do everything for our clients - including application development and everything that entails.
I'd advise users to take security seriously. Don't just deploy things on the internet; make sure your cluster is secure. You want to be able to tell your clients that you have a secure implementation of a cluster. That requires a little bit of setup with every cloud to create a private network and private subnetwork, and to manage the ingress and egress - the inputs and outputs of the requests coming into your cluster.
These are things you have to think about when you deploy, right before you get started. All the clouds support it; you just have to know how to set up your VPC, the virtual private cloud, with every cloud, and how to isolate your cluster to specific subnets so it's not exposed on the internet. It stays private, and any requests coming from the internet have to go through your load balancer to reach your cluster.
However, even if you manage that, you also have to manage the requests going out of your cluster, which run through NAT and a cloud router. That way, you know what's coming in and what's going out, and you can monitor the traffic coming into and moving within your cluster.
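As a sketch of that setup, a private cluster can be requested at creation time through the google-cloud-container client. The VPC, subnet, and CIDR values below are hypothetical placeholders:

```python
# A sketch of requesting a private GKE cluster, so nodes get no public
# IPs and inbound traffic has to come through a load balancer. Names
# and CIDR ranges are hypothetical.
from google.cloud import container_v1

client = container_v1.ClusterManagerClient()

cluster = container_v1.Cluster(
    name="private-cluster",   # hypothetical
    initial_node_count=3,
    network="projects/my-project/global/networks/my-vpc",  # hypothetical VPC
    subnetwork="projects/my-project/regions/us-central1/subnetworks/private-subnet",
    private_cluster_config=container_v1.PrivateClusterConfig(
        enable_private_nodes=True,       # nodes have no public IPs
        enable_private_endpoint=False,   # master still reachable, access-controlled
        master_ipv4_cidr_block="172.16.0.0/28",  # hypothetical
    ),
    ip_allocation_policy=container_v1.IPAllocationPolicy(
        use_ip_aliases=True,  # VPC-native networking, required for private clusters
    ),
)

client.create_cluster(
    parent="projects/my-project/locations/us-central1",  # hypothetical
    cluster=cluster,
)
```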
These days, a lot of people use containers directly from third-party sources or public repositories - the Docker containers that the Kubernetes cluster runs - and those could come with malware. You want security policies implemented for every cluster. You don't get that from the cluster out of the box; you have to get it from third-party vendors. This is where the competition around Kubernetes comes in.
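As a toy sketch of the kind of policy those vendors enforce - an image-registry allowlist that rejects unvetted public images (the allowed registries are hypothetical):

```python
# A toy sketch of an image-registry allowlist, the kind of rule a
# third-party admission controller would enforce before a pod runs.
# The allowed registries are hypothetical.
ALLOWED_REGISTRIES = ("gcr.io/my-project/", "us-docker.pkg.dev/my-project/")

def image_allowed(image: str) -> bool:
    """Reject containers pulled from unvetted public registries."""
    return image.startswith(ALLOWED_REGISTRIES)

assert image_allowed("gcr.io/my-project/api:1.0")
assert not image_allowed("docker.io/random/unvetted:latest")
```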
In general, I would rate the solution at an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Google
Disclosure: I am a real user, and this review is based on my own experience and opinions.