Director of Infrastructure and DevOps at Aigent
Offers a stand-alone, user-friendly process with great features
Pros and Cons
- "Prometheus gives us high availability automatically."
- "Lacks the ability to clusterize."
What is our primary use case?
We use Prometheus for monitoring all aspects of our infrastructure end-to-end. That includes servers, virtual machines, databases, caching servers, ELK stack, and our Kubernetes servers. We are users of Prometheus.
What is most valuable?
The scraping mechanism is a wonderful feature. I've used many other monitoring systems that were mostly based on a client-server model, including Nagios, Zabbix, and New Relic, among others. With all of them, the server would get overloaded when the clients sent too many metrics, whether the architecture used a pull or a push mechanism. In my previous organization, we had to host several individual Nagios servers in case one went down. Prometheus gives us high availability automatically as a stand-alone process; if it doesn't run on one server, it runs on another. It's wonderful that the exporters run on different servers. They collect the metrics and expose them at a particular URL for Prometheus to read and then store.
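To make the pull model described above concrete, here is a minimal sketch of an exporter written with only the Python standard library. It renders a few gauges in the Prometheus text exposition format and serves them at `/metrics`. The metric names, values, and port are hypothetical, and a real deployment would more likely use an existing exporter or an official client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render gauge samples in the Prometheus text exposition format.

    `samples` maps a metric name to a (help text, value) pair.
    """
    lines = []
    for name, (help_text, value) in sorted(samples.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical values; a real exporter would read them from the host.
SAMPLES = {
    "demo_cpu_usage_ratio": ("Fraction of CPU in use.", 0.42),
    "demo_memory_used_bytes": ("Memory in use, in bytes.", 123456789),
}


class MetricsHandler(BaseHTTPRequestHandler):
    """Answer GET /metrics with the current samples."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port):
    """Block and serve metrics on the given port until interrupted."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve(8000)` on a host would let a Prometheus server scrape `http://<host>:8000/metrics` on its own schedule, which is why the monitored machines never push data and cannot overload the server the way the reviewer describes.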
I very much like the remote write feature. Prometheus bridges the gap for everyone, whether they come from an older monitoring setup or work with microservices. I also like the concept of dynamic configuration, which is brilliant.
We chose Prometheus because it's open source with a lot of documentation and community support which is lacking in other products.
What needs improvement?
The Prometheus community says it's not meant to be clusterized so people shift to solutions like Thanos and VictoriaMetrics. Prometheus could have done that too, it's not complicated. Rather than us having to use a different database, Prometheus could develop its own database a little more so that it becomes a one-stop solution. That would be wonderful.
One of the issues is that dynamic configuration uses regular expressions, which can be confusing for people who are not familiar with them or with their special symbols and delimiter characters.
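The regular expressions mentioned above most likely refer to relabeling rules, although the review does not say so explicitly. As a hedged illustration, the sketch below imitates a `keep` rule: the values of the source labels are joined with a separator (`;` by default in Prometheus) and the whole string must match the regular expression. The target list and the helper name `keep_target` are hypothetical, and Python's `re` module is only an approximation of the RE2 syntax that Prometheus uses.

```python
import re


def keep_target(labels, source_labels, regex, separator=";"):
    """Return True if a 'keep' rule would retain this target.

    The values of `source_labels` are joined with `separator`, and the
    result must match `regex` in full (the match is anchored at both ends).
    Labels that are missing count as empty strings.
    """
    joined = separator.join(labels.get(name, "") for name in source_labels)
    return re.fullmatch(regex, joined) is not None


# Hypothetical discovered targets.
targets = [
    {"__address__": "db-1.example.internal:9100", "env": "prod"},
    {"__address__": "db-2.example.internal:9100", "env": "staging"},
    {"__address__": "cache-1.example.internal:9100", "env": "prod"},
]

# Keep only production database hosts.
kept = [
    t["__address__"]
    for t in targets
    if keep_target(t, ["env", "__address__"], r"prod;db-.*")
]
# kept == ["db-1.example.internal:9100"]
```

The anchoring is a common source of the confusion the reviewer mentions: a pattern such as `prod` does not match the value `production`, because the expression has to cover the entire joined string.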
For how long have I used the solution?
I've been using this solution for six years.
Buyer's Guide
Prometheus
November 2024
Learn what your peers think about Prometheus. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
816,192 professionals have used our research since 2012.
What do I think about the stability of the solution?
The product is stable. In my previous company, we used Prometheus with Docker Compose and kept high data retention. We didn't have a separate database to store the metrics, so the container went down very often. The issue was not with Prometheus itself but with insufficient resources for the monitoring system.
40% of those in our company need Prometheus for their daily work, including developers. Indirectly, that number goes to 80% when you include those reliant on the reporting. We do everything through Prometheus. We also use Nagios for monitoring bare-metal servers, and in a way Nagios monitors Prometheus and Prometheus monitors Nagios. In that way, we're able to monitor two different monitoring systems.
What do I think about the scalability of the solution?
The solution is not meant to scale.
How are customer service and support?
There is good documentation and the open source community offers good support, so I haven't needed to contact customer support.
How was the initial setup?
Prometheus is very easy to set up and is user-friendly. It just runs and gets you a very simple UI running on export. Where it becomes complicated is the configuration: because it supports so many exporters, a lot of jobs need to be written in order to use it well, and people do not always understand that.
Deployment time depends on the use case. Our present use case is quite complex so implementation took about a week. We wanted to monitor 10 to 15 clusters so we had to deploy Prometheus on a different environment and ensure that our data was placed at a stable central location. We initially carried out a POC which took less than a day.
What's my experience with pricing, setup cost, and licensing?
The solution is open source.
What other advice do I have?
If a company uses bare-metal systems and its product doesn't need extensive monitoring, I wouldn't recommend Prometheus. That's not because it's difficult to use but because it doesn't suit that use case. If you already have microservices deployed and want information about them, then that's a suitable use case. Otherwise, Nagios or any other simple monitoring system is easier to use. It depends on what kind of product needs monitoring.
In spite of the scaling issue, Prometheus provides almost everything I need. Despite needing to integrate it with other tools, it's seamless and simple to use. I rate this solution 10 out of 10.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
DevOps Engineer at Suse
Used to monitor the health of multiple servers, including their uptime and CPU and memory usage
Pros and Cons
- "The most valuable feature of Prometheus is the ease of pulling the metrics."
- "The solution's error handling part could be improved."
What is our primary use case?
We use Prometheus to monitor the health of 300 to 400 servers, including their uptime, CPU and memory usage, and iowait.
What is most valuable?
The most valuable feature of Prometheus is the ease of pulling the metrics.
What needs improvement?
The solution's error handling part could be improved. The errors shown are sometimes not precise enough to debug from. The code base could be improved so that debugging is easier when Prometheus is not working. Error messages should make it easy for the user to understand where the issue lies.
For how long have I used the solution?
I have been using Prometheus for one year.
What do I think about the stability of the solution?
We had issues with the solution's stability regarding SSL certificates around a year back, but now it's stable.
I rate the solution an eight out of ten for stability.
What do I think about the scalability of the solution?
Whenever a new region comes up on any of the clouds, like Azure, Google, or AWS, we need to add Prometheus servers over there, and the metrics need to be pulled from those new servers. Every month, we increase the number of servers where Prometheus is used. We did not find any challenges, and we are able to view the metrics.
I rate the solution a nine out of ten for scalability.
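The review does not say how new servers are registered as scrape targets. One option, assumed here purely for illustration, is file-based service discovery, where Prometheus reads a JSON file listing target groups. The sketch below generates such a file from a hypothetical region-to-hosts mapping; port 9100 is the usual node_exporter port and is also an assumption.

```python
import json


def build_file_sd(servers_by_region, port=9100):
    """Build file-based service discovery groups, one per region.

    Each group lists "host:port" targets and attaches a `region` label.
    """
    groups = []
    for region in sorted(servers_by_region):
        groups.append({
            "targets": [f"{host}:{port}" for host in servers_by_region[region]],
            "labels": {"region": region},
        })
    return groups


def to_json(groups):
    """Serialize the groups as they would be written to a targets file."""
    return json.dumps(groups, indent=2)


# Hypothetical inventory; in practice this might come from Ansible or Salt.
inventory = {
    "us-east-1": ["node-c.example.internal"],
    "eu-west-1": ["node-a.example.internal", "node-b.example.internal"],
}
groups = build_file_sd(inventory)
```

Regenerating the file when a region is added lets Prometheus pick up the new servers without a restart, since it re-reads file-based targets when they change.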
Which solution did I use previously and why did I switch?
I joined my current company around one and a half years ago. Since that time, we have been using Prometheus. In my previous organization, we used a separate stack for monitoring solutions, including an open-source solution called Telegraf.
How was the initial setup?
The initial setup of the solution is easy because it has pretty good documentation.
What about the implementation team?
We were able to deploy Prometheus, Grafana, and Loki, and we were able to pull the metrics from more than 300 servers in one week.
What's my experience with pricing, setup cost, and licensing?
Prometheus is an open-source solution.
What other advice do I have?
I would recommend Prometheus to other users. If you have to monitor the health of multiple servers, just write an Ansible playbook or a Salt state, and deploying Prometheus to any number of services is really easy. We are also using SSL certificates with the solution. With Prometheus, you will not face any security issues. It's a great monitoring solution for pulling metrics.
Overall, I rate the solution a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Mar 18, 2024
Head of Operations Engineer at RayanHamAfza
A free and easy-to-deploy solution that enables users to monitor application metrics with ease
Pros and Cons
- "The solution is useful for collecting large volumes of metrics."
- "The scalability must be improved."
What is our primary use case?
The solution is used for monitoring application metrics. The applications have multiple functions, and each function should be monitored. I contact the developer teams to add the solution to the applications. I also use Alertmanager. If a required metric for a specific test falls below the expected rate, Alertmanager fires a notification to the users.
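The threshold alert described above can be sketched as follows. In Prometheus, an alerting rule is typically given a hold duration, so the alert is first pending and only fires once the condition has held for that long; Alertmanager then routes the notification. The function below is a simplified model of that behaviour, not Prometheus's implementation, and the sample data and parameter names are hypothetical.

```python
def alert_state(samples, threshold, hold_seconds):
    """Return 'inactive', 'pending', or 'firing' as of the latest sample.

    `samples` is a list of (timestamp_seconds, value) pairs in time order.
    The alert condition is value < threshold. The alert fires only once
    the condition has held continuously for at least `hold_seconds`.
    """
    breach_start = None
    for ts, value in samples:
        if value < threshold:
            if breach_start is None:
                breach_start = ts  # condition just became true
        else:
            breach_start = None  # condition cleared, start over
    if breach_start is None:
        return "inactive"
    last_ts = samples[-1][0]
    return "firing" if last_ts - breach_start >= hold_seconds else "pending"


# Hypothetical request rate sampled once a minute; expected rate is 5.
rate_samples = [(0, 10), (60, 4), (120, 3), (180, 2)]
state = alert_state(rate_samples, threshold=5, hold_seconds=120)
# state == "firing"
```

The hold duration is the design choice that keeps a single low reading from paging anyone: with `hold_seconds=300` the same data would still be `pending`.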
What is most valuable?
The solution is useful for collecting large volumes of metrics. We have multiple applications. Collecting metrics from all the environments in my data center using the solution is very simple. The dashboards are very useful.
What needs improvement?
The product must allow users to run multiple instances to get metrics. Thanos is a bit complicated to deploy. It would be better if the solution provided a tool similar to Thanos natively.
For how long have I used the solution?
I have been using the solution for two to three years.
What do I think about the stability of the solution?
The product is stable, but sometimes, when we run heavy queries like line data for the previous 24 hours, the CPU and memory utilization are very high. So, we use the product for shorter queries, like line data for the previous 30 minutes or an hour.
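A rough back-of-the-envelope estimate shows why the 24-hour query above is so much heavier than the 30-minute one. The sketch assumes a 15-second scrape interval and 1,000 matching series; neither figure comes from the review, and real query cost also depends on the query step, functions used, and storage layout.

```python
def samples_scanned(series_count, range_seconds, scrape_interval_seconds=15):
    """Estimate how many raw samples a range query has to read."""
    return series_count * (range_seconds // scrape_interval_seconds)


SERIES = 1_000  # hypothetical number of series matched by the query

day = samples_scanned(SERIES, 24 * 60 * 60)  # 5,760,000 samples
half_hour = samples_scanned(SERIES, 30 * 60)  # 120,000 samples
ratio = day // half_hour  # 48 times more data for the 24-hour window
```

Under these assumptions the longer window reads 48 times as many samples, which is consistent with the CPU and memory pressure the reviewer reports.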
What do I think about the scalability of the solution?
The scalability must be improved. More than 20 people use the tool indirectly.
Which solution did I use previously and why did I switch?
My colleagues use Grafana. Grafana is integrated with Prometheus in our organization.
How was the initial setup?
The initial setup is very easy. It took us an hour to deploy the tool.
What about the implementation team?
It is not difficult to maintain the solution because I run it on Kubernetes.
Which other solutions did I evaluate?
The tool is open-sourced.
What other advice do I have?
Prometheus is a famous solution. It is a useful product for monitoring application metrics. It needs a developer team to monitor the application functions completely. There are different dashboards like SLO and SLI dashboards. It also has an alert manager tool. The tool is important for modern applications because modern applications employ microservices. The product helps monitor all the functions, but it lacks scalability. Overall, I rate the solution an eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Software Engineer at a retailer with 10,001+ employees
Plenty of functions, highly scalable, and helps understand application behavior
Pros and Cons
- "The most valuable features of Prometheus are the many functions available. The functions are helpful for understanding the behavior of applications and infrastructure."
- "A slight alteration to the user interface should be made to increase efficiency and streamline the process. Currently, we are utilizing Prometheus to gather and compile metrics and then utilizing Grafana to display them in the form of a graph. However, I believe that Prometheus has the capability to handle both of these tasks on its own, with perhaps the addition of a supplementary plugin. By doing so, the need for utilizing two separate applications will be eliminated."
What is our primary use case?
We create the metrics using Prometheus so that we can visualize them in Grafana. Prometheus scrapes the metrics from all our applications and infrastructure and serves them to Grafana, which plots them on graphs. In effect, we use Prometheus as middleware for collecting all our metrics.
How has it helped my organization?
We use Prometheus with Grafana and without either of them, it would not fit our use case.
What is most valuable?
The most valuable features of Prometheus are the many functions available. The functions are helpful for understanding the behavior of applications and infrastructure.
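The review does not name the functions it has in mind. Assuming it refers to query functions such as `rate()`, the sketch below shows a simplified version of what such a function computes: the per-second increase of a counter over a window, with a drop in value treated as a counter reset. The real `rate()` also extrapolates to the window boundaries, which this sketch deliberately leaves out.

```python
def simple_rate(samples):
    """Per-second increase of a counter across the given samples.

    `samples` is a list of (timestamp_seconds, value) pairs in time order.
    A decrease between two samples is treated as a counter reset, so the
    new value is counted as the increase since the restart.
    Returns None when fewer than two samples or no elapsed time.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            increase += value  # reset: counter restarted from zero
        else:
            increase += value - prev
        prev = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# A request counter that grows by 60 every 30 seconds.
steady = simple_rate([(0, 0), (30, 60), (60, 120)])
# steady == 2.0 requests per second
```

Seeing the raw counter turned into a per-second rate is usually the step that makes application behaviour readable on a graph, which matches the reviewer's point about the functions.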
What needs improvement?
A slight alteration to the user interface should be made to increase efficiency and streamline the process. Currently, we are utilizing Prometheus to gather and compile metrics and then utilizing Grafana to display them in the form of a graph. However, I believe that Prometheus has the capability to handle both of these tasks on its own, with perhaps the addition of a supplementary plugin. By doing so, the need for utilizing two separate applications will be eliminated.
For how long have I used the solution?
I have been using Prometheus for a couple of years.
What do I think about the stability of the solution?
We have experienced some performance issues, but I cannot say for certain whether they are due to Prometheus or to our infrastructure. During these instances, we have noticed that metrics collection can be disrupted for a period of time due to high CPU utilization or other factors. As a result, we are unable to retrieve metrics during these instances. This has been a challenge for me while using Prometheus. It is unclear whether there is a solution, such as resource optimization, that could alleviate these issues. However, it is to be expected that resource utilization rises as the number of metrics we scrape increases.
I rate Prometheus an eight out of ten.
What do I think about the scalability of the solution?
I rate the scalability of Prometheus a ten out of ten.
How was the initial setup?
The initial setup should be relatively uncomplicated. I haven't personally dealt with the implementation aspect but based on my previous experiences, we utilized Prometheus and Grafana together in our Kubernetes cluster. Thus, we were able to deploy it with ease, all within one unified operation.
What about the implementation team?
We did the implementation of the solution in-house.
What was our ROI?
We have received a return on investment using Prometheus.
What's my experience with pricing, setup cost, and licensing?
The solution is open source.
What other advice do I have?
Prometheus, similar to Grafana, is incredibly useful and effective, particularly when you become more familiar with its functions and capabilities. By developing a stronger understanding of the functions utilized by Prometheus, it becomes much easier to extract relevant metrics and create visually appealing graphs. Thus, it is recommended that users invest time in learning and mastering these functions for optimal results.
I rate Prometheus an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
DevOps Engineer at Bipolar Factory
Offers great alerting features along with an open-source version
Pros and Cons
- "Stability-wise, I rate the solution a ten out of ten."
- "The UI and GUI are areas of concern in the product."
What is our primary use case?
For a project I am currently working on in my company, I use Grafana, but for the data source part, I use Prometheus.
I use Prometheus in a Kubernetes cluster. I use Prometheus for the processes attached to Amazon EKS.
What needs improvement?
The UI and GUI are areas of concern in the product. The UI should be friendlier, more graphical features should be added, and the interface should look better.
For how long have I used the solution?
I have been using Prometheus for a year. I am a customer of the tool.
What do I think about the stability of the solution?
Stability-wise, I rate the solution a ten out of ten.
What do I think about the scalability of the solution?
I haven't faced any scalability issues in the product. Scalability-wise, I rate the solution a ten out of ten.
Around five to ten DevOps engineers use the product. Other employees in the company don't use the solution because it is a product that is used to monitor other clusters. The DevOps team uses the product to monitor clusters.
Which solution did I use previously and why did I switch?
I use Prometheus and Grafana.
I used to use Amazon CloudWatch as an in-built proprietary monitoring and alerting tool in my company. Amazon CloudWatch's use had a certain cost attached to it, but Prometheus is an open-source tool.
How was the initial setup?
The product's initial setup phase was easy.
The solution is deployed on the cloud.
The solution can be deployed in three to five minutes.
What's my experience with pricing, setup cost, and licensing?
Prometheus is an open-source tool.
What other advice do I have?
As for how I have implemented the product in my company's infrastructure, I run a simple Prometheus setup in my Amazon EKS cluster, installed with a Helm chart. Helm is a package manager for Kubernetes-based applications. I installed Prometheus and Grafana using Helm charts, which makes installation very easy.
In terms of the benefits of the tool attached to the performance monitoring part, I use Prometheus as a data source. Prometheus helps me to pull my Kubernetes cluster data. Basically, Prometheus helps me as it serves as a one-stop destination for my Kubernetes cluster data. I export data from Prometheus to Grafana, which helps me with the visualization part. In general, Prometheus helps me visualize my data and gives me a structure for my clusters.
I use Grafana for the alerting feature that helps maintain system reliability. I add Prometheus as the data source, and from that data I create email alerts. Prometheus is my first point of contact for Grafana's dashboards and for every alerting feature as well. Prometheus is very useful to me when I consider its alerting features.
I have only used Prometheus to monitor my Amazon EC2 servers and Amazon EKS clusters. I don't use the product extensively for anything else.
I recommend the product to those who plan to use it.
I would like to mention that if you use Prometheus and integrate it with other tools, it will save a lot of costs, as it is a free and open-source product with a strong community. Prometheus will be getting an updated version very soon.
I rate the tool an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Apr 11, 2024
Information Technology Manager at Peponi Schools
Easy to maintain, but it needs to improve the ability to generate customized reports
Pros and Cons
- "The product is easy to maintain."
- "The query language in Prometheus is an area of concern where improvements are required."
What is our primary use case?
In Prometheus, the graphs and everything are too complicated. With Prometheus, you need to sit down and study it properly. Though the product offers a pretty interface, if you don't know what you are doing, then the tool doesn't provide you with customized reports. The product sends its users email reports, which they need to look into.
What needs improvement?
The query language in Prometheus is an area of concern where improvements are required.
The ability to generate customized reports should be made available in the product.
For how long have I used the solution?
I have been using Prometheus for three years.
What do I think about the stability of the solution?
Once the product is properly configured and installed, it works perfectly.
What do I think about the scalability of the solution?
It is a scalable solution.
Around 600 people in my company use the product.
Which solution did I use previously and why did I switch?
I have experience with Sophos. There are some problems with the GUI of Prometheus. The GUI of Sophos is simple and easy to use.
I would recommend others to choose Zabbix over Prometheus.
I rate Zabbix a nine out of ten.
How was the initial setup?
The product's initial setup phase is not easy, as one needs to be familiar with Unix.
Considering that there is a need to update and upgrade the Linux part, the solution can be deployed in an hour. It may take a day for someone who does not know the product's installation phase.
What's my experience with pricing, setup cost, and licensing?
Prometheus is available as an open-source product.
What other advice do I have?
Prometheus is useful for monitoring our company's infrastructure and applications, as it serves as a network monitoring tool that helps you understand bandwidth capabilities and shows who is accessing what at what time of day. The tool provides details on where the company's network bandwidth is used.
Speaking about how Prometheus has helped in our company's system observability and alerting processes, I think it's just an added tool in our environment because we don't use it as a primary solution. Primarily, my company is dependent on Sophos. The other solutions probably don't give you the results you want, but it all depends on how well you have done your policies. The aforementioned reason may prompt a user to use Prometheus.
For metrics collection and monitoring, the key feature of Prometheus for us is the ability to track bandwidth.
It is a task to deal with Prometheus' query language because it is not as clear as you would expect it to be, which poses a challenge.
The integration of Prometheus with Zabbix can be a nightmare, and it can be quite challenging. There are also chances that you may get errors when trying to integrate Prometheus with Zabbix. You have to wait for the next release of Prometheus and Zabbix to be able to install them together.
The product is easy to maintain.
The product works fine for incident detection and resolution, but something has to be done about the GUI, which is not appealing; nobody would want to sit in front of it and look at graphs.
Those planning to use Prometheus should be familiar with programming syntax before using it.
I rate the overall tool a seven out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Feb 28, 2024
Senior Software Engineer at SumUp
Used for observability and analyzing data for business metrics and system metrics
Pros and Cons
- "The resilience of the solution's metric collection is very nice."
- "There is a tool called Prometheus Exporter that doesn't work well."
What is our primary use case?
We use Prometheus for observability and analyzing data for business metrics and system metrics. It helps us with messaging services observability. It also helps a lot with the architecture and scalability of the services. The solution provides system observability regarding Kubernetes information, CPU usage, memory, and whether the system is scaling or not.
What is most valuable?
The resilience of the solution's metric collection is very nice. Although it's very strict about the number of unique metrics for each service, the metrics collection is very powerful.
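Assuming the strictness mentioned above refers to limits on the number of unique time series (cardinality), the arithmetic below shows how quickly that number grows. Each distinct combination of label values is a separate series, so the upper bound is the product of the number of values per label. The label names and counts are hypothetical.

```python
from math import prod


def series_count(label_values):
    """Upper bound on time series for one metric name.

    `label_values` maps each label name to the list of values it can take.
    """
    return prod(len(values) for values in label_values.values())


labels = {
    "method": ["GET", "POST", "PUT", "DELETE"],
    "status": ["200", "201", "400", "404", "500"],
    "endpoint": [f"/api/v1/resource{i}" for i in range(50)],
}
bounded = series_count(labels)  # 4 * 5 * 50 = 1,000 series

# Adding a per-user label multiplies the total by the number of users.
labels["user_id"] = [str(i) for i in range(10_000)]
unbounded = series_count(labels)  # 10,000,000 series
```

A single high-cardinality label such as a user ID turns a manageable metric into millions of series, which is the usual reason for enforcing a per-service limit.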
What needs improvement?
There is a tool called Prometheus Exporter that doesn't work well. I don't know if Prometheus maintains it or if it's an open-source service. Prometheus needs a live endpoint to collect metrics, which is a problem for the cron jobs in our services.
Since cron jobs have a time limit, I rely on Prometheus Exporter to collect metrics. When I send metrics to Prometheus Exporter, it doesn't work very well.
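The review does not identify the tool precisely. For short-lived jobs, the Prometheus project provides the Pushgateway, and the sketch below assumes that component rather than whatever the reviewer used. It builds, but does not send, an HTTP request that pushes metrics in the text exposition format under a job name. The gateway URL, job name, and metric are hypothetical, and job names containing slashes need special encoding that this sketch does not handle.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway_url, job, metrics_text):
    """Build a PUT request that replaces the metrics stored for `job`.

    `metrics_text` must be in the text exposition format; a trailing
    newline is added if it is missing.
    """
    if not metrics_text.endswith("\n"):
        metrics_text += "\n"
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=metrics_text.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Hypothetical cron job reporting when it last succeeded.
request = build_push_request(
    "http://pushgateway.example.internal:9091/",
    "nightly_report",
    "demo_job_last_success_timestamp_seconds 1700000000",
)
# urllib.request.urlopen(request) would send it at the end of the job.
```

The gateway holds the last pushed values so that Prometheus can scrape them after the cron job has exited, which addresses the missing live endpoint described above.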
For how long have I used the solution?
I have been using Prometheus for four to five years.
What do I think about the stability of the solution?
I’ve never had any stability issues with the solution.
I rate the solution’s stability ten out of ten.
What do I think about the scalability of the solution?
Prometheus is a scalable solution.
What other advice do I have?
I would recommend the solution to other users.
Overall, I rate the solution an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Sep 1, 2024
Sr. DevOps Engineer at Seaflux
Has an easy setup process and efficient alerting functionality
Pros and Cons
- "The most valuable feature of Prometheus is its ability to collect metrics."
- "The primary area where Prometheus could be improved is in terms of pricing, particularly when used with managed services."
What is most valuable?
The most valuable features of Prometheus are its ability to collect metrics and its integration with Grafana.
What needs improvement?
The primary area where Prometheus could be improved is in terms of pricing, particularly when used with managed services like AWS. Many clients have found the pricing for managed services steep, prompting them to switch to the open-source version. This switch has led to the adoption of Prometheus with Amazon EKS. Still, it has also resulted in high data transfer costs due to the multitude of microservices and the associated cluster logs. As a result, clients are facing higher bills, primarily driven by increased data transfer rates and bandwidth usage.
For how long have I used the solution?
We have been using Prometheus for two and a half years.
What do I think about the stability of the solution?
It is a stable product.
What do I think about the scalability of the solution?
It is a scalable platform.
How was the initial setup?
The initial setup is easy if you have good knowledge of Helm. It requires running two to three commands. Without Helm, you have to create custom deployment files. Overall, it is a simple process.
What other advice do I have?
The alerting functionality provided by Prometheus has significantly improved our incident response time. It enables us to gather data from various instances or ports within our environment, such as ETS.
I advise others to gain essential knowledge of query optimization and Grafana. It is an expensive product if utilized with managed services. It would be beneficial to use the open-source version.
I rate it a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Mar 31, 2024