There is room for improvement in stability. Challenges arise because these open-source tools are mostly intended for Kubernetes and Docker. However, my client uses ECS, a proprietary container-management service, and some of the guys on my team don't have expertise in Kubernetes, so they want the solution deployed in ECS. Deploying in that kind of environment is difficult because it takes a lot of integration work. The tooling is not intended for proprietary services, so you struggle with configuration a lot. That's why I'm trying to convince my team to use Kubernetes for control plane tools, monitoring, deployment, and things like that. These tools are easier to manage there.
A few features in the solution's enterprise version are not available in the basic version. Visualization-wise, Grafana Loki's dashboard looks a little outdated compared to other open-source visualization tools like Chronograf. Chronograf's dashboard is much more attractive, colorful, and easy to read.
The product's alerting types need a few changes to address minor shortcomings. It would be better if you could just use Grafana OnCall with images in the solution.
Creating metrics in Grafana Loki is not easy and could be simplified. I am on an old version of the tool, so this may be better in newer versions.
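For readers unfamiliar with what "creating metrics" from logs involves, here is a minimal Python sketch of the underlying idea: count the log lines that match a filter in each time window. It is loosely analogous to a LogQL `count_over_time` metric query, but simplified to fixed, non-overlapping windows. It is not Loki's implementation, and the function name, sample log lines, and 60-second window are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_matches_per_window(
    lines: Iterable[Tuple[int, str]], needle: str, window_seconds: int
) -> Dict[int, int]:
    """Count log lines containing `needle`, grouped by fixed time window.

    Each input item is (unix_timestamp_seconds, log_text). The result maps
    the start of each window to the number of matching lines in it.
    """
    counts: Counter = Counter()
    for timestamp, text in lines:
        if needle in text:
            # Align the timestamp to the start of its window.
            window_start = timestamp - (timestamp % window_seconds)
            counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical log lines, invented for this example.
logs = [
    (0, "GET / 200"),
    (12, "error: timeout"),
    (61, "error: connection refused"),
    (75, "error: timeout"),
    (130, "GET /health 200"),
]

print(count_matches_per_window(logs, "error", 60))  # {0: 1, 60: 2}
```

Windows with no matching lines are simply absent from the result, so a dashboard built on such a series would need to decide how to draw the gaps.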
DevOps Engineer at a comms service provider with 501-1,000 employees
Real User
Top 5
2023-11-16T10:31:37Z
Nov 16, 2023
The product must improve its UI. I have heard that Kibana is more user-friendly. It might be because of the speed at which the logs are generated, but it depends entirely on how the system is configured.
Information Technology Buyer at BNP Paribas Partners for Innovation
Real User
Top 5
2023-11-03T17:27:27Z
Nov 3, 2023
We encountered certain limitations when it came to alerting, particularly when dealing with specific data sources. An additional beneficial feature we'd like to see is the integration of AI or machine learning for predictive analytics, which could help anticipate production imbalances or potential issues.
My main concern is the recommended production-grade setup. They suggest using tools like Tanka or Jsonnet. They should simplify the process to increase adoption. The architecture is solid, with distinct read and write parts and a caching layer, making it fast. However, setting up a production-grade cluster takes a lot of effort to understand the components and how they fit together. That's where I see room for improvement.
MLOps Engineer at a tech services company with 501-1,000 employees
MSP
Top 10
2023-10-18T09:47:00Z
Oct 18, 2023
Our team encountered challenges with dashboard preparation, particularly when crafting Loki queries in LogQL. These queries were not intuitive. Additionally, we operated on older versions of Grafana, Loki, and the backend host. Because Loki was part of a single deployment chart with Prometheus, we faced several UI bugs that significantly hampered our experience. A notable issue was the absence of immediate feedback when writing queries. The query language's sensitivity, especially concerning indentation and specific characters, made queries cumbersome to draft. Omissions or minor errors led to vague error messages that offered little insight into how to fix the problem. Even with a foundational query in hand, adapting it to our specific needs was tedious. While our dashboard and its queries functioned adequately, statistics reset when a Kubernetes pod restarted. This resulted in data inconsistencies, such as a continually increasing linear graph suddenly plummeting. Rectifying this issue was challenging due to unclear documentation about modifying Loki queries. There's room for improvement in Grafana's query configuration tool. Although it provides a default setup, it lacks versatility, especially in our older version. A more robust debugging tool for queries would be beneficial. Eventually, we transitioned to Datadog, a more comprehensive yet proprietary solution. Datadog offers integrations and pre-configured dashboards for services like AWS's RDS, which we found lacking in Grafana and Loki. The primary challenge with Grafana and Loki was the significant setup time required to achieve the desired functionality.
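The plummeting graph described above is the typical shape of a cumulative counter that starts again from zero when its process restarts. A common remedy is to treat any drop in the series as a reset and carry the pre-reset total forward, which is the same idea Prometheus-style `rate` and `increase` functions apply to counters. The Python sketch below illustrates that idea only. It assumes the panel plotted a raw per-pod counter, which the review does not confirm, and the function name and sample values are invented.

```python
from typing import List, Optional, Sequence


def reset_aware_total(samples: Sequence[float]) -> List[float]:
    """Rebuild a non-decreasing series from counter samples that may reset.

    Whenever a sample is lower than the one before it, the counter is assumed
    to have restarted, and the last value seen before the restart is added to
    a running offset.
    """
    adjusted: List[float] = []
    offset = 0.0
    previous: Optional[float] = None
    for value in samples:
        if previous is not None and value < previous:
            # A drop means a restart: keep what was counted before it.
            offset += previous
        adjusted.append(value + offset)
        previous = value
    return adjusted


# Hypothetical samples: the pod restarts between the third and fourth reading.
raw = [10, 25, 40, 3, 12, 30]
print(reset_aware_total(raw))  # [10.0, 25.0, 40.0, 43.0, 52.0, 70.0]
```

Increments that happen between the last sample before a restart and the restart itself are lost either way, so the rebuilt series is a lower bound rather than an exact total.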
Grafana Loki could do a better job of showing the whole flow of a request. Correlating requests is not simple in Grafana Loki and can be improved. Currently, creating alerts or alarms on the Grafana Loki dashboard is a little complex and should be improved. Grafana Loki's reports need to be more customizable so that I can include parts of the dashboards, some metrics, and so on in my reports.
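To make the request-correlation point concrete, here is a small Python sketch of one way to reconstruct a request's flow: group log lines from different services by a shared identifier. It assumes every service writes a `request_id=<value>` field into its log lines, which is a convention invented for this example and not something Loki provides or requires. The service names and log lines are also hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed log convention: each line carries "request_id=<value>".
REQUEST_ID = re.compile(r"request_id=(\S+)")


def group_by_request(
    lines: Iterable[Tuple[str, str]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (service, log_text) pairs by the request ID found in the text.

    Lines without a request ID are skipped. Within each group, lines keep
    the order in which they were read.
    """
    flows: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for service, text in lines:
        match = REQUEST_ID.search(text)
        if match:
            flows[match.group(1)].append((service, text))
    return dict(flows)


# Hypothetical log lines from three services.
logs = [
    ("gateway", "request_id=abc123 received GET /orders"),
    ("orders", "request_id=abc123 loading order list"),
    ("gateway", "request_id=def456 received GET /health"),
    ("db", "request_id=abc123 query took 42ms"),
    ("orders", "cache warmed"),
]

for service, text in group_by_request(logs)["abc123"]:
    print(service, "|", text)
```

The sketch only works if the identifier is propagated consistently across services, which is usually the harder part of the problem.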
Grafana Loki is a powerful log aggregation and analysis tool designed for cloud-native environments. Its primary use case is to collect, store, and search logs efficiently, enabling organizations to gain valuable insights from their log data.
The most valuable functionality of Loki is its ability to scale horizontally, making it suitable for high-volume log data. It achieves this by indexing only a small set of labels for each log stream rather than the full log text, while an agent such as Promtail collects the logs and ships them to Loki, which allows for fast...
We face some bugs when we install the latest version of Grafana Loki.
The Docker container partition feature needs improvement, as it does not reuse the space and goes into a pending state.
The solution's security-monitoring features have shortcomings and need improvement.