MLOps Engineer at a tech services company with 501-1,000 employees
Oct 18, 2023
Our system primarily utilizes dashboards to cater to both developers and clients. For our clients, we present vital data, highlighting which endpoints are most frequented. These are the endpoints that users engage with to retrieve predictions from their machine learning models. We aimed to showcase the popularity of specific endpoints and the rate at which they return varying HTTP codes, such as 200 (success), 500 (internal server error), and 400 (bad request). Clients were keen on pinpointing any anomalies that suggested an endpoint was lagging in its assessment. Such insights were instrumental for both our clients and our team in discerning which system segments required optimization or enhanced alert mechanisms.

Incorporating other Grafana features, we pulled metrics directly from system logs and crafted Grafana dashboards based on them, and we integrated Slack channels for notifications to keep us updated.

From the dashboard, I could glean comprehensive data such as the cumulative number of requests, their duration, requests segmented by response codes, the rate of these codes, an overview of multiple endpoints, and the frequency with which different models were invoked. This captured the essence of the information we aspired to collect.
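As a rough sketch of what could back such panels, the following Python snippet runs a LogQL metric query against Loki's query_range HTTP API, computing the per-status request rate the reviewer describes. The Loki address, the app="model-api" selector, and the "status" field parsed from JSON logs are illustrative assumptions, not details from the review.

```python
import time
import requests

LOKI_URL = "http://localhost:3100"  # assumed local Loki instance

# Per-status request rate over 5-minute windows; app="model-api" and the
# json-parsed "status" field are hypothetical placeholders.
QUERY = 'sum by (status) (rate({app="model-api"} | json | __error__="" [5m]))'

end_ns = time.time_ns()
start_ns = end_ns - 3600 * 10**9  # look back one hour

resp = requests.get(
    f"{LOKI_URL}/loki/api/v1/query_range",
    params={"query": QUERY, "start": start_ns, "end": end_ns, "step": "60s"},
    timeout=10,
)
resp.raise_for_status()

# Metric queries return a "matrix": one series per status label.
for series in resp.json()["data"]["result"]:
    status = series["metric"].get("status", "unknown")
    ts, rate = series["values"][-1]  # latest [timestamp, value] sample
    print(f"HTTP {status}: {float(rate):.2f} req/s")
```

In Grafana, the same query string can be used directly in a panel backed by a Loki data source, with the status label driving the legend for each series.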
Grafana Loki is a powerful log aggregation and analysis tool designed for cloud-native environments. Its primary use case is to collect, store, and search logs efficiently, enabling organizations to gain valuable insights from their log data.
The most valuable functionality of Loki is its ability to scale horizontally, making it suitable for high-volume log data. It achieves this through a lightweight indexing approach: rather than indexing full log content, Loki indexes only a small set of labels per log stream (the logs themselves are shipped by the Promtail agent), which keeps the index compact and allows for fast...
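To illustrate that label-based model, here is a minimal sketch that pushes a log line to Loki's standard push endpoint from Python. The job and env labels and the log text are made-up placeholders; in practice Promtail performs this shipping automatically, and the point here is just how little of each entry ends up in the index.

```python
import json
import time
import requests

LOKI_URL = "http://localhost:3100"  # assumed local Loki instance

payload = {
    "streams": [
        {
            # Indexed: only this small label set identifies the stream.
            "stream": {"job": "payments-api", "env": "prod"},
            # Not indexed: raw lines with nanosecond timestamps, stored in
            # compressed chunks and scanned only at query time.
            "values": [
                [str(time.time_ns()), "POST /charge 500 internal error"],
            ],
        }
    ]
}

resp = requests.post(
    f"{LOKI_URL}/loki/api/v1/push",
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
    timeout=10,
)
resp.raise_for_status()  # Loki answers 204 No Content on success
```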
We use Grafana Loki across various verticals, including the manufacturing, finance, health, and aerospace sectors. It primarily helps in monitoring security and access to devices. Grafana dashboards are used to track access successes and failures and to audit commands issued on devices.
I use Grafana for visualizing my metrics, which come from Prometheus and node exporter. Grafana also visualizes my Loki logs, which are gathered by Promtail. Additionally, I manage everything in our Kubernetes cluster.
We use the solution for Docker logs monitoring and system-level information.
As an end user of the product, I would say that Grafana Loki is a good tool for my use cases. Though I use Loki, I mostly prefer Grafana for alerts on metrics. Recently, I deployed Grafana's internal metrics. The product's GUI is easy to use.
My company mainly uses the product to ship logs to Grafana Loki, and we also use the tool to extract some metrics from those logs.
We use the solution to collect logs from our environment. We review them if there's any trouble in the system.
In my company, we use it for infrastructure monitoring and API monitoring, and for creating dashboards to visualize the data from this system.
The solution is used to understand problems that have been taking place in the application.
The whole idea is to have logging enabled. We should be able to search for each and every pattern. For example, log patterns are very unstructured these days. You can't predict where they're coming from or what their fields are. Fields can be dynamic and change depending on the version; when people moved from the Docker daemon to containerd, the log format changed. Most of the logging was impacted, especially when we built solutions using pipelines like ELK. The advantage of Loki's logging is its ability to process unstructured logs. It stores logs in chunks and uses something similar to grep for pattern matching. It can find any pattern for us.
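A minimal sketch of that grep-style workflow, assuming a local Loki instance and a hypothetical namespace="apps" label: select streams by label, then filter the raw lines with a substring match (|=) and a regex match (|~), with no predefined field schema required.

```python
import time
import requests

LOKI_URL = "http://localhost:3100"  # assumed local Loki instance

# Grep-style search: pick streams by label, then scan the raw lines with a
# substring filter (|=) and a regex filter (|~). No field schema is needed,
# so the same query works whether the runtime was Docker or containerd.
QUERY = '{namespace="apps"} |= "error" |~ "timeout|connection refused"'

end_ns = time.time_ns()
start_ns = end_ns - 3600 * 10**9  # look back one hour

resp = requests.get(
    f"{LOKI_URL}/loki/api/v1/query_range",
    params={"query": QUERY, "start": start_ns, "end": end_ns, "limit": 100},
    timeout=10,
)
resp.raise_for_status()

# Log queries return "streams": label sets plus the matching raw lines.
for stream in resp.json()["data"]["result"]:
    for ts, line in stream["values"]:
        print(line)
```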
The solution helps us identify system health issues and is useful for performance monitoring of critical infrastructure and business applications.