I was planning to use the tool for real-time data processing and analytics workflows. Real-time IoT data comes with a few challenges, and each reading arrives only once, so it maps naturally to a Kafka topic. I want to use multiple Kafka topics: one fed directly into the data pipeline, another into the real-time alert system, and a third into machine learning.
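The fan-out described above can be sketched in plain Python. This is only an illustration under stated assumptions: the topic names are hypothetical (the review does not name them), and an in-memory dictionary stands in for a Kafka cluster, so no Kafka client is involved.

```python
from collections import defaultdict

# Hypothetical topic names; the review does not name its topics.
PIPELINE_TOPIC = "iot.pipeline"
ALERTS_TOPIC = "iot.alerts"
ML_TOPIC = "iot.ml-features"


def fan_out(reading, topics):
    """Append one IoT reading to each of the three topic logs."""
    for name in (PIPELINE_TOPIC, ALERTS_TOPIC, ML_TOPIC):
        topics[name].append(reading)


# In-memory stand-in for Kafka topics: topic name -> list of records.
topics = defaultdict(list)
fan_out({"sensor": "s1", "value": 21.5}, topics)
fan_out({"sensor": "s2", "value": 19.0}, topics)
```

With a real deployment, each `append` would presumably become a produce call to the corresponding topic, and each downstream system would consume only its own topic.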
The most valuable features of the solution are its very low latency and its sequencing. Sequencing helps with aggregation because I don't need to write a separate function to order the data; I only write an aggregate function, for example to find the maximum value in the last ten samples.
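As a rough illustration of that kind of aggregate, here is a minimal sketch of a "maximum of the last ten samples" function. It assumes samples arrive already in order, which is the property the review credits to Kafka. The class name `LastNMax` is made up for this example and is not part of any Kafka API.

```python
from collections import deque


class LastNMax:
    """Track the maximum over the most recent n samples."""

    def __init__(self, n=10):
        # A deque with maxlen drops the oldest sample automatically.
        self.window = deque(maxlen=n)

    def add(self, value):
        """Add one sample and return the maximum of the current window."""
        self.window.append(value)
        return max(self.window)


agg = LastNMax(n=10)
samples = [3, 7, 5, 1, 9] + [2] * 10
results = [agg.add(v) for v in samples]
```

Because the samples are consumed in order, the aggregate needs no sorting step of its own. The value 9 stays the maximum until it falls out of the ten-sample window.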
One complexity I faced with the tool stems from the fact that it is not really a stand-alone application and does not integrate natively with cloud platforms like AWS or Azure. Apache Kafka has another layer on top of it. Compare this with a direct service like Grafana, which can be used as a stand-alone tool with Grafana Cloud or in a mix with AWS, with not much difference between the two. I expect Apache Kafka to offer the same flexibility as Grafana.
The product's support and the cloud integration capabilities are areas of concern where improvements are required.
I have been using Apache Kafka for a year.
Stability-wise, I rate the solution an eight out of ten.
Scalability-wise, I rate the solution an eight out of ten.
Around four people in my company use the product.
I did not interact much with the product's technical support team. Since I was using the product's free version, I did not have dedicated support that responded to all my queries. I rate the support a seven out of ten.
I have worked with Databricks. I use Databricks and Apache Kafka simultaneously.
The product's deployment phase is neither complex nor straightforward. As the software has evolved a lot, users can keep it even simpler by opting for a plug-and-play model.
The solution is deployed on-premises.
The solution can be deployed in two or three days.
I was involved with the tool's installation process.
I cannot comment on the tool's ROI since I did not use it for production purposes.
I was using the product's free version.
I did not come across any scenarios involving fault tolerance because data consistency issues, like missing or incorrect values, are part of the system from which the data is fed. As for missing values, I never tried the option that allows one to impute a missing value with another parameter.
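For readers unfamiliar with imputation, a minimal sketch follows. It uses a simple forward fill (reuse the last seen value), which is a stand-in technique chosen for illustration and not necessarily the option the review refers to. The function name and the `default` parameter are hypothetical.

```python
def impute_missing(samples, default=0.0):
    """Replace None with the last seen value, or `default` if none seen yet."""
    filled = []
    last = default
    for value in samples:
        if value is None:
            value = last  # forward fill from the previous sample
        filled.append(value)
        last = value
    return filled


cleaned = impute_missing([None, 20.0, None, None, 22.5])
```

Here the leading gap falls back to `default`, and the two gaps after 20.0 repeat that reading until a new value arrives.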
Speaking about whether I incorporated any emerging data streaming trends, such as the use of AI, into my Apache Kafka workflows, I would say that I use it as a local system. I have an EC2 server where I read the samples and then build the regression and reintegration model on top of them, but that is done locally and not on the cloud.
I recommend the product to those who plan to use it. I like Kafka and Flink, and I want to create a system in AWS mainly for real-time streaming so that I don't need to worry about multiple data copies.
Considering the improvements needed in the product's support and cloud integration capabilities, and the simplicity of the installation phase, I rate the tool a seven out of ten.