Mukulit Bhati - PeerSpot reviewer
CTO at InsightGeeks Solutions Pvt.
Real User
Top 10
Impeccable and impressive throughput with brilliant availability
Pros and Cons
  • "Its availability is brilliant."
  • "The support on Apache Kafka could be improved."

What is our primary use case?

We use Apache Kafka to handle the real-time data that we receive over a data transport layer and put into Kafka. Several of our applications subscribe to Kafka topics and serve the data directly to browsers. We also use these applications inside our own solution, and we have a Kafka Streams application connected to MongoDB.

The solution is helpful because we receive real-time data from IoT devices and running vehicles, including their locations, states, and VNs.

What needs improvement?

The product could be improved with proper documentation. Proper documentation should be the SSE. Configuration is a challenge for us: it isn't easy to configure a standalone Apache Kafka on-premises. It needs to be set up on-premises and surveys being provided in the market want to be excluded. Hence, configuring Apache Kafka as a developer is very hard. It is user-friendly, but initially we found it challenging. The documentation would be much better if documents for different tasks were provided on GitHub. As the market grows, Spring is working hard to get products into the market; when Python, React JS, and Node.js came, they were lacking, but today Spring Boot is a solid framework. So the support on Apache Kafka could be improved, because finding some configurations with Spring Boot isn't easy.

For how long have I used the solution?

We have been using this solution for over three years and are currently using the latest version.

What do I think about the stability of the solution?

The solution is stable, and the most fantastic thing about it is its throughput. For example, I have tried MQs, which also have Apache Kafka Streams. So the throughput of Apache Kafka Stream is impeccable and impressive.

Buyer's Guide
Apache Kafka
November 2024
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
817,354 professionals have used our research since 2012.

What do I think about the scalability of the solution?

The solution is very scalable, and its availability is brilliant. We have approximately 32,000 people on our customer base.

How are customer service and support?

We do not have any experience with customer service and support.

Which solution did I use previously and why did I switch?

We have tried different MQs, but the subscription and charting available on this solution are better. We have used Queues previously, but this solution is more stable, so we chose it.

How was the initial setup?

The initial setup is dependent on the individual. For example, it would be straightforward if a person practices these things a lot and understands the documentation correctly. However, since most people prefer examples instead of reviewing documentation, it would be easy to set up if they find steps on the internet but difficult if they do not have examples.

What's my experience with pricing, setup cost, and licensing?

I rate the pricing for this solution an eight out of ten. It could be a bit cheaper.

What other advice do I have?

I rate this solution an eight out of ten. It is good, but the documentation could be improved.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Stuart-Cook - PeerSpot reviewer
CEO & Founder at a tech consulting company with 11-50 employees
Consultant
The message bus capabilities and throughput are good, but it needs better documentation
Pros and Cons
  • "It seemed pretty stable and didn't have any issues at all."
  • "We struggled a bit with the built-in data transformations because it was a challenge to get them up and running the way we wanted."

What is our primary use case?

We used Kafka as a central message bus, transporting data from SNMP through to a database. Some of the processing in between was handled by other components.

How has it helped my organization?

We built a solution for a client and the client was happy with the solution.

What is most valuable?

The message bus capabilities, basically sending messages to it, and the way it handles events or messages is pretty good. The throughput was good. Generally, it was a good component.

What needs improvement?

We struggled a bit with the built-in data transformations because it was a challenge to get them up and running the way we wanted. There was a bit of a learning curve. It may be that we didn't fully grasp the information.

Also, the documentation covering certain aspects was a bit poor. We had to trawl around different locations to try to find what we needed. When we were able to find documentation on transformation, for example, there wasn't a good set of documentation examples we could use, and the examples we had weren't quite meeting the need. Better examples would've helped us.

For how long have I used the solution?

I used this solution for about a year and a half. 

What do I think about the stability of the solution?

It seemed pretty stable and didn't have any issues at all.

What do I think about the scalability of the solution?

I don't know how many people were using it on the client's side, but we had a four-person team doing the development work. 

What about the implementation team?

Our team handled the deployment in-house.

What's my experience with pricing, setup cost, and licensing?

Kafka is an open-source solution, so there are no licensing costs. There are third-party companies who support and provide add-ons to Kafka, but we didn't need to use any of those. Confluent, for example, provides plug-ins for Kafka.

Which other solutions did I evaluate?

We looked at a number of other components that were based around being a message bus, like Apache MQ, and Kafka was the winner from that review work.

What other advice do I have?

The documentation can be a challenge. There are quite advanced capabilities of Kafka, like the transformations that you can build to modify the data as needed. We found that the biggest challenge was documentation and being able to gain the knowledge of exactly how to do stuff. We also struggled on the transformation, but other components were fine, so some parts are good, and some parts are bad.

I would rate this solution as an eight out of ten. 

Which deployment model are you using for this solution?

Private Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Rémy NOLLET - PeerSpot reviewer
Data Exchange Architect MQSeries at Decathlon International
Real User
Multi-use, stable solution that requires some external support
Pros and Cons
  • "It is a useful way to maintain messages and to manage offset from our consumers."
  • "I would like to see an improvement in authentication management."

What is our primary use case?

We utilize Apache Kafka in several areas, including financials, logistics, and client management to name a few.

How has it helped my organization?

We used to lose some of our messages when we integrated them in bulk; this solution has stopped that from happening.

What is most valuable?

It is a useful way to maintain messages and to manage offset from our consumers. 

What needs improvement?

I would like to see an improvement in authentication management.

For how long have I used the solution?

We have been using the solution for around four years.

What do I think about the stability of the solution?

The stability is good; the solution operates on our clusters without a big impact.

What do I think about the scalability of the solution?

It is easy to scale.

Which solution did I use previously and why did I switch?

We used to use a different solution, but our increased throughput meant we needed a product that would allow for a larger queue.

How was the initial setup?

The initial setup was complex for us because we built it internally. This meant that full deployment took around a month.

What about the implementation team?

The implementation was carried out in-house.

What other advice do I have?

I would recommend that other businesses do the deployment themselves, but manage the tool with the aid of a service provider, rather than in-house.

I would rate this product seven out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer1052868 - PeerSpot reviewer
Principal Technology Architect at a computer software company with 5,001-10,000 employees
Real User
Events and streaming are persistent, and multiple subscribers can consume the data
Pros and Cons
  • "With Kafka, events and streaming are persistent, and multiple subscribers can consume the data. This is an advantage of Kafka compared to simple queue-based solutions."
  • "Kafka's interface could also use some work. Some of our products are in C, and we don't have any libraries to use with C. From an interface perspective, we had a library from Redis, and we are streaming some of the products we built to Redis. That is one of the requirements. It would be good to have those libraries available in a future release for our C++ clients or public libraries, so we can include them in our product and build on that."

What is our primary use case?

It's a combination of an on-premise and cloud deployment. We use AWS, and we have our offshore deployment that's on-premise for OpenShift, Red Hat, and Kafka. Red Hat provides managed services and everything. We use Kafka and a specific deployment where we deploy on our basic VMs and consume Kafka as well.

We publish or stream all our business events as well as some of the technical events. You stream it out to Kafka, and multiple consumers develop a different set of solutions. It could be reporting, analytics, or even some data persistence. Later, we used it to build a data lake solution. They all would be consuming the data or events we are streaming into Kafka.

What is most valuable?

With Kafka, events and streaming are persistent, and multiple subscribers can consume the data. This is an advantage of Kafka compared to simple queue-based solutions.
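
To illustrate the contrast with a simple queue, here is a minimal in-memory sketch. It does not use Kafka or any Kafka client library; the `TopicLog` class, its method names, and the group names are hypothetical and exist only for this example. The idea it shows is that records stay in the log after being read, and each subscriber group tracks its own read position, so every group sees every record.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log with one read offset per subscriber group."""

    def __init__(self):
        self._records = []                # records are kept after being read
        self._offsets = defaultdict(int)  # group name -> next index to read

    def publish(self, record):
        self._records.append(record)

    def poll(self, group, max_records=10):
        """Return unread records for `group` and advance only that group's offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for event in ["order-created", "order-paid", "order-shipped"]:
    log.publish(event)

# Two independent groups each receive the full stream.
reporting = log.poll("reporting")
analytics = log.poll("analytics")
print(reporting)
print(analytics)
```

In a simple queue, the first reader would have removed the records and the second reader would have received nothing.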

What needs improvement?

We are still on the production aspect, with our service provider or hyper-scalers providing the solutions. I would like to see some improvement on the HA and DR solutions, where everything is happening in real-time. 

Kafka's interface could also use some work. Some of our products are in C, and we don't have any libraries to use with C. From an interface perspective, we had a library from Redis, and we are streaming some of the products we built to Redis. That is one of the requirements. It would be good to have those libraries available in a future release for our C++ clients or public libraries, so we can include them in our product and build on that.

For how long have I used the solution?

We've been using Apache Kafka for the past two to three years.

What do I think about the stability of the solution?

Kafka is stable. It's a great product. 

What do I think about the scalability of the solution?

We did some benchmarking, but we are still looking to scale up some of the benchmarking and performance testing. So far, it meets all our business requirements. We are just developers, so everything goes to the clients, who will deploy it at their scale and use it for their end customers. So we are looking at it from a developer's perspective. Those who are developing the products are working on this.

How are customer service and support?

We haven't really contacted technical support, but some of our clients have subscribed to support from the vendors. We generally look for open-source solutions. From there, we try to figure out if there are any issues. There's a good online community where you can ask questions.

How was the initial setup?

We were able to deploy and use it with no problems for our use case. We didn't find it so complex. We work with so many applications, databases, Postgres, and so many other things, so we could manage it easily. We deployed Kafka in a few hours. We have an infrastructure team and DevOps. Those teams are pretty capable, and they've completely automated the whole deployment. It always takes time the first time you upgrade any application, not just Kafka. We might discover some issues, such as configuration, parameters, compatibility, etc. Once that becomes standard, it is stable, and then they only need to replicate it to the different environments or different developers groups. We have a sophisticated process.

What other advice do I have?

I rate Apache Kafka eight out of 10. There are so many products on the market, so my advice is to consider if Kafka suits your business requirements first. If it's suitable, the next step is to check whether all the technical requirements are met. If everything checks out, I would say that Kafka is a relatively stable, sound, and scalable product, so they can try it out. 

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2150616 - PeerSpot reviewer
Lead Data Scientist at a transportation company with 51-200 employees
Real User
Top 5
Offers a free version but needs to improve the support offered to users
Pros and Cons
  • "The most valuable features of the solution revolve around areas like the latency part, where the tool offers very little latency and the sequencing part."
  • "One complexity that I faced with the tool stems from the fact that since it is not kind of a stand-alone application, it won't integrate with native cloud, like AWS or Azure."

What is our primary use case?

I was planning to use the tool for real-time analysis in terms of data processing and real-time analytics workflows. The real-time IoT data comes through with a few challenges, and that is for one time, so it is more like a Kafka topic. I want to actually use multiple Kafka topics where one of them can be directly fed into the data pipeline, another one can be fed into the real-time alert system, and the next one can be fed into machine learning.

How has it helped my organization?

The most valuable features of the solution are its very low latency and its sequencing. Because the messages arrive in sequence, I don't need to write another function to order them; I can write an aggregate function directly, for example to figure out the maximum value in the last ten samples.
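
The aggregate mentioned above (the maximum of the last ten samples) can be sketched in a few lines. This is a plain-Python illustration, not the reviewer's code and not a Kafka API; the function name `rolling_max` and the sample readings are made up. It assumes the samples are already delivered in order, which Kafka guarantees within a single partition.

```python
from collections import deque


def rolling_max(samples, window=10):
    """Yield the maximum of the most recent `window` samples after each new one."""
    recent = deque(maxlen=window)  # the oldest sample drops out automatically
    for value in samples:
        recent.append(value)
        yield max(recent)


readings = [3, 7, 2, 9, 4, 1, 8, 5, 6, 0, 2, 3]
maxima = list(rolling_max(readings))
print(maxima)
```

Since the input is already ordered, no sorting or buffering step is needed before the aggregate.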

What needs improvement?

One complexity that I faced with the tool stems from the fact that since it is not kind of a stand-alone application, it won't integrate with native cloud, like AWS or Azure. Apache Kafka has another mask on it, so if users can have a direct service, like Grafana, that can actually be used as a stand-alone tool with Grafana cloud, or you can use a mix of AWS and Grafana, so there is not much difference with it. I expect Apache Kafka to have Grafana's same nature.

The product's support and the cloud integration capabilities are areas of concern where improvements are required.

For how long have I used the solution?

I have been using Apache Kafka for a year.

What do I think about the stability of the solution?

Stability-wise, I rate the solution an eight out of ten.

What do I think about the scalability of the solution?

Scalability-wise, I rate the solution an eight out of ten.

Around four people in my company use the product.

How are customer service and support?

I did not interact much with the product technical support team. I did not have dedicated support that responded to all my queries since I was using the product's free version. I rate the support a seven out of ten.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I have worked with Databricks. I use Databricks and Apache Kafka simultaneously.

How was the initial setup?

The product's deployment phase is neither complex nor straightforward. As the software has evolved a lot, users can actually keep it even simpler by opting for a plug-and-play model.

The solution is deployed on an on-premises model.

The solution can be deployed in two or three days.

What about the implementation team?

I was involved with the tool's installation process.

What was our ROI?

I cannot comment on the tool's ROI since I did not use it for production purposes.

What's my experience with pricing, setup cost, and licensing?

I was using the product's free version.

What other advice do I have?

I did not come across any scenarios involving fault tolerance, because data consistency issues, like missing or incorrect values, are actually part of the system that feeds the data. As for missing values, I never tried the option that allows one to impute a missing value with another parameter.

As for whether I incorporated any emerging data streaming trends, such as the use of AI, into my Apache Kafka workflows: I use it as a local system. I have an EC2 server where I read the samples and then run the regression and reintegration model on top of them, but that is done locally and not on the cloud.

I recommend the product to those who plan to use it. I like Kafka and Flink, and I want to actually create a system in AWS mainly for real-time streaming so that I don't need to worry about multiple data copies.

Considering the improvements needed in the product's support, and the cloud integration capabilities, while looking at the simplicity during the installation phase, I rate the tool a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Barista Brewing Espresso at Linkedln
Real User
Great horizontal scaling, design with library simplicity
Pros and Cons
  • "Good horizontal scaling and design."
  • "Lacks elasticity and the ability to scale down."

What is our primary use case?

Our primary use case of this solution is for data integration and for real-time data consumption. I'm a senior staff engineer for data and infrastructure and we are customers of Apache. 

What is most valuable?

I love the simplicity of the library and the design as well as the architectural concept which is like horizontal scaling.

What needs improvement?

When compared to other commercial competitors, Kafka doesn't have the ability to scale down, the elasticity is lacking in the product. The other issue for us is the delayed queue, which was available to us in the commercial software but not in Kafka. It's something we use in most of our applications for deferred processing and I know it's available in other solutions. I'd like to see some tooling support and language support in the open source version. 
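
For readers unfamiliar with the term, a delayed queue holds each message back until a due time has passed. The sketch below is a toy in-memory version for illustration only; it is not a Kafka feature or API, and the `DelayQueue` class and message names are hypothetical. Time is passed in explicitly as `now` so the behavior is easy to follow.

```python
import heapq
import itertools


class DelayQueue:
    """Toy delay queue: items become visible only once their due time has passed."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def put(self, item, now, delay):
        heapq.heappush(self._heap, (now + delay, next(self._counter), item))

    def pop_due(self, now):
        """Return every item whose due time is <= now, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


queue = DelayQueue()
queue.put("send-reminder", now=0, delay=30)
queue.put("retry-payment", now=0, delay=5)

early = queue.pop_due(now=10)  # only the 5-second item is due
late = queue.pop_due(now=60)   # the 30-second item is due by now
print(early, late)
```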

For how long have I used the solution?

I've been using this solution for four years.

What do I think about the stability of the solution?

The stability is good. 

What do I think about the scalability of the solution?

The solution scales horizontally and scales better than its competitors. We have around 400 to 500 microservices consuming this cluster and the company has around 600 employees. We have four different verticals, each with around 100 engineers with 100 to 150 microservices. 90% of the microservices have a touchpoint with Kafka.

How are customer service and support?

I think the community is very good and will respond if you raise a ticket. We also use external third-party libraries that were built in GitHub. It would be good to have some direct support from Apache.

Which solution did I use previously and why did I switch?

Four years ago we were using RabbitMQ, but we switched to Kafka because RabbitMQ was designed for a very narrow use case. It became difficult for us to run and maintain that server and our client libraries. We had a huge outage, so we shifted to Kafka because of the simplicity of the architecture.

How was the initial setup?

The initial setup was simple although we had a couple of hiccups. It took around a week but that was several years ago and we haven't had any problems since. Our team carried out the deployment and we currently have a few engineers who deal with maintenance. 

What's my experience with pricing, setup cost, and licensing?

We are currently using the open-source version. 

What other advice do I have?

There is room for improvement with this solution so I rate it eight out of 10. 

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2116086 - PeerSpot reviewer
Senior Developer at a financial services firm with 10,001+ employees
Real User
Top 20
User-friendly solution but problems with latency
Pros and Cons
  • "Kafka's most valuable feature is its user-friendliness."
  • "There are some latency problems with Kafka."

What is our primary use case?

I primarily use Kafka in the investment banking sector to update prices and inform clients of updates.

What is most valuable?

Kafka's most valuable feature is its user-friendliness.

What needs improvement?

There are some latency problems with Kafka.

For how long have I used the solution?

I've been using Kafka for more than three years.

What other advice do I have?

I would give Kafka a rating of seven out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Enterprise Architect at Smals vzw
Real User
Top 20
Effective event sequencing, seamless system interactions, and beneficial data management
Pros and Cons
  • "There are numerous possibilities that can be explored. While it may be challenging to fully comprehend the potential advantages, one key aspect is the ability to establish a proper sequence of events rather than simply dealing with a jumbled group of occurrences. These events possess their own timestamps, even if they were not initially provided with one, and are arranged in a chronological order that allows for a clear understanding of the progression of the events."
  • "There have been some challenges with monitoring Apache Kafka, as there are currently only a few production-grade solutions available, which are all under enterprise license and therefore not easily accessible. I have not had access to any of these solutions and have instead relied on tools, such as Dynatrace, which do not provide sufficient insight into the Apache Kafka system. While there are other tools available, they do not offer the same level of real-time data as enterprise solutions."

What is our primary use case?

Apache Kafka is used as more than a messaging bus; it also serves as a database to store information. It functions as a streamer, similar to ETL, to manipulate and transform events before migrating them to other systems for use. The database can also act as a cache. Apache Kafka is used as a database broker, streamer, and source of truth for multiple systems due to its ability to maintain events for at least 10 days. It provides both synchronous and asynchronous communication, making it a complex system that would be easier to understand through diagrams or sketches.

We use reactive frameworks.

How has it helped my organization?

From my experience with Apache Kafka, one of the most notable advantages is its ability to maintain a comprehensive record of historical data that includes every update, alteration, and version of information, unlike a conventional relational database. This feature allows for seamless tracking and analysis of the progression and transformation of the data over time, enabling users to easily review and analyze the history of the information.

The solution has the capability for various systems to effortlessly interact with one another without prior knowledge of their existence, current operational status, or specific configurations. By utilizing service buses and dynamic integration, data can be distributed across networks and retrieved in a way that is most suitable for each system's requirements. In addition, Apache Kafka allows for the modification of data to provide diverse clients, consumers, or observers with unique and varying data. The replication of data can produce multiple versions, and this data can be adjusted to fit various needs. With the use of probes, one can alter the behavior of the transformation process, thereby changing the way in which data is transformed and the output produced. Overall, working with Apache Kafka has brought about an array of benefits, enabling seamless system interactions and allowing for the customization and modification of data to meet individual requirements.

What is most valuable?

There are numerous possibilities that can be explored. While it may be challenging to fully comprehend the potential advantages, one key aspect is the ability to establish a proper sequence of events rather than simply dealing with a jumbled group of occurrences. These events possess their own timestamps, even if they were not initially provided with one, and are arranged in a chronological order that allows for a clear understanding of the progression of the events.

What needs improvement?

There have been some challenges with monitoring Apache Kafka, as there are currently only a few production-grade solutions available, which are all under enterprise license and therefore not easily accessible. I have not had access to any of these solutions and have instead relied on tools, such as Dynatrace, which do not provide sufficient insight into the Apache Kafka system. While there are other tools available, they do not offer the same level of real-time data as enterprise solutions.

One additional area that I think could benefit from improvement is the deployment process on OpenShift. This particular deployment is quite challenging and requires the activation of certain security measures as well as integration with other systems. It's not a straightforward process and typically requires engineers who are highly skilled and have extensive experience with Apache Kafka to carry out these tasks. Therefore, I believe that there is a need for progress in this area, and some tools that can provide information, assistance, and help make the whole process easier would be greatly appreciated.

For how long have I used the solution?

I have been using Apache Kafka for approximately four years.

What do I think about the stability of the solution?

The solution is stable if you have set it up correctly.

What do I think about the scalability of the solution?

Apache Kafka is a scalable solution.

How are customer service and support?

I have not escalated any questions to technical support because Apache Kafka is an open-source system. However, Confluent and other companies sell support and enterprise solutions to make it more convenient and streamline the work. They offer tools, such as a monitoring tool with a visual interface, which provides a lot of information and buttons to press for correction or change without touching the code. Each of those buttons hypothetically could have helped the situation, but it is unclear what they do exactly, so it is best to call the data center and ask. If you buy their service, you have access to all the enterprise comforts.

How was the initial setup?

Setting up Apache Kafka is not an easy task, especially when trying to containerize it and make it controllable. This is because Apache Kafka has its own distributed mechanism for staying alive, checking readiness, replicating, and scaling. Ensuring that it complies with Kubernetes or OpenShift Orchestrator requires careful attention, as there is a risk of two masters attempting to perform the same task and ultimately undoing each other's work.

In comparison to Kubernetes, OpenShift is a highly skilled and advanced implementation infrastructure that automatically manages and orchestrates all the steps required for an application setup. It operates at a higher level of abstraction and eliminates the need for manual operations that are required with Kubernetes. While Kubernetes can run an application with some pipeline and configuration, OpenShift takes care of everything from finding the required images to creating ports and connecting databases. Although manual changes can be made, it's not necessary as OpenShift offers a much more coarse-grained management approach.

What about the implementation team?

One skillful DevOps engineer can implement the solution.

What's my experience with pricing, setup cost, and licensing?

Apache Kafka is an open-source solution.

What other advice do I have?

The maintenance of Apache Kafka is crucial due to the complexity of the system with numerous microservices and systems communicating through Apache Kafka, requiring proper integration and configuration to prevent overloading and ensure a healthy cluster. The task is not easy and requires knowledge of the various adjustable parameters, as misadjusting even one of them can greatly slow down the cluster. For example, if the consumer group changes frequently, the messages must be regrouped and reassigned, causing significant delays. Therefore, configuring Apache Kafka correctly is essential to avoid high latency issues.
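
The cost of a frequently changing consumer group can be seen in a simplified sketch. The code below is a toy round-robin assignment written for this illustration; it is not how Kafka's own assignors are implemented, and the function names, consumer names, and partition count are all assumptions. It shows that adding one consumer can change the owner of most partitions, and every moved partition has to be handed over before consumption resumes.

```python
def assign(partitions, members):
    """Distribute partitions round-robin over the sorted member list."""
    ordered = sorted(members)
    return {p: ordered[i % len(ordered)] for i, p in enumerate(partitions)}


def moved(before, after):
    """List the partitions whose owner differs between two assignments."""
    return [p for p in before if before[p] != after[p]]


partitions = list(range(6))
before = assign(partitions, ["consumer-a", "consumer-b"])
after = assign(partitions, ["consumer-a", "consumer-b", "consumer-c"])

changed = moved(before, after)
print(changed)
```

In this toy case, four of the six partitions change owner when a third consumer joins, which hints at why repeated membership changes can add noticeable delay.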

I would strongly suggest others give Apache Kafka a chance and explore the various advantages that it can offer, especially since it should not be perceived as a message bus or broker but rather an enterprise bus designed for data manipulation. It has the ability to transform data, store and reject it, and even maintain different versions of the same data simultaneously. Moreover, it operates on a pull mechanism rather than a push mechanism, which takes away the risk of losing data and places the responsibility for data loss on the consumer. On the other hand, it also ensures that the data is always available within the specified window and allows for easy replication of the past, which is extremely helpful in situations such as those involving a hacked bank database. With Apache Kafka, you can efficiently go back in time, obtain the required status and events, and make changes accordingly, without the need to go through each transaction separately. Thus, using this solution can make data management much more efficient and convenient.

I rate Apache Kafka an eight out of ten.

In order to improve its user-friendliness, engineer-friendliness, and DevOps-friendliness, the system must undertake various tasks, such as enhancing the overall operation and configuration, ensuring seamless integration with other systems, and adapting to security layers in a more comprehensive and generic manner. This will require significant efforts to make the system more functional, secure, and efficient.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Apache Kafka Report and get advice and tips from experienced pros sharing their opinions.
Updated: November 2024