reviewer1421481 - PeerSpot reviewer
Solution Architect at a manufacturing company with 10,001+ employees
Real User
Good performance when a high throughput is required, but they need to implement a portal
Pros and Cons
  • "The processing power of Apache Kafka is good when you have requirements for high throughput and a large number of consumers."
  • "They need to have a proper portal to do everything because, at this moment, Kafka is lagging in this regard."

What is our primary use case?

I am a solution architect and I used Apache Kafka in this role.

What is most valuable?

The processing power of Apache Kafka is good when you have requirements for high throughput and a large number of consumers. 

What needs improvement?

They need to have a proper portal to do everything because, at this moment, Kafka is lagging in this regard. It could be used to do the preprocessing or the configurations, instead of directly doing it on the queues or the topics. If you look at Solace, for example, they have come up with a portal where you don't need to touch these activities. You don't need to access the platform beyond the portal.

For how long have I used the solution?

I have used Apache Kafka for between one and one and a half years.

Buyer's Guide
Apache Kafka
February 2025
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: February 2025.
838,713 professionals have used our research since 2012.

What do I think about the stability of the solution?

Apache Kafka is stable.

What do I think about the scalability of the solution?

This is certainly a scalable product. There are currently 30 or more people using it but we expect to scale beyond this. It is going to be an enterprise tool within the company.

How are customer service and support?

I am not directly interacting with the service people at this moment. It is limited for now because we are still exploring and effecting our architecture and design, and deciding how to align it with our existing strategy. There is not much progress in this regard and it will take more time.

Which solution did I use previously and why did I switch?

Prior to working with Apache Kafka, there was no messaging queue system. For many projects, they were using the Azure Event Hub, but it was not serving the purpose. So, we started moving towards Kafka, and that's why we have procured Confluent Kafka.

Several months ago, I stopped working on Apache Kafka. I am now working on Confluent Kafka. It was not my decision to switch solutions.

My current organization has chosen Confluent Kafka for various reasons. One is that we have a large number of streaming requirements, and Confluent Kafka has one more layer on top of Apache Kafka to handle this transformation and to connect with multiple other systems.

There are out-of-the-box features along with the KSQL features. For example, things like fetching the events are kind of query-based. So, that seems to be a good feature for our requirements. That is why we ultimately procured Confluent Kafka.

For some time, I have also worked with Solace and it has an advantage. Given that my core strength is integration, I work with integration platforms such as MuleSoft, Azure Functions, and TIBCO. Based on our requirements, I found that the event-driven APA implementation with Solace was easier.

Solace also has a top-notch solution for portal management and you register your producers, consumers, and preprocessing logic. All of these things are pretty easy to do. This is an area where Kafka could use some enhancement.

How was the initial setup?

I don't think that the initial setup was a complex process.

Which other solutions did I evaluate?

MQ messaging systems are not my core strength but for any integration platform where we have a large number of APIs and events, to integrate with an IoT platform, for example, I found Kafka is better than ActiveMQ.

I'm not getting into MQTT or other things, but when you compare ActiveMQ and Kafka, Kafka has done better.

What other advice do I have?

I think that many people are using Apache Kafka just as a publishing and subscription model, but I feel that Kafka is better than that. Furthermore, Confluent Kafka is even more than that.

Confluent Kafka is offering features that are equal to those of a data lake. You can do lots with data, and huge data can be persisted. However, many people are not using that feature. Rather than make use of persistence logic, they are pushing the messages and consuming them. Maybe if people were using it for persistence, they would see the impact or real power of Kafka.

I would rate this solution a seven out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Roger Sabourin, Senior Manager, Analyst Relations at a tech vendor with 201-500 employees
Real User

You're in luck, Solace's PubSub+ Event Portal for Kafka does all the things you're looking for, specifically for your Kafka environments, be they open source Kafka, Confluent or Amazon MSK.  Check it out, or request a free trial at https://solace.com/products/po...

it_user660627 - PeerSpot reviewer
Senior Software Engineering Consultant at a tech services company with 51-200 employees
Consultant
It offers throughput with built-in fault-tolerance and replication.
Pros and Cons
  • "Kafka, as compared with other messaging system options, is great for large scale message processing applications. It offers high throughput with built-in fault-tolerance and replication."
  • "Kafka requires non-trivial expertise with DevOps to deploy in production at scale. The organization needs to understand ZooKeeper and Kafka and should consider using additional tools, such as MirrorMaker, so that the organization can survive an availability zone or a region going down."

How has it helped my organization?

I used Kafka with a client to decouple applications with different availability profiles. Before using a messaging-based architecture with Kafka as the messaging system, the client used a coordinator application to fire off various posts to as many as eight other applications. With an application that's impacting at least a customer a second in airports, where the customers demand that the system always works, there were issues with ensuring high availability.

A typical way to calculate system availability is: Availability = Uptime/(Uptime + Downtime). Hence, where there are two applications involved with a 99% availability, the total system availability degrades quickly: 99% * 99% = 98.01%.

With eight applications, total availability caused issues. However, only two systems needed to provide real-time responses, while other systems were for payment processing, CRM, promotions, etc. It was OK if those systems were not up to date in real time.
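As a rough illustration of the arithmetic above, the sketch below multiplies per-application availabilities for a chain of synchronous calls in which every application must be up. The 99% figure comes from the example above; applying it to all eight applications is an assumption made only for illustration, and the function name is hypothetical.

```python
from math import prod

def serial_availability(availabilities):
    """Availability of a chain in which every application must be up."""
    return prod(availabilities)

# Two applications at 99% each, as in the example above.
two_apps = serial_availability([0.99, 0.99])

# Assumption for illustration: all eight applications are at 99%.
eight_apps = serial_availability([0.99] * 8)

print(f"2 applications: {two_apps:.2%}")
print(f"8 applications: {eight_apps:.2%}")
```

Under that assumption, eight applications in series come to roughly 92.3%, which shows how quickly the total degrades as more synchronous dependencies are added.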

Kafka allowed the client to have temporal decoupling for writes, i.e., the flaky third-party CRM system did not need to be available at the moment for us to respond to a user with a successful response. The availability concerns shifted to Kafka, which is a better trade off because it's built for this.

Another benefit, though not required, was the addition of logical decoupling between applications. Additional consumers could be built to overlay concerns of analytics, but the systems responsible for creating the entities on a given topic did not need to be aware of the analytics applications. This simplifies the interaction between applications and concerns of an organization.

Another benefit of this architecture is that testing is simplified. A given application needs to be tested to obey a contract of reading a message and producing another message. A Kafka topic acts as the boundary for an integration test.
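A minimal sketch of what such a contract check might look like, assuming a hypothetical application that reads an "order placed" message and produces a "payment requested" message. The message fields and function names are invented for illustration, and no Kafka client is involved: the dictionaries stand in for what would be read from and written to topics.

```python
def to_payment_request(order_placed):
    """Hypothetical application logic: one message in, one message out."""
    return {
        "type": "payment_requested",
        "order_id": order_placed["order_id"],
        "amount": order_placed["total"],
    }

def check_contract():
    # The incoming message is what the test would put on the input topic.
    incoming = {"type": "order_placed", "order_id": 42, "total": 19.99}
    # The outgoing message is what the test would expect on the output topic.
    outgoing = to_payment_request(incoming)
    assert outgoing["type"] == "payment_requested"
    assert outgoing["order_id"] == incoming["order_id"]
    assert outgoing["amount"] == incoming["total"]
    return outgoing

result = check_contract()
```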

What is most valuable?

Kafka, as compared with other messaging system options, is great for large scale message processing applications. It offers high throughput with built-in fault-tolerance and replication.

Messaging systems in general allow for logical and temporal decoupling between applications. Given Kafka's high availability, it's a great option to use if applications require availability, but not real-time processing.

If a downstream system is offline, messages can queue up and process when possible, but the user may not necessarily need to be aware of any issues.
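The idea of messages queuing up while a downstream system is offline can be sketched with a toy in-memory stand-in for a topic. This is not Kafka's API; the class, function, and field names are hypothetical and only illustrate the temporal decoupling.

```python
from collections import deque

class InMemoryTopic:
    """Toy stand-in for a topic: messages wait here until a consumer reads them."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def drain(self):
        """Yield all pending messages in the order they were published."""
        while self._messages:
            yield self._messages.popleft()

def handle_request(topic, order_id):
    # Respond to the user as soon as the topic has accepted the message;
    # the downstream system does not have to be online at this moment.
    topic.publish({"order_id": order_id})
    return "accepted"

crm_topic = InMemoryTopic()

# The downstream CRM system is offline, yet every request still succeeds.
responses = [handle_request(crm_topic, n) for n in (1, 2, 3)]

# Later, when the downstream system comes back, it works through the backlog.
processed = [message["order_id"] for message in crm_topic.drain()]
```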

A messaging-based architecture becomes important as a set of micro-services need to scale with high availability. Kafka is a great choice for messaging with such architecture.

What needs improvement?

Kafka requires non-trivial expertise with DevOps to deploy in production at scale. The organization needs to understand ZooKeeper and Kafka and should consider using additional tools, such as MirrorMaker, so that the organization can survive an availability zone or a region going down.

Shifting availability concerns to Kafka means that it cannot go down. It's important to understand the partitioning model and replication needs before relying on it for critical business functions. I'd suggest using it with a feature toggle for a non-critical path in production and learning from failure before relying on it.

While Kafka is built to scale, that does not mean that applications can start as many consumers or producers without consideration for how Kafka brokers will perform. Considerations about scaling out brokers need to occur before publishing millions of messages.

What do I think about the stability of the solution?

Generally, there were no stability issues. However, there was one scare in production when a consumer rebalance took 30 minutes and messages were not being processed during that time.

What do I think about the scalability of the solution?

We have not yet had scalability issues!

How are customer service and technical support?

There are specialized consulting companies in this space and there are online resources to read. That may help companies get past hurdles.

Which solution did I use previously and why did I switch?

No, we did not use a previous messaging system.

How was the initial setup?

The setup was complex. One must consider setting up ZooKeeper, Kafka, multi-zone/region availability, as well as typical associated functions for running it all in production. This includes monitoring, message schema changes (consider Avro), encrypting messages if that is a concern, and potentially authorization for different topics, depending upon the sensitivity of the data.

If an organization uses Kafka as the first messaging system, then the approach for application design must also shift significantly.

What's my experience with pricing, setup cost, and licensing?

It is open source software.

Which other solutions did I evaluate?

The client evaluated alternatives before I arrived, but I was not there during the evaluation so I cannot comment.

What other advice do I have?

Consider using a managed Kafka service, such as from Heroku.

If messaging is not a central component of the business and vendor lock-in is less of a concern, consider using something like Amazon's Kinesis. This can more rapidly provide the benefits of a messaging service without the pain of understanding it deeply, setting it up, and managing it.

It's important to use a lean approach to understand how it will break in production.

Implement a non-critical transaction with it.

Perhaps use a feature toggle within a facade and implement the behavior with the old approach and with Kafka to reduce risk.

Add it to one or two applications and monitor how it goes.

Figure out security, monitoring, scaling, schema migration, etc., before using it as a critical component in an application.
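The feature-toggle-within-a-facade suggestion above might be sketched as follows. All names are hypothetical, the two send functions are stand-ins for the old approach and a real Kafka producer, and falling back to the old path on failure is an added assumption rather than something stated in the advice.

```python
class OrderNotifier:
    """Facade that hides whether a notification uses the old path or Kafka."""

    def __init__(self, legacy_send, kafka_send, use_kafka=False):
        self._legacy_send = legacy_send
        self._kafka_send = kafka_send
        self.use_kafka = use_kafka  # the feature toggle

    def notify(self, event):
        if self.use_kafka:
            try:
                return self._kafka_send(event)
            except Exception:
                # Assumption: fall back to the old approach if the new path fails.
                return self._legacy_send(event)
        return self._legacy_send(event)

# Stand-in senders that only record what they were asked to send.
legacy_log, kafka_log = [], []

def legacy_send(event):
    legacy_log.append(event)
    return "legacy"

def kafka_send(event):
    kafka_log.append(event)
    return "kafka"

notifier = OrderNotifier(legacy_send, kafka_send)
first = notifier.notify({"id": 1})   # toggle off: old approach
notifier.use_kafka = True
second = notifier.notify({"id": 2})  # toggle on: Kafka path
```

Callers only ever see `notify`, so the toggle can be flipped for a non-critical transaction without touching the calling code.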

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Technical Lead at Interface Fintech Ltd
Real User
This very scalable solution works great and is super fast, but I would like less of a learning curve around creating brokers and topics
Pros and Cons
  • "The solution is very scalable. We started with a cluster of three and then scaled it to seven."
  • "I would like them to reduce the learning curve around the creation of brokers and topics. They also need to improve on the concept of the partitions."

What is our primary use case?

We use an open-source version of this solution, and we have two deployments of it. One is on-prem, and the other is in the cloud. We use the on-prem version to aggregate our logs. We use the cloud version to manage queues for financial services. 

What is most valuable?

It just works and it's super fast. We were struggling with a RabbitMQ cluster, so the Apache Kafka cluster is way easier.

What needs improvement?

I would like them to reduce the learning curve around the creation of brokers and topics. They also need to improve on the concept of the partitions. 

As for features, RabbitMQ has an instant response feature where you can send a message to a queue and get an instant response, but Kafka only has one way to send messages. If that's something they could improve on, it would be great.

For how long have I used the solution?

This is my second year working with this solution. 

What do I think about the stability of the solution?

I think it's very stable. I would rate the stability as a four or five out of five. 

What do I think about the scalability of the solution?

The solution is very scalable. We started with a cluster of three and then scaled it to seven. I would give the solution a five out of five for scalability. Currently, we have 20+ employees on the technical team that are using the solution. 

We provide outsource services for other institutions. There is a whole set queue management form, and we have about five institutions, with three technical teams that use the same cluster.

How was the initial setup?

There was a little learning curve, but we managed it. I think it took us around six weeks to complete the deployment. 

What about the implementation team?

We have a team of three people who handled the deployment in-house. They also handle the maintenance for the solution. 

What other advice do I have?

We do not use customer support, but there is a lot of documentation available.

I would definitely recommend this solution to other people. I would rate it as an eight out of ten. 

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
CTO at Estrada & Consultores
Real User
Great scalability with a high throughput and a helpful online community
Pros and Cons
  • "The solution is very easy to set up."
  • "While the solution scales well and easily, you need to understand your future needs and prep for the peaks."

What is our primary use case?

We primarily use the solution for upstreaming messages with different payloads for our applications, ranging from IoT and food delivery to patient monitoring.

For example, one solution provides real-time location tracking, whereby a customer of the food delivery service wants to know where his or her order is on a map. The delivery person's mobile phone starts publishing its location to Kafka, which processes it and then publishes it to subscribers, in this case the customer. This allows customers to see the information almost instantly.

How has it helped my organization?

Apache Kafka has become the main component in almost all of our distributed solutions. It has helped us deliver messages quickly to our customers' applications.

What is most valuable?

The solution is good for publishing transactions for commercial solutions whereby a duplicate will not affect any part of the system.

The solution is very easy to set up.

The stability is very good.

There's an online community available that can help answer questions or troubleshoot problems. 

The scalability of Kafka is very good.

It provides high throughput.

What needs improvement?

Kafka can allow for duplicates, which isn't as helpful in some of our scenarios. They need to work on their duplicate management capabilities but for now developers should ensure idempotent operations for such scenarios.
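One common way to make a consumer tolerate duplicates is to track the identifiers of messages that have already been applied. The sketch below assumes each message carries a unique `id` field and keeps the seen identifiers in memory. Both are assumptions for illustration; a real system would need durable storage for the identifiers.

```python
class IdempotentHandler:
    """Applies each message at most once, keyed by its (assumed) unique id."""

    def __init__(self):
        self._seen_ids = set()
        self.balance = 0

    def handle(self, message):
        """Apply a message once; return False when it is a duplicate."""
        if message["id"] in self._seen_ids:
            return False
        self._seen_ids.add(message["id"])
        self.balance += message["amount"]
        return True

handler = IdempotentHandler()

# The first message is delivered twice, as can happen with redelivery.
deliveries = [
    {"id": "a1", "amount": 10},
    {"id": "a2", "amount": 5},
    {"id": "a1", "amount": 10},
]
results = [handler.handle(message) for message in deliveries]
```

The duplicate delivery is ignored, so the balance reflects each message exactly once.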

While the solution scales well and easily, you need to understand your future needs and prep for the peaks. 

For how long have I used the solution?

I've been using the solution for four years so far.

What do I think about the stability of the solution?

The stability is excellent. There are no bugs or glitches. It doesn't crash or freeze. It's reliable. 

What do I think about the scalability of the solution?

Scaling is not really a problem with Kafka. We have used Kubernetes clusters and it is working very well. It scales up and down almost automatically, and almost unnoticeably to the consumers, based upon our configuration. Kafka is just one pod inside of our cluster that scales horizontally.

We have a couple of customers that also have vertical scaling, meaning that, there's more CPU, more memory available to the Kafka pod.

How are customer service and technical support?

For Kafka, we don't actually require support from the company. We usually have people experienced in-house and sometimes we just ask in the community. 

How was the initial setup?

The initial setup is easy. The majority of the tools today are really very easy to configure and set up. Docker containers and Kubernetes have made life easier for architects as well as developers.

Nowadays, you just install the container, and then you don't really have to manage the internals at the library or OS level. You just run the container. Everything is containerized.

What's my experience with pricing, setup cost, and licensing?

Apache Kafka is open source. You can set it up in your own Kubernetes cluster or subscribe to a Kafka provider online as a service.

What other advice do I have?

New users should understand the product's capabilities. Often, people will start putting their hands on new products without knowing the capabilities and the disadvantages in specific scenarios. In our case, for example, we haven't used Kafka for financial transaction processing, for which we still use IBM MQ, but it really depends upon your knowledge and experience with the product. My advice is to understand the product very well, including its pros and cons, and work from there.

Finally, I'd rate the solution a nine out of ten.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Owner at Binarylogicworks.com.au
Real User
Good performance and resilience, but it is complex and has a learning curve
Pros and Cons
  • "The most valuable feature is the performance."
  • "Kafka is complex and there is a little bit of a learning curve."

What is our primary use case?

I am a solution architect and this is one of the products that I implement for my customers.

Kafka works well when subscribers want to stream data for specific topics.

What is most valuable?

The most valuable feature is the performance.

What needs improvement?

Kafka is complex and there is a little bit of a learning curve.

For how long have I used the solution?

I have been using Apache Kafka for between one and two years.

What do I think about the stability of the solution?

Resilience-wise, Kafka is very good.

What do I think about the scalability of the solution?

Kafka is a very scalable system. You can have multiple, scalable architectures.

How are customer service and technical support?

I have not seen any problems with technical support. There is licensed support available, which is not the case with all open-source solutions. Open-source products often have issues when it comes to getting support.

Which solution did I use previously and why did I switch?

I have customers who were using IBM MQ but they have been switching to open-source.

How was the initial setup?

The initial setup was straightforward for me. However, it is not straightforward for everyone because there are some tricky things to implement. In single-mode it is a little bit easier, but when it is set up as a distributed system then it is more complex because there are a lot of things to be considered.

What's my experience with pricing, setup cost, and licensing?

Kafka is open-source and it is cheaper than any other product.

Which other solutions did I evaluate?

There is a competing open-source solution called NATS but I see that Apache Kafka is widely used in many places.

Performance-wise, Kafka is better than any of the other products.

What other advice do I have?

This is currently the product that I am recommending to customers. Some customers want an open-source solution.

There are some newer products that are coming on to the market that are even faster than Kafka but this solution is very resilient.

In the long run, I think that open-source will dominate the space.

I would rate this solution a seven out of ten.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user653562 - PeerSpot reviewer
Solutions Architect at a consultancy with 1,001-5,000 employees
Consultant
Has the ability to write data at one velocity and have subscribing consumers read at different velocities.
Pros and Cons
  • "Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it."
  • "The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance."

How has it helped my organization?

Kafka has a guaranteed delivery mechanism that is very easy to set up. When starting out with minimal hardware, it can handle very large data volumes. When prototyping and creating a proof of concept, Kafka has helped to speed up the timeline from the prototype all the way to production volumes.

What is most valuable?

Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it. I find the ability to write data at one velocity and have subscribing consumers read at different velocities to be the best feature.
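The commit-log idea can be illustrated with a toy append-only log in which each consumer keeps its own read position, so a fast reader and a slow reader do not affect each other or the writer. This is a simplified sketch with hypothetical names, not Kafka's implementation or API.

```python
class CommitLog:
    """Toy append-only log with an independent read offset per consumer."""

    def __init__(self):
        self._records = []
        self._offsets = {}

    def append(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records):
        """Return up to max_records from this consumer's current offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def lag(self, consumer):
        """Number of records this consumer has not read yet."""
        return len(self._records) - self._offsets.get(consumer, 0)

log = CommitLog()
for n in range(6):
    log.append(f"event-{n}")

fast = log.poll("analytics", max_records=6)  # reads everything at once
slow = log.poll("billing", max_records=2)    # reads only two records
```

Records are not removed when read, so the slower consumer can catch up later from its own offset.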

What needs improvement?

The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance.
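To show why the choice of shard (partition) key matters for performance, the sketch below counts how records spread over partitions for two candidate keys. The `crc32` hash is only a stand-in chosen for illustration (Kafka's Java client uses a murmur2 hash for keyed records), and the key names and record counts are invented.

```python
import zlib
from collections import Counter

def partition_for(key, num_partitions):
    # Stand-in hash for illustration only; not Kafka's partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def distribution(keys, num_partitions):
    """Count how many records land on each partition for the given keys."""
    return Counter(partition_for(key, num_partitions) for key in keys)

# A low-cardinality key: 90 of 100 records share the same key value.
by_country = distribution(["US"] * 90 + ["CA"] * 10, num_partitions=4)

# A high-cardinality key: every record has a distinct key value.
by_device = distribution([f"device-{n}" for n in range(100)], num_partitions=4)

print("by country:", dict(by_country))
print("by device: ", dict(by_device))
```

Records with the same key always land on the same partition, so a skewed key such as the country code concentrates at least 90 of the 100 records on a single partition regardless of the hash used.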

What do I think about the stability of the solution?

We did not have any issues with stability.

What do I think about the scalability of the solution?

We did not have any issues with scalability.

How are customer service and technical support?

  • Kafka is open source from LinkedIn and support comes from the community of users.
  • You can go with Confluent, the company that was founded by the original engineers from LinkedIn.
  • You can go with a cloud hosting service, like AWS EMR or Azure HDInsight.


    Which solution did I use previously and why did I switch?

    We used traditional message queues and file semaphores. There was a lot of overhead with asynchronous messages being put into an order and making sure nothing got dropped. It required a lot of code and maintenance.

    How was the initial setup?

    Since it is open source, you are on your own for setup. However, the tutorials from the Apache foundation and online sources have been an immense help.

    Getting started is very easy. The complexity of very large volumes of data and appropriate sharding, however, is difficult. There are fewer resources for tuning and best practices.

    What's my experience with pricing, setup cost, and licensing?

    When starting to look at a distributed message system, look for a cloud solution first. It is an easier entry point than an on-premises hardware solution. A lot of the complexity has already been taken care of. Both AWS and Azure have supported Kafka clusters that can be provisioned very easily.

    Which other solutions did I evaluate?

    We looked at RabbitMQ and Spark Streaming.

    What other advice do I have?

    Be sure to define the use cases as best as possible at first.

    Kafka is very good, but it is complex to support. It can handle any message size, whereas native cloud options have size limitations.

    Be sure to understand what messages will be sent and how many discrete topics will be needed.

    Be aware that you must code both producers and consumers.

    The bulk of the work is with the consumer.

    The Apache stack for Kafka is very open source. There are essentially no tools other than command-line options to monitor brokers and topic health. There are third-party tools that help with that, some free and some paid, but they require that you install agents on the servers hosting Kafka and open up ports for JMX MBeans in the scripts that start up the Kafka services. Additionally, you have to monitor ZooKeeper, which is very memory intensive. Cloud offerings that provide the whole modern data architecture stack, such as AWS EMR and Azure HDInsight, as well as Hortonworks and Cloudera, provide a console GUI as part of each of their offerings. Confluent, a company founded by the LinkedIn engineers who designed Kafka, also has a paid enterprise offering with much better tools for maintaining the Kafka cluster. But with Apache Kafka and the community, you are on your own.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    ShoaibKhan - PeerSpot reviewer
    Technical Specialist at APIZone
    Real User
    System for email and other small devices that allows for a continuous relay of transactions
    Pros and Cons
    • "This is a system for email and other small devices. There has been a relay of transactions continuously over the last two years it has been in production."
    • "The management overhead is higher compared to other messaging systems. There are challenges here and there. For example, with long usage, it requires restarting nodes from time to time."

    What is our primary use case?

    This is a system for email and other small devices. There has been a relay of transactions continuously over the last two years it has been in production.

    What is most valuable?

    Besides better stability and scalability, there are no additional functionalities I'd like to see. Kafka is good at what it does.

    What needs improvement?

    The management overhead is higher compared to other messaging systems. There are challenges here and there. For example, with long usage, it requires restarting nodes from time to time.

    For how long have I used the solution?

    We started using this solution two years ago.

    What do I think about the stability of the solution?

    There are issues with stability. It's not 100% stable like ActiveMQ, but it's maybe 98% stable.

    What do I think about the scalability of the solution?

    With the containerized version we have used, we have faced challenges with the scalability.

    How was the initial setup?

    Initial setup was not easy. It requires intermediate skills.

    What's my experience with pricing, setup cost, and licensing?

    This is an open-source version.

    What other advice do I have?

    I would rate this solution 8 out of 10.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    Technical Consultant at KPMG
    Real User
    It eases our current data flow and framework
    Pros and Cons
    • "It eases our current data flow and framework."
    • "Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc."

    What is our primary use case?

    It's convenient and flexible for almost all kinds of data producers. We integrated it with Kafka Streams, which can perform some easy data processing, like summary, count, group, etc.
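The kind of simple processing mentioned above (group, count, summary) can be sketched in plain Python. This is not the Kafka Streams API, which is a Java library; the event shape and the names below are assumptions used only to show the idea of a running aggregate per key.

```python
from collections import defaultdict

def summarize(events):
    """Group (key, value) events by key and keep a count and sum per key."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0})
    for key, value in events:
        totals[key]["count"] += 1
        totals[key]["sum"] += value
    return dict(totals)

# Hypothetical stream of (key, value) events.
events = [("sensor-a", 3), ("sensor-b", 7), ("sensor-a", 5)]
summary = summarize(events)
```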

    How has it helped my organization?

    It eases our current data flow and framework, which digests all types of sources regardless of it being structured or not.

    What is most valuable?

    • High availability
    • High throughput

    With such a large digest, I was genuinely impressed at the process being almost real-time.

    What needs improvement?

    Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc.

    For how long have I used the solution?

    Less than one year.
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user