Apache Kafka is used to connect components with each other within the same application. Our use is quite limited, but I was curious about its filtering capability.
Software Support & Development Engineer at a computer software company with 501-1,000 employees
Scalable and free to use
Pros and Cons
- "Apache Kafka is scalable. It is easy to add brokers."
- "Apache Kafka can improve by making the documentation more user-friendly. It would be beneficial if we could explain to customers in more detail how the solution operates, but the documentation gets highly technical quickly. For example, it would help to have a simple page where we can show customers how it works without the customer needing a computer science background."
What is our primary use case?
How has it helped my organization?
We implemented the notification system between our components, and we found that Apache Kafka performs well in scalability. It has improved our organization because of the scalability and the comfort of a fail-safe or disaster recovery it provides.
What needs improvement?
Apache Kafka can improve by making the documentation more user-friendly. It would be beneficial if we could explain to customers in more detail how the solution operates, but the documentation gets highly technical quickly. For example, it would help to have a simple page where we can show customers how it works without the customer needing a computer science background.
For how long have I used the solution?
I have been using Apache Kafka for approximately two years.
Buyer's Guide
Apache Kafka
November 2024
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
817,354 professionals have used our research since 2012.
What do I think about the scalability of the solution?
Apache Kafka is scalable. It is easy to add brokers.
We have approximately 30 people using this solution in my organization. They use the solution daily.
Which solution did I use previously and why did I switch?
I have only used Apache Kafka.
How was the initial setup?
The initial setup of Apache Kafka took some time, but after that it was easy.
I rate the initial setup of Apache Kafka a three out of five.
What about the implementation team?
We set up the solution in-house.
What's my experience with pricing, setup cost, and licensing?
This is an open-source solution and is free to use.
What other advice do I have?
We have not used the solution in production. We do not have a lot of data at the moment.
I would recommend this solution to others.
I rate Apache Kafka an eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Technical Lead at Interface Fintech Ltd
This very scalable solution works great and is super fast, but I would like less of a learning curve around creating brokers and topics
Pros and Cons
- "The solution is very scalable. We started with a cluster of three and then scaled it to seven."
- "I would like them to reduce the learning curve around the creation of brokers and topics. They also need to improve on the concept of the partitions."
What is our primary use case?
We use an open-source version of this solution, and we have two deployments of it. One is on-prem, and the other is in the cloud. We use the on-prem version to aggregate our logs. We use the cloud version to manage queues for financial services.
What is most valuable?
It just works and it's super fast. We were struggling with a RabbitMQ cluster, so the Apache Kafka cluster is way easier.
What needs improvement?
I would like them to reduce the learning curve around the creation of brokers and topics. They also need to improve on the concept of the partitions.
As for features, RabbitMQ has an instant response feature where you can send a message and get an instant response, but Kafka only sends messages one way. If that's something they could improve on, it would be great.
For how long have I used the solution?
This is my second year working with this solution.
What do I think about the stability of the solution?
I think it's very stable. I would rate the stability as a four or five out of five.
What do I think about the scalability of the solution?
The solution is very scalable. We started with a cluster of three and then scaled it to seven. I would give the solution a five out of five for scalability. Currently, we have 20+ employees on the technical team that are using the solution.
We provide outsource services for other institutions. There is a whole set queue management form, and we have about five institutions, with three technical teams that use the same cluster.
How was the initial setup?
There was a little learning curve, but we managed it. I think it took us around six weeks to complete the deployment.
What about the implementation team?
We have a team of three people who handled the deployment in-house. They also handle the maintenance for the solution.
What other advice do I have?
We do not use customer support, but there is a lot of documentation available.
I would definitely recommend this solution to other people. I would rate it as an eight out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
CEO & Founder at Xautomata
Allows us to ingest a lot of data and make tech decisions in real time
Pros and Cons
- "The stability is very nice. We currently manage 50 million events daily."
- "The repository isn't working very well. It's not user friendly."
What is our primary use case?
We use Apache Kafka to ingest a lot of data in real time that Apache Spark processes, and the result is used for a tech decision in real time in the IT environment, infrastructure environment, and IoT environment, such as a manufacturing plant.
This is an open-source framework. We also sell professional services on this solution and specifically create a business application for customers.
The application is called Sherlogic. We have two kinds of customers. We have end-user customers that use the Sherlogic solution, and maybe customers don't know that there is Spark and Kafka in Sherlogic. But we have another kind of customer that uses professional services by Xautomata to create tailor-made applications in analytics and the automation process.
We use Apache Kafka for our digital cloud.
What needs improvement?
To store a large set of analytical data, we are using a SQL repository. This type of repository works very well, but it needs specific, intensive maintenance. The user experience is friendly.
We are looking for alternative solutions. We tried NoSQL solutions and Confluent-specific features, but the results were not satisfactory in terms of either performance or usability.
We are working on automated SQL repository management and maintenance tools in order to increase the democratization of our platform.
For how long have I used the solution?
We've been using this solution for a year and a half.
What do I think about the stability of the solution?
The stability is very nice. We currently manage 50 million events daily.
What do I think about the scalability of the solution?
It's scalable.
How are customer service and support?
Support is good. It's typical for an open-source application. You can find all the information in a public portal. If you want specific consulting, there is a company that promotes this consulting worldwide called Confluent. Their consulting is quick and they have a lot of know-how.
How was the initial setup?
It's very complex, like Spark.
Deployment took 50 minutes for all the Kubernetes ports, Spark, Kafka, and other components based on Sherlogic. In 30 minutes, we created an environment using this program to make installation easier.
What about the implementation team?
Deployment was done in-house, but we're starting a collaboration with another company and we introduced this company to running this solution. Specifically, we started a collaboration with AWS to promote our platform in a Western marketplace. In this way, it's very easy to use our solution because it is part of an AWS service, certified by an engineer.
What was our ROI?
The return on investment has been having people dedicated to this solution because it's open source so it hasn't been necessary to invest in licensing or pay a fee. So, internal know-how has been the ROI.
What's my experience with pricing, setup cost, and licensing?
It's a bit cheaper compared to other queue applications.
What other advice do I have?
I would rate this solution 7 out of 10.
I would recommend this solution because the queue manager is very fast and stable.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
CTO at Estrada & Consultores
Great scalability with a high throughput and a helpful online community
Pros and Cons
- "The solution is very easy to set up."
- "While the solution scales well and easily, you need to understand your future needs and prep for the peaks."
What is our primary use case?
We primarily use the solution for upstreaming messages with different payloads for our applications, ranging from IoT and food delivery to patient monitoring.
For example, one solution provides real-time location finding, whereby a customer of the food delivery solution wants to know where his or her order is on a map. The delivery person's mobile phone starts publishing its location to Kafka, and Kafka processes it and then publishes it to subscribers, in this case the customer. It allows them to see information almost instantly.
How has it helped my organization?
Apache Kafka has become the main component in almost all our distributed solutions. It has helped us deliver messages quickly to our customers' applications.
What is most valuable?
The solution is good for publishing transactions for commercial solutions whereby a duplicate will not affect any part of the system.
The solution is very easy to set up.
The stability is very good.
There's an online community available that can help answer questions or troubleshoot problems.
The scalability of Kafka is very good.
It provides high throughput.
What needs improvement?
Kafka can allow for duplicates, which isn't helpful in some of our scenarios. They need to work on their duplicate management capabilities, but for now developers should ensure idempotent operations for such scenarios.
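As a rough illustration of what "idempotent operations" can mean on the consuming side, the sketch below applies each message at most once by remembering the IDs it has already processed. It is an in-memory simulation only and does not use any Kafka client. The `Payment` type, the `message_id` field, and the `IdempotentLedger` class are hypothetical names introduced for this example, and it assumes the producer attaches a unique ID to every message.

```python
from dataclasses import dataclass, field


@dataclass
class Payment:
    message_id: str  # hypothetical unique ID assumed to be set by the producer
    account: str
    amount: int


@dataclass
class IdempotentLedger:
    balances: dict = field(default_factory=dict)
    seen_ids: set = field(default_factory=set)

    def apply(self, msg: Payment) -> bool:
        """Apply a payment once; return False if it was a duplicate."""
        if msg.message_id in self.seen_ids:
            return False  # already processed, so a redelivery changes nothing
        self.balances[msg.account] = self.balances.get(msg.account, 0) + msg.amount
        self.seen_ids.add(msg.message_id)
        return True


ledger = IdempotentLedger()
deliveries = [
    Payment("m-1", "alice", 50),
    Payment("m-2", "bob", 20),
    Payment("m-1", "alice", 50),  # the same message delivered a second time
]
results = [ledger.apply(m) for m in deliveries]
```

In a real system the set of seen IDs would have to be stored durably, ideally in the same transaction as the state change; this sketch keeps it in memory for brevity.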
While the solution scales well and easily, you need to understand your future needs and prep for the peaks.
For how long have I used the solution?
I've been using the solution for four years so far.
What do I think about the stability of the solution?
The stability is excellent. There are no bugs or glitches. It doesn't crash or freeze. It's reliable.
What do I think about the scalability of the solution?
Scaling is not really a problem with Kafka. We have used Kubernetes clusters and it is working very well. It scales up and down almost automatically, and almost unnoticeably to the consumers, based upon our configuration. Kafka is just one pod inside of our cluster that scales horizontally.
We have a couple of customers that also have vertical scaling, meaning that there's more CPU and more memory available to the Kafka pod.
How are customer service and technical support?
For Kafka, we don't actually require support from the company. We usually have people experienced in-house and sometimes we just ask in the community.
How was the initial setup?
The initial setup is easy. The majority of the tools today are really very easy to configure and set up. Docker containers and Kubernetes have made life easier for architects as well as developers.
Nowadays, you just install the container, and then you don't have to really manage the internals at libraries, OS levels, et cetera. You just run the container. Everything is containerized.
What's my experience with pricing, setup cost, and licensing?
Apache Kafka is open source. You can set it up in your own Kubernetes cluster or subscribe to a Kafka provider online as a service.
What other advice do I have?
New users should understand the product capabilities. Often, people will start putting their hands on new products without knowing the capabilities and the disadvantages in specific scenarios. In our case, for example, we haven't used Kafka for financial transaction processing, for which we still use IBM MQ, but it really depends upon your knowledge and experience with the product. My advice is to understand the product very well, including its pros and cons, and work from there.
Finally, I'd rate the solution at a nine out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Owner at Binarylogicworks.com.au
Good performance and resilience, but it is complex and has a learning curve
Pros and Cons
- "The most valuable feature is the performance."
- "Kafka is complex and there is a little bit of a learning curve."
What is our primary use case?
I am a solution architect and this is one of the products that I implement for my customers.
Kafka works well when subscribers want to stream data for specific topics.
What is most valuable?
The most valuable feature is the performance.
What needs improvement?
Kafka is complex and there is a little bit of a learning curve.
For how long have I used the solution?
I have been using Apache Kafka for between one and two years.
What do I think about the stability of the solution?
Resilience-wise, Kafka is very good.
What do I think about the scalability of the solution?
Kafka is a very scalable system. You can have multiple, scalable architectures.
How are customer service and technical support?
I have not seen any problems with technical support. There is licensed support available, which is not the case with all open-source solutions. Open-source products often have issues when it comes to getting support.
Which solution did I use previously and why did I switch?
I have customers who were using IBM MQ but they have been switching to open-source.
How was the initial setup?
The initial setup was straightforward for me. However, it is not straightforward for everyone because there are some tricky things to implement. In single-mode it is a little bit easier, but when it is set up as a distributed system then it is more complex because there are a lot of things to be considered.
What's my experience with pricing, setup cost, and licensing?
Kafka is open-source and it is cheaper than any other product.
Which other solutions did I evaluate?
There is a competing open-source solution called NATS but I see that Apache Kafka is widely used in many places.
Performance-wise, Kafka is better than any of the other products.
What other advice do I have?
This is currently the product that I am recommending to customers. Some customers want an open-source solution.
There are some newer products that are coming on to the market that are even faster than Kafka but this solution is very resilient.
In the long run, I think that open-source will dominate the space.
I would rate this solution a seven out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Building Event-centric Data processing Architectures at a tech services company with 51-200 employees
The product is scalable and provides good connectors, but the ability to connect the producers and consumers must be improved
Pros and Cons
- "The connectors provided by the solution are valuable."
- "The ability to connect the producers and consumers must be improved."
What is our primary use case?
We use the solution for analytics for streaming. We also use it for fraud detection.
What is most valuable?
The Kafka Streams library gives quite a bit of functionality. The connectors provided by the solution are valuable.
What needs improvement?
The ability to connect the producers and consumers must be improved. It's still a pain point because a lot of development goes into it.
For how long have I used the solution?
I have been using the solution for seven to eight years.
What do I think about the stability of the solution?
For what it does, the tool is very stable. It is a message broker. It receives the messages and holds them for producers and consumers. It's usually everything around Kafka that has stability problems because Kafka does exactly what it's supposed to do.
What do I think about the scalability of the solution?
Scalability is one of the main selling points of the tool. The additional nodes we add give us the additional storage capacity we need. I rate the scalability a ten out of ten. The solution is used across multiple domains in our organization. I use the product daily. It’s a continuously growing platform.
How are customer service and support?
Apache doesn't provide support. There are sites we can go to for information, but there's no support team for Apache. There are companies like Confluent and HPE that provide support for the solution.
Which solution did I use previously and why did I switch?
We also use Flink and other streaming tools. We use Apache Kafka in addition to other technologies because of the requirement and the business use cases.
How was the initial setup?
It is super easy to set up. I rate the ease of setup a ten out of ten. However, building and administration get quite difficult. It takes three months to make things production-ready.
What about the implementation team?
The deployment was done in-house. We used the tools that we have in our CI/CD pipeline. We needed three people for the deployment. The infrastructure team maintains the tool. The infrastructure team has three to ten members.
What was our ROI?
We see an ROI on the product. If we don't have a tool to buffer the amount of traffic coming in from high-traffic sites, we cannot use the data. Apache Kafka gives us a resting area where we can push as much information as we want to. It’s picked up by consumers when they need it.
It’s a huge return on investment. Otherwise, we must have a system tied to the producer waiting for the consumer to consume before we can do anything with the rest of the messages. A solution like Kafka provides us with a buffer to consume the data as we choose to.
What's my experience with pricing, setup cost, and licensing?
The price depends on who we are getting the product from. If we buy it from Confluent, we always have to try to negotiate the price. The price is always negotiable.
What other advice do I have?
Overall, I rate the product a six out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
CEO and Founder at BAssure Solutions
Plenty of adapters, beneficial for enterprises, and high availability
Pros and Cons
- "Apache Kafka has good integration capabilities and has plenty of adapters in its ecosystem if you want to build something. There are adapters for many platforms, such as Java, Azure, and Microsoft's ecosystem. Other solutions, such as Pulsar have fewer adapters available."
- "Pulsar gives more scalability to an even grouping, but Apache Kafka is used more if you want to send something in a time series-based. If this does not matter to you then Pulsar could be more customizable. Apache Kafka is nothing but a streaming system with local storage."
What is our primary use case?
We are building solutions on Apache Kafka for four customers. The customers we have are in various sectors, such as healthcare and architecture.
What is most valuable?
Apache Kafka has good integration capabilities and has plenty of adapters in its ecosystem if you want to build something. There are adapters for many platforms, such as Java, Azure, and Microsoft's ecosystem. Other solutions, such as Pulsar have fewer adapters available.
For how long have I used the solution?
I have been using Apache Kafka for three years.
What do I think about the stability of the solution?
Apache Kafka is stable.
What do I think about the scalability of the solution?
I would recommend Apache Kafka for any enterprise.
The amount of people using the solution depends on the application. However, the starting point is from 6,000 to 7,000 concurrent users.
How are customer service and support?
There is no support; Apache Kafka is open source.
Which solution did I use previously and why did I switch?
We have been experimenting with other solutions such as VMware RabbitMQ and Pulsar.
We are going to replace the Apache Kafka solution using Pulsar.
Pulsar gives more scalability to an even grouping, but Apache Kafka is used more if you want to send something in a time series-based. If this does not matter to you then Pulsar could be more customizable. Apache Kafka is nothing but a streaming system with local storage. Apache Kafka fits into many use cases, it's very direct, but if you want more specific use cases and you use Apache Kafka, Pulsar could be considered.
How was the initial setup?
Apache Kafka was simple to install. If you have a complicated clustered production, it takes time. However, for the development, it doesn't take more than one or two hours.
What about the implementation team?
We have approximately two to four technical managers that are deploying and supporting Apache Kafka. A technical manager is necessary.
What's my experience with pricing, setup cost, and licensing?
Apache Kafka is an open-sourced solution. There are fees if you want the support, and I would recommend it for enterprises. There are annual subscriptions available.
What other advice do I have?
Apache Kafka is one of the best open-source solutions that are available today.
I would recommend this solution to others.
I rate Apache Kafka an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Solutions Architect at a consultancy with 1,001-5,000 employees
Has the ability to write data at one velocity and have subscribing consumers read at different velocities.
Pros and Cons
- "Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it."
- "The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance."
How has it helped my organization?
Kafka has a guaranteed delivery mechanism that is very easy to set up. When starting out with minimal hardware, it can handle very large data volumes. When prototyping and creating a proof of concept, Kafka has helped to speed up the timeline from the prototype all the way to production volumes.
What is most valuable?
Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it. I find the ability to write data at one velocity and have subscribing consumers read at different velocities to be the best feature.
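The commit-log idea can be sketched in a few lines: records are appended once, and each reader keeps its own offset, so a slow reader never holds back a fast one. This is a single-process, in-memory illustration under simplifying assumptions (one log, no partitions, no persistence). The `CommitLog` and `Consumer` classes are hypothetical names and are not part of any Kafka client library.

```python
class CommitLog:
    """Append-only list of records; readers track their own position."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read(self, offset: int, max_records: int) -> list:
        # Reading never removes anything, so other readers are unaffected.
        return self._records[offset:offset + max_records]


class Consumer:
    def __init__(self, log: CommitLog, batch_size: int):
        self.log = log
        self.batch_size = batch_size  # how much this reader takes per poll
        self.offset = 0               # this reader's private position
        self.received = []

    def poll(self) -> None:
        batch = self.log.read(self.offset, self.batch_size)
        self.received.extend(batch)
        self.offset += len(batch)


log = CommitLog()
fast = Consumer(log, batch_size=5)
slow = Consumer(log, batch_size=1)

for i in range(10):
    log.append(f"event-{i}")
    if i % 5 == 4:  # both consumers poll after every fifth write
        fast.poll()
        slow.poll()
```

After the loop, `fast` has read all ten records while `slow` has read only two. Because the log retains every record, `slow` can keep polling later and still see the same sequence in the same order.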
What needs improvement?
The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance.
What do I think about the stability of the solution?
We did not have any issues with stability.
What do I think about the scalability of the solution?
We did not have any issues with scalability.
How are customer service and technical support?
- Kafka is open source from LinkedIn and support comes from the community of users.
- You can go with Confluent, the company that was founded by the original engineers from LinkedIn.
- You can go with a cloud hosting service, like AWS EMR or Azure HDInsight.
Which solution did I use previously and why did I switch?
We used traditional message queues and file semaphores. There was a lot of overhead with asynchronous messages being put into an order and making sure nothing got dropped. It required a lot of code and maintenance.
How was the initial setup?
Since it is open source, you are on your own for setup. However, the tutorials from the Apache foundation and online sources have been an immense help.
Getting started is very easy. The complexity of very large volumes of data and appropriate sharding, however, is difficult. There are fewer resources for tuning and best practices.
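To make the sharding concern concrete, the sketch below shows how the choice of key affects how evenly records spread across partitions. It is only an approximation: `zlib.crc32` stands in for a stable hash and is not the hash function Kafka's own clients use, and the `partition_for` and `partition_load` helpers, the order IDs, and the region values are all hypothetical.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in stable hash for illustration; not Kafka's actual partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_load(keys, num_partitions: int) -> Counter:
    """Count how many records land on each partition for the given keys."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# High-cardinality key: many distinct (hypothetical) order IDs.
by_order = partition_load([f"order-{i}" for i in range(1000)], 4)

# Low-cardinality key: a (hypothetical) region field with only two values.
by_region = partition_load(["eu" if i % 2 else "us" for i in range(1000)], 4)
```

With only two distinct key values, `by_region` can never occupy more than two of the four partitions, so at least half of the partitions sit idle. A key with many distinct values gives the hash a chance to spread the load, at the cost of only guaranteeing ordering per key.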
What's my experience with pricing, setup cost, and licensing?
When starting to look at a distributed message system, look for a cloud solution first. It is an easier entry point than an on-premises hardware solution. A lot of the complexity has already been taken care of. Both AWS and Azure have supported Kafka clusters that can be provisioned very easily.
Which other solutions did I evaluate?
We looked at RabbitMQ and Spark Streaming.
What other advice do I have?
Be sure to define the use cases as best as possible at first.
Kafka is very good, but it is complex to support. It can handle any message size, whereas native cloud options have size limitations.
Be sure to understand what messages will be sent and how many discrete topics will be needed.
Be aware that you must code both producers and consumers.
The bulk of the work is with the consumer.
The Apache stack for Kafka is very open source. There are essentially no tools other than command-line options to monitor broker and topic health. There are third-party tools that will help with that, some free and some paid, but they require that you install agents on the servers hosting Kafka and open up ports for MBeans in the scripts that start up the Kafka services. Additionally, you have to monitor ZooKeeper, which is very memory intensive. Cloud offerings that provide the whole modern data architecture stack, like AWS EMR and Azure HDInsight as well as Hortonworks and Cloudera, provide a console GUI as part of each of their offerings. Confluent, a company founded by the LinkedIn engineers who designed Kafka, also has a paid enterprise offering with much better tools for maintaining the Kafka cluster. But with Apache Kafka and the community, you are on your own.
Disclosure: I am a real user, and this review is based on my own experience and opinions.