Architect at Agence Française de Développement
Real User
Top 5
With phenomenal scalability, the setup phase needs to be made easier
Pros and Cons
  • "It is a stable solution...A lot of my experience indicates that Apache Kafka is scalable."
  • "The solution's initial setup process was complex."

What is our primary use case?

We use Kafka for Elastic Stack and Kafka SCRAM login.

I have many users of Apache Kafka. It's like a subject to study in enterprises. However, we have not decided if the systems should generalize Apache Kafka for every application and every IT system.

What is most valuable?

We use Kafka for mapping and ThoughtSpot data from one IT system source to the destination. We also prefer it to exchange data from our internal IT systems.


What needs improvement?

Kafka is a new method we opted to apply to our need for data exchange. Also, we use the solution's integration capabilities.

Improvement-wise, I would like the solution to have more integration capabilities. Also, the solution's setup, which is currently complex, should be made easier.


For how long have I used the solution?

I have experience with Apache Kafka.

Buyer's Guide
Apache Kafka
November 2024
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
817,354 professionals have used our research since 2012.

What do I think about the stability of the solution?

It is a stable solution.

What do I think about the scalability of the solution?

A lot of my experience indicates that Apache Kafka is scalable. We can have ten or even fifty hundred users on the solution. So, it's possible because we are a big enterprise.

How are customer service and support?

I have experience with Apache Kafka's technical support.


How was the initial setup?

The solution's initial setup process was complex. The deployment process took three or four years.

Right now, I can't deliver the planning process required for deployment.

For deployment and maintenance, we have a manager and an operational person. However, I can't give an exact count of the people required for deployment and maintenance.

What other advice do I have?

To be able to recommend Kafka to others, especially considering every context, we will have to set a benchmark and compare Kafka with other tools.

I rate the overall solution a seven out of ten.


Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Joaquin Marques - PeerSpot reviewer
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC
Real User
Top 5 Leaderboard
Excellent for heavy-duty data classification; should do away with configuration problems
Pros and Cons
  • "Kafka allows you to handle huge amounts of data and classify it into different categories. If you have huge amounts of data, Kafka is a very good solution for data classification."
  • "Kafka is a nightmare to administer."

What is our primary use case?

My primary use case for Apache Kafka is replacing ETL and doing data transformations.

How has it helped my organization?

Kafka allows you to handle huge amounts of data and classify it into different categories. If you have huge amounts of data, Kafka is a very good solution for data classification. When you need to route it in different directions, you have to take a look at the messages that you get, interfile them, and then send them to the correct place. Kafka is a good product to use in the backend.
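
The routing step described above (look at each message, classify it, send it to the correct place) can be sketched roughly as follows. This is a toy in-memory model in plain Python, not Kafka's API; the `ROUTES` table, the topic names, and the `category` field are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical routing table: message category -> destination topic name.
ROUTES = {"order": "orders-topic", "payment": "payments-topic"}
# Hypothetical catch-all destination for categories with no route.
DEFAULT_TOPIC = "unclassified-topic"


def route(messages):
    """Group messages by the destination chosen from their 'category' field."""
    destinations = defaultdict(list)
    for msg in messages:
        topic = ROUTES.get(msg.get("category"), DEFAULT_TOPIC)
        destinations[topic].append(msg)
    return dict(destinations)


inbox = [
    {"category": "order", "body": "new order 1"},
    {"category": "payment", "body": "invoice paid"},
    {"category": "audit", "body": "login event"},
]
routed = route(inbox)
```

In a real deployment the destinations would be Kafka topics and the classification logic would run in a stream-processing or consumer application; the sketch only shows the shape of the decision.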

What is most valuable?

The feature I find most valuable is the classification feature. Kafka enables you to tag content with a category.

What needs improvement?

Kafka contains two components. The component that handles synchronization between the rest of the components is older software, and it causes all kinds of configuration problems. Confluent, the company that sells a commercial version of Kafka, is moving away from that component precisely because of that. Kafka is a nightmare to administer.

In the next release, I would like to see that one troublesome component that causes configuration issues removed.

For how long have I used the solution?

I have been using Apache Kafka for a couple of years.

What do I think about the stability of the solution?

The stability of this solution depends on whether it is properly configured. Having said that, Kafka is incredibly complex to configure, set up, administer, and maintain.

What do I think about the scalability of the solution?

My opinion is that Apache Kafka is a scalable solution. In our organization, there are hundreds of thousands of users using Kafka.

How was the initial setup?

The initial setup was extremely complex. In our case, it took a team of 12 two months to deploy.

What about the implementation team?

These systems were installed by somebody else, not me.

What's my experience with pricing, setup cost, and licensing?

I would advise others to schedule a month or two to just set it up and have it up and running.

Which other solutions did I evaluate?

There are other options. For example, Databricks is a Kafka alternative. We decided to go with Kafka because one of our clients already chose Kafka.

While evaluating, we found out Databricks is more expensive, for the level of activity that Kafka handles (in this case, millions of requests per day). Databricks could do it, but it would be overly expensive.

I would rate Apache Kafka's pricing a seven out of ten, with one being cheap and 10 being very expensive.

What other advice do I have?

Since it has become so popular, large enterprises especially want to do it. For smaller enterprises, Kafka would probably be too expensive because they would have to hire people to maintain it.

I would rate the Apache Kafka solution a seven out of ten.

Which deployment model are you using for this solution?

Private Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Nor EL MALKI - PeerSpot reviewer
Project Manager at Leyton & Associés, SAS
Real User
Simple to scale, high performance, and low maintenance
Pros and Cons
  • "The most valuable feature of Apache Kafka is the clustering which is very easy to scale and we have multiple servers all over our platforms. It has been useful for stability and performance."
  • "Apache Kafka can improve by providing a UI for monitoring. There are third-party tools that can do it, but it would be nice if it was already embedded within Apache Kafka."

What is our primary use case?

We have a scalable architecture where we need multiple workers to handle some processing. To make it possible, the backend catches the request and puts it in a common medium, which is the queue of Apache Kafka. The workers then can share and process it.
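
As a rough illustration of how several workers can share one stream of requests, here is a minimal sketch. It is a toy model in plain Python, not the Kafka client API: the byte-sum partitioner, the round-robin assignment, and the worker and request names are simplifying assumptions. Kafka's own partitioner and group coordination work differently in detail, but the idea that each partition has exactly one owning worker within a group is the same.

```python
NUM_PARTITIONS = 4
WORKERS = ["worker-a", "worker-b"]  # hypothetical worker names


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition; a stable stand-in for a real hash partitioner."""
    return sum(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, workers):
    """Spread partitions round-robin so each partition has exactly one owner."""
    assignment = {w: [] for w in workers}
    for p in range(num_partitions):
        assignment[workers[p % len(workers)]].append(p)
    return assignment


# Backend side: append each incoming request to the partition chosen by its key.
partitions = [[] for _ in range(NUM_PARTITIONS)]
for request_id in ["req-1", "req-2", "req-3", "req-4", "req-5"]:
    partitions[partition_for(request_id)].append(request_id)

# Worker side: each worker reads only the partitions it owns.
assignment = assign_partitions(NUM_PARTITIONS, WORKERS)
processed = {
    worker: [req for p in owned for req in partitions[p]]
    for worker, owned in assignment.items()
}
```

Because no partition is owned by two workers, every request is handled by exactly one of them, and adding workers (up to the number of partitions) spreads the load further.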

What is most valuable?

The most valuable feature of Apache Kafka is the clustering which is very easy to scale and we have multiple servers all over our platforms. It has been useful for stability and performance.

What needs improvement?

Apache Kafka can improve by providing a UI for monitoring. There are third-party tools that can do it, but it would be nice if it was already embedded within Apache Kafka.

For how long have I used the solution?

I have been using Apache Kafka for approximately two years.

What do I think about the stability of the solution?

Apache Kafka is stable. We have not had any issues.

What do I think about the scalability of the solution?

The scalability of Apache Kafka is good. We have parts of the information we use in different geographical sites and it doesn't pose any problem.

How are customer service and support?

I have not used technical support.

Which solution did I use previously and why did I switch?

I previously used RabbitMQ. We switched because Apache Kafka was more stable and had better performance.

How was the initial setup?

The initial setup of Apache Kafka was easy because it is Dockerized. However, if you were to install it yourself it would be difficult. Having it Dockerized makes it worth it. 

The first deployment took approximately two hours. The updates of the solution can be done in a matter of minutes.

What about the implementation team?

Our DevOps team in our IT department did the deployment of the solution. It was mostly virtual work. The maintenance of the solution does not take a lot of time.

What's my experience with pricing, setup cost, and licensing?

We are using the free version of Apache Kafka.

What other advice do I have?

We had a good experience with the solution; the maintainability and scalability are good. I would recommend the solution to others.

I rate Apache Kafka a nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Teodor Muraru - PeerSpot reviewer
Developer at Emag
Real User
Top 5
Reliable solution for processing broker messages from many clients
Pros and Cons
  • "The most valuable feature is the messaging function and reliability."
  • "Something that could be improved is having an interface to monitor the consuming rate."

What is our primary use case?

I have a lot of messages, and we need to process those messages from many clients. Each client takes those messages and processes them.

I'm using the brokerage partner. I'm not storing or maintaining the application on servers. I'm just a client for the Apache Kafka server.

The solution is deployed on-prem.

How has it helped my organization?

Apache Kafka has improved our organization because it's more reliable than Rabbit. That's the whole point for us.

What is most valuable?

The most valuable feature is the messaging function and reliability.

What needs improvement?

Something that could be improved is having an interface to monitor the consuming rate. We use something, but I'm not sure if it's from Apache Kafka, or if it's a borrowed third-party solution. So, the interface for monitoring the processes is an additional feature that could be added.

For how long have I used the solution?

I have been using this solution for two years.

What do I think about the stability of the solution?

The solution is pretty stable compared to Rabbit or other brokers. 

What do I think about the scalability of the solution?

The solution is scalable. We have about 10 departments that use Kafka in various forms. Each department might have 5 or 10 people.

We use the solution all the time. We have consumers that consume messages that come every day because we have clients and customers for the main website. All of those messages go to Kafka clients. Our backend departments consume messages from the actions of the final customers.

Which solution did I use previously and why did I switch?

We used Rabbit and we switched to Kafka because it seemed like an upgrade in ability, reliability, and in the consuming process of broker messages.

How was the initial setup?

Implementations took half a year for everyone to learn the solution. It was quite lengthy.

What other advice do I have?

I would rate this solution 9 out of 10.

My advice is to take some time in investigating how to implement the solution.

It took us about half a year to implement in our organization. Someone who needs to implement Kafka has to be prepared for quite a lengthy process. Don't expect implementation to be completed in a week. It takes a little longer because it's complex.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Senior Technology Architect at a tech services company with 10,001+ employees
Real User
A resilient solution for metrics collection and monitoring
Pros and Cons
  • "Resiliency is great and also the fact that it handles different data formats."
  • "Some vendors don't offer extra features for monitoring."

What is our primary use case?

We use Apache Kafka for financial purposes. Every time one of our subscribed customers is due for an insurance payment, Apache Kafka sends an automated notification to the customer to let them know that their bill is due.

What is most valuable?

Resiliency is great and also the fact that it handles different data formats. There is one data format that's universal across multiple application domains — Avro. It's pretty universal compared to JSON, XML, SQI, and other formats.

What needs improvement?

Some vendors don't offer extra features for monitoring. Some come with Linux for default monitoring. Monitoring is very important. If something is not working properly, then our subscribers won't receive a notification. You then have to trace it back to Kafka and find the glitch or the messaging sequence that hasn't been racked up correctly.

It should support Avro — which handles different data formats — as a default data format. It would be much more flexible if it did.

For how long have I used the solution?

I have been using Apache Kafka for three years.

What do I think about the stability of the solution?

It seems to be quite stable.

What do I think about the scalability of the solution?

Apache Kafka is scalable. You can launch an additional server node, or broker, when you need one. Three nodes plus Zookeeper (the Kafka server management system) is optimal. If one of them goes down, you can automatically launch another one. You can go three servers or brokers back, because there is replication on each Kafka broker.
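
To make the point about copies on several brokers concrete, here is a small sketch of how replicas might be spread across three brokers so that losing one broker still leaves a copy of every partition. It is a toy placement model in plain Python, not Kafka's actual assignment algorithm; the broker names, the round-robin placement, and the replication factor of 2 are assumptions made for illustration.

```python
def place_replicas(num_partitions, brokers, replication_factor):
    """Place each partition's replicas on consecutive brokers, round-robin."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + r) % len(brokers)] for r in range(replication_factor)]
        for p in range(num_partitions)
    }


def surviving(placement, failed_broker):
    """Return the replicas that remain for each partition after one broker fails."""
    return {
        p: [b for b in replicas if b != failed_broker]
        for p, replicas in placement.items()
    }


brokers = ["broker-1", "broker-2", "broker-3"]  # hypothetical broker names
placement = place_replicas(3, brokers, replication_factor=2)
after_failure = surviving(placement, "broker-2")
```

With this layout every partition keeps at least one replica when any single broker is lost, which is why a replacement node can be started without losing data.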

How are customer service and technical support?

Apache Kafka is open-source. They don't offer technical support.

What other advice do I have?

On a scale from one to ten, I would give Apache Kafka a rating of eight.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user590451 - PeerSpot reviewer
Lead Engineer at a retailer with 10,001+ employees
Real User
We use the product for high-scale distributed messaging. Multiple consumers can sync with it and fetch messages.

What is most valuable?

We use the product for high-scale distributed messaging. The processing capability of the product is enormous. Being a distributed platform, multiple consumers can sync with it and fetch messages.

Another great feature is the consumer offset log which tells you where the consumer left and where he needs to start again. Consumers aren’t required to code and put extra effort to maintain the offset.
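
The offset mechanism mentioned above can be pictured with a short sketch. This is a toy in-memory model, not the Kafka consumer API: the `OffsetLog` class, the `consume` helper, and the `billing` group name are hypothetical. The idea is only that the committed position is stored outside the consumer, so a restarted consumer picks up where the previous one stopped.

```python
class OffsetLog:
    """Toy store of committed offsets, keyed by (consumer group, partition)."""

    def __init__(self):
        self._committed = {}

    def commit(self, group, partition, next_offset):
        self._committed[(group, partition)] = next_offset

    def position(self, group, partition):
        # A group that has never committed starts from the beginning.
        return self._committed.get((group, partition), 0)


def consume(offset_log, records, group, partition, max_records):
    """Read up to max_records from the committed position, then commit the new one."""
    start = offset_log.position(group, partition)
    batch = records[start:start + max_records]
    offset_log.commit(group, partition, start + len(batch))
    return batch


records = ["m0", "m1", "m2", "m3", "m4"]
offsets = OffsetLog()

# First run reads two records and then stops (for example, the process is restarted).
first = consume(offsets, records, "billing", 0, max_records=2)

# A new consumer in the same group resumes from the committed offset.
second = consume(offsets, records, "billing", 0, max_records=10)
```

The application code never tracks positions itself; it only asks for the next batch, which matches the reviewer's point that consumers do not have to maintain the offset by hand.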

How has it helped my organization?

We were using another commercial messaging engine, which was not scalable unless you paid more. Each hub that we provisioned was expensive. This solution is open source, which is much easier to use and doesn’t cost us anything.

What needs improvement?

This product guarantees at-least-once delivery. We have filed a request through JIRA for features such as at-most-once delivery to remove duplicate message consumption.
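
With at-least-once delivery, the same message can arrive more than once, so a common workaround is to make the consumer ignore repeats. The sketch below shows that idea in plain Python. It is an assumption-laden illustration, not a Kafka feature: it presumes each message carries a unique id (hypothetical here), and it keeps the set of seen ids in memory, whereas a real consumer would need to persist it.

```python
def process_once(deliveries, handler):
    """Apply handler to each message id at most once, skipping redeliveries."""
    seen = set()
    results = []
    for msg_id, payload in deliveries:
        if msg_id in seen:
            continue  # duplicate delivery of an already handled message
        seen.add(msg_id)
        results.append(handler(payload))
    return results


# Messages 2 and 1 are delivered twice, as can happen under at-least-once delivery.
deliveries = [(1, "a"), (2, "b"), (2, "b"), (3, "c"), (1, "a")]
handled = process_once(deliveries, str.upper)
```

The trade-off is extra state on the consumer side in exchange for each message taking effect only once.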

What do I think about the stability of the solution?

We haven’t faced any issues so far. Some of the clusters churn millions of records per second with ease.

What do I think about the scalability of the solution?

We have clustered environments and we haven’t seen any scalability issues. We can provision a new node in as little as 45 minutes.

How are customer service and technical support?

It is open source, so support is in our own hands. The only option is to make a new feature request through JIRA. When multiple people in the community make a request for a similar feature, it gets priority.

Which solution did I use previously and why did I switch?

We switched from a previous solution mainly to reduce costs and to have a more scalable solution.

How was the initial setup?

The initial setup was a bit complex in terms of how to manage it across data centers. But once it was set up, we never faced issues.

Which other solutions did I evaluate?

We evaluated multiple options, such as ActiveMQ and RabbitMQ. We leaned towards this solution.

What other advice do I have?

I would advise others to start with non-SSL implementations and try to do PoCs. Afterwards, they should move towards more secure features.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Arucy Lionel - PeerSpot reviewer
Co-Founder at Afriziki
Real User
Top 5 Leaderboard
Offers real-time processing and high scalability
Pros and Cons
  • "I use it for real-time processing workloads. So, in some instances, it's like IoT data. We need to put it into a data lake."
  • "For the original Kafka, there is room for improvement in terms of latency spikes and resource consumption. It consumes a lot of memory."

What is our primary use case?

Lots of real-time processing and high-velocity data are the use cases.

What is most valuable?

I'm happy with the scalability and the ability to kind of replay the topics if you wish. So, it can give you that flexibility.

What needs improvement?

For the original Kafka, there is room for improvement in terms of latency spikes and resource consumption. It consumes a lot of memory.


For how long have I used the solution?

I have been using it since 2019. 

What do I think about the stability of the solution?

I would rate the stability a seven out of ten. There are issues due to latency spikes and resource consumption. It varies quite a bit. It's not very stable. It is a powerful tool; it can work, but it can be problematic sometimes. And that's why I switched to Redpanda.

What do I think about the scalability of the solution?

I would rate the scalability a nine out of ten. One of our clients is an online casino; they have over two million end users. 

Which solution did I use previously and why did I switch?

I used RabbitMQ. I switched to Kafka because it is just capable of handling a lot more messages.

And that was because the original Kafka had some performance issues, some latency spikes, and things like that.

How was the initial setup?

The initial setup is easy because they provide documents. So, the documentation makes it easy to set up.

The deployment takes a few hours to set up a production environment and configure it in the cluster. It's pretty straightforward and pretty fast.

What about the implementation team?

I figured it out on my own.

What was our ROI?

There is an ROI. 

What's my experience with pricing, setup cost, and licensing?

If you use Confluent Cloud, it's expensive because it needs updates available in the platform, like AWS. But you only pay for what you use. So it's quite affordable considering the value it provides.

It is affordable for me. 

What other advice do I have?

Overall, I would rate the solution an eight out of ten. I would advise integrating Kafka with Redpanda. It's easier to work with for most people.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Lead Architect at a financial services firm with 1,001-5,000 employees
Real User
Good partition tolerance, message reliability, and API integration
Pros and Cons
  • "The main advantage is increased reliability, particularly with regard to data and the speed with which messages are published to the other side."
  • "One of the things I am mostly looking for is that once the message is picked up from Kafka, it should not be visible or able to be consumed by other applications, or something along those lines. That feature is not present, but it is not a limitation or anything of the sort; rather, it is a desirable feature. The next release should include a feature that prevents messages from being consumed by other applications once they are picked up by Kafka."

What is our primary use case?

We use it extensively in our data pushing, for analytics and all of this type of data that is pushed, rather than on a real-time and payment basis. However, we are using it for offline messages, pushing it for processing, and for heavy, heavy usage, rather than extensively using it for financial data.

What is most valuable?

The main advantage is increased reliability, particularly with regard to data and the speed with which messages are published to the other side. 

The connectivity from the application is straightforward, as is the API integration.

These are some of the most valuable features of this solution. 

In terms of partition tolerance, message reliability is also present, which is a very good feature from the customer's perspective.

What needs improvement?

The area for improvement in Kafka is difficult to say because it's a solid product that works well in its intended applications. And, we are looking for something that can be used as part of financial implementations, because we don't want too many messages to be delivered to the other side, which is one of the areas I am looking at as well.

One of the things I am mostly looking for is that once the message is picked up from Kafka, it should not be visible or able to be consumed by other applications, or something along those lines. That feature is not present, but it is not a limitation or anything of the sort; rather, it is a desirable feature.

The next release should include a feature that prevents messages from being consumed by other applications once they are picked up by Kafka.

Then there is message dependability, because a message is of no use if it cannot be consumed. Alternatively, if the message is consumed but not committed, it should not be recorded as consumed in the Kafka queues. This is a feature that MQs consistently provide: if the message is not committed, it is put back on the queue.

I have not seen that in Kafka.

For how long have I used the solution?

We have been using Apache Kafka for approximately three years in the organization.

I believe we are working with version 10. Confluent Kafka is what we are using.

What do I think about the stability of the solution?

It's a stable solution. Once completed, it is a very stable solution.

What do I think about the scalability of the solution?

The scalability is very good. It is scalable horizontally rather than vertically. 

It can scale up to any level horizontally. However, once it has been scaled out horizontally, it cannot be shrunk again when the requirement is reduced while some process is still taking place. That is one thing that is lacking.

I believe there are approximately 10 to 15 people who use it.

This is being used by the data migration, data team, data analytical team, and data engineer. It's being used by all application architects who are just looking into it, as well as middleware integrators and middleware application integrators.

We have big plans to increase the use of various other innovations and stuff like that. We are using it in relation to data activities. 

Also, we are only planning to use the financial part for publishing it, subscribing, and publishing a pop-up model for various use cases.

How are customer service and support?

Apache usually has a community deployment. If you use Apache or any other software, you will usually receive community support. Otherwise, some companies are taking it and beginning to process it. For example, in Kafka, there is a version of Confluent that they use and support. Or, as we call it, the Oracle Big Data platform.

It will be included with Hadoop, Spark, and other similar technologies. That is coming as, one of the back software packages that are part of that offering, and it is supported by Oracle. Depending on the type of open source, there are various types of support available. Other than the community, we will not receive assistance. Otherwise, it's free enterprise, and we can take it from Confluent or other vendors who offer similar products.

Which solution did I use previously and why did I switch?

Prior to implementing this solution, we were not using another solution. We have been using Kafka from the beginning for these use cases. However, we are using other queuing solutions, such as MQ, ActiveMQ, IBM IQ, and Q, but the use cases are different. This is primarily due to the large volume, faster processing, and other benefits of using Kafka.

How was the initial setup?

It is not deployed on-premises. 

We use Kafka as part of the OCI Oracle Cloud platform and the Oracle Big Data platform because Kafka is included.

The Apache Kafka setup will take some time because it is not simple, and we have a lot of other components to install. It's fine because we needed all the plugins and other things for the simple implementations, but the containers' implementation is simple. The only difference is that when it comes to Zookeeper, there are a lot of supporting applications running on top of it, such as Zookeeper. As part of their area, Apache Kafka is running on top of Zookeeper. What do they think? As part of their... manageability, the Kafka area, and Apache Zookeeper. As a result, everything must be removed. And it will be preferable if the implementation is simple.  I believe Confluent is doing this, but we have not yet begun.

The deployment, and configuration, will take one hour to complete. However, it is also dependent on the fact that you require a large number of configurations, which we have.

What about the implementation team?

The deployment was completed in-house.

Currently, there is a team of three to maintain this solution. There are application support personnel in charge of access control.

What's my experience with pricing, setup cost, and licensing?

It will be included in the Oracle-specific platform. It is approximately $600,000 USD.

What other advice do I have?

When it comes to Apache Kafka, you must understand how it works and what its internals are. There could be numerous challenges associated with the product and its entire life cycle. You will need a good understanding and knowledge of the configuration. Having a technical person who is knowledgeable in Kafka will be an advantage on an ongoing basis.

It's a very good solution. I would rate Apache Kafka a nine out of ten.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user

The high availability is valuable. It is robust, and we can rely on it for a huge amount of data.
