What needs improvement with Apache Kafka?

Apache Kafka is an open-source distributed streaming platform that serves as a central hub for handling real-time data streams. It allows efficient publishing, subscribing, and processing of data from various sources like applications, servers, and sensors. Kafka's core benefits include high scalability for big data pipelines, fault tolerance ensuring continuous operation despite node failures, low latency for real-time applications, and decoupling of data producers from consumers. Key...

Snehasish Das Technology Leader at eTCaaS · Answer 1 · 2024-12-26T16:29:00Z

In the data sharing space, the performance of Apache Kafka could be improved. The performance angle is critical, and while it works in milliseconds, the goal is to move towards microseconds.

Kemal Duman Team Lead, Data Engineering at Nesine.com · Answer 2 · 2024-12-02T12:37:00Z

Config management can be better. We are always trying to find the best configs, which is a challenge.

Rotem Fogel R&D Director at Coralogix · Answer 3 · 2024-11-18T07:42:00Z

Kafka requires fine-tuning to find the best architecture, number of nodes, and partitions for your use case. It’s a trial-and-error process with no one-size-fits-all solution. Issues may arise until it’s appropriately tuned. While it can scale out efficiently, scaling down is more challenging, making deleting data or reducing activity harder.

score 0 · Answer 4 · 2024-10-25T14:12:00Z

Confluent has improved aspects like documentation and cloud support, yet Kafka's reliance on older architectures like ZooKeeper in previous versions is a limitation. Its language and architecture could be further improved to solve issues in consensus algorithms, as Redpanda does.

Eyob Alemu Technical Director at NIDP · Answer 5 · 2024-10-02T14:30:00Z

Kafka has some limitations in terms of queue management. Specifically, it lacks the capability to handle larger queues for external system interactions. It would be beneficial if Kafka included more robust, high-capacity queue management features for integration with external systems.

score 0 · Answer 6 · 2024-05-02T10:25:11Z

One complexity I faced with the tool is that, because it is not really a stand-alone application, it won't integrate natively with cloud platforms like AWS or Azure. Apache Kafka has another mask on it. Grafana, by contrast, is available as a direct service: you can use it as a stand-alone tool with Grafana Cloud, or use a mix of AWS and Grafana, and there is not much difference. I expect Apache Kafka to have the same nature as Grafana. The product's support and cloud integration capabilities are areas of concern where improvements are required.

Lucas Dreyer Data Engineer at BBD · Answer 7 · 2024-04-30T08:08:00Z

The main challenge we faced while integrating Apache Kafka with other tools was setting up SSL and securing connections. Managing certificate changes and ensuring all clients connect smoothly, especially outside Kubernetes environments, posed ongoing challenges. Once initially set up, maintaining and sharing these security configurations became more manageable, but ensuring compatibility across different environments remained a continuous effort.

Bharath-Reddy Architect at Tekgeminus · Answer 8 · 2024-01-17T09:41:49Z

Apache Kafka has performance issues that cause it to lag.

score 0 · Answer 9 · 2023-09-13T09:37:10Z

In Apache Kafka, it is currently difficult to create a consumer. Features like rebalancing only come into play once you create a consumer, and creating one is overly complicated. To create a consumer in Apache Kafka, a person needs very strong knowledge of the internal functioning of Apache Kafka. I feel that Kafka needs to provide a ready-made consumer so that its users don't spend time creating consumers. In general, Apache Kafka must provide users with a more user-friendly UI.
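
For readers unfamiliar with the rebalancing mentioned above, the following is a minimal sketch of the idea: when a consumer joins or leaves a group, partitions are redistributed across the current members. It is a simplified round-robin illustration in plain Python, not the Kafka client's implementation; `assign_partitions` and the consumer names are hypothetical.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin (simplified, hypothetical helper)."""
    members = sorted(consumers)  # stable order so every run computes the same plan
    if not members:
        return {}
    assignment = defaultdict(list)
    for index, partition in enumerate(sorted(partitions)):
        assignment[members[index % len(members)]].append(partition)
    # Include members that received no partitions, with an empty list.
    return {member: assignment[member] for member in members}


# Six partitions shared by two consumers, then a third consumer joins.
before = assign_partitions(range(6), ["c1", "c2"])
after = assign_partitions(range(6), ["c1", "c2", "c3"])
print(before)
print(after)
```

Comparing `before` and `after` shows why a rebalance is disruptive: most partitions change owner when membership changes, so each consumer has to stop, hand over, and resume from committed positions.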

Jhon Rico Senior Solutions Architect at BVC · Answer 10 · 2023-09-08T20:53:00Z

Maintaining and configuring Apache Kafka can be challenging, especially when you want to fine-tune its behavior. It involves configuring traffic partitioning, understanding retention times, and dealing with various variables. Monitoring and optimizing its behavior can also be difficult. Perhaps a more straightforward approach could be using messaging queues instead of the publish-subscribe pattern. Some solutions may not require the complex features of Apache Kafka, and a messaging queue with Kafka's capabilities might provide a more complete messaging solution for events and messages.
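
To make the interaction between retention time and partitioning concrete, here is a rough back-of-the-envelope sketch. All numbers (message rate, message size, 12 partitions, replication factor 3) are assumptions chosen for illustration, and the estimate ignores compression, index files, and uneven key distribution.

```python
def retained_bytes(msgs_per_sec, avg_msg_bytes, retention_hours, replication_factor):
    """Approximate bytes held on disk across the cluster for one topic."""
    seconds = retention_hours * 3600
    return msgs_per_sec * avg_msg_bytes * seconds * replication_factor


# Hypothetical workload: 5,000 msg/s of 1,000-byte messages kept for 24 hours.
total = retained_bytes(
    msgs_per_sec=5_000,
    avg_msg_bytes=1_000,
    retention_hours=24,
    replication_factor=3,
)

# Size of one replica of one partition, assuming 12 evenly loaded partitions.
per_partition_replica = total / 3 / 12

print(f"{total / 1e12:.2f} TB total")
print(f"{per_partition_replica / 1e9:.1f} GB per partition replica")
```

Doubling the retention window doubles the total, while adding partitions only changes how that total is spread, which is one reason the two settings have to be tuned together.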

Mohamed BENTAHAR Architect at Agence Française de Développement · Answer 11 · 2023-05-11T13:52:00Z

Kafka is a new method we opted to apply to our need for data exchange. Also, we use the solution's integration capabilities. Improvement-wise, I would like the solution to have more integration capabilities. Also, the solution's setup, which is currently complex, should be made easier.

reviewer2075460 Group Manager at a media company with 201-500 employees · Answer 12 · 2023-04-25T09:46:00Z

One of the major areas for improvement, which I have to check out, is their polling mechanism. Sometimes, when the data volume is very high and I have a polling period of, let's say, one minute, there can be issues due to technical glitches, data anomalies, or platform-related issues such as cluster restarts. These polling periods tend to stop messaging use, and restartability needs to be improved, especially when data volumes are high. If there are obstructions due to technical glitches or platform issues, sometimes we have to manually clean up or clear the queue before it eventually gets sealed. It does get restarted on its own, but it takes too much time to catch up. One year ago, I couldn't find a solution to make it more agile in catching up quickly and staying real-time after any downtime. This was one area where I couldn't find a solution when I connected with Cloudera and Apache. One of our messaging tools was sending a couple of million records. We found it tough when there were any cluster downtimes or issues with the subscribers consuming data.

For future releases, one feature I would like to see is a more robust solution in terms of restartability. It should be able to handle platform issues and data issues and restart seamlessly. It should not cause a cascading effect if there is any downtime.

Another feature that would be helpful is monitoring, as they have for their other services: a UI where I can monitor the capacity of the managed queue and see which resources I need to increase to be ready for future data volumes. It would be great to have analytics on the overall performance of Kafka to plan for data volumes and messaging use. Currently, we plan the cluster resources manually based on data volumes for Kafka. If they can have a UI for resource planning based on data volume, that could be a great addition.
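
The catch-up problem described above comes down to simple arithmetic: the backlog only shrinks at the rate by which consumption exceeds production. The sketch below illustrates this with made-up numbers; `catch_up_seconds` is a hypothetical helper, not part of any Kafka tooling.

```python
def catch_up_seconds(backlog_msgs, consume_rate, produce_rate):
    """Seconds until the backlog reaches zero, or None if it never does."""
    net_rate = consume_rate - produce_rate
    if net_rate <= 0:
        return None  # consumers are not faster than producers
    return backlog_msgs / net_rate


# Hypothetical outage: 10 minutes of downtime while producers send 5,000 msg/s.
backlog = 5_000 * 600

slow = catch_up_seconds(backlog, consume_rate=6_000, produce_rate=5_000)
fast = catch_up_seconds(backlog, consume_rate=20_000, produce_rate=5_000)
print(slow)  # 3000.0 seconds, i.e. 50 minutes
print(fast)  # 200.0 seconds
```

With only 20% headroom, a ten-minute outage takes fifty minutes to recover from, which matches the experience of a system that does restart but no longer looks real-time.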

reviewer2116086 Senior Developer at a financial services firm with 10,001+ employees · Answer 13 · 2023-03-09T22:01:20Z

There are some latency problems with Kafka.

Dimitrios Zigkos Enterprise Architect at Smals vzw · Answer 14 · 2023-02-06T15:03:21Z

There have been some challenges with monitoring Apache Kafka, as there are currently only a few production-grade solutions available, which are all under enterprise license and therefore not easily accessible. I have not had access to any of these solutions and have instead relied on tools such as Dynatrace, which do not provide sufficient insight into the Apache Kafka system. While there are other tools available, they do not offer the same level of real-time data as enterprise solutions. One additional area that I think could benefit from improvement is the deployment process on OpenShift. This particular deployment is quite challenging and requires the activation of certain security measures as well as integration with other systems. It's not a straightforward process and typically requires engineers who are highly skilled and have extensive experience with Apache Kafka to carry out these tasks. Therefore, I believe that there is a need for progress in this area, and some tools that can provide information, assistance, and help make the whole process easier would be greatly appreciated.

Silvio Lucas Pereira Filho Senior Tech Lead at RecargaPay · Answer 15 · 2023-01-13T20:15:29Z

Apache Kafka can improve by adding a feature out of the box which allows it to deliver only one message.

Abdul-Samad Software Engineer at a tech services company with 201-500 employees · Answer 16 · 2023-01-12T14:36:00Z

The interface has room for improvement, and there is a steep learning curve for Hadoop integration. It was a struggle learning to send from Hadoop to Kafka. In future releases, I'd like to see improvements in ETL functionality and Hadoop integration.

score 0 · Answer 17 · 2022-12-06T15:50:00Z

Kafka contains two components. The component that synchronizes the rest of the components is an older piece of software, and it causes all kinds of configuration problems. Confluent, the company that sells a commercial version of Kafka, is moving away from that component precisely because of that. Kafka is a nightmare to administer. In the next release, I would like to see that one troublesome component that causes configuration issues removed.

Nitin Kamble Director at Tibco · Answer 18 · 2022-11-09T14:44:38Z

The solution can improve its cloud support.

AbhishekGupta Engineering Leader at Walmart · Answer 19 · 2022-10-08T02:36:00Z

Apache Kafka could improve its handling of data loss and its compatibility with Spark.

Jack Angoe Technical Lead at Interface Fintech Ltd · Answer 20 · 2022-10-07T11:43:46Z

I would like them to reduce the learning curve around the creation of brokers and topics. They also need to improve on the concept of the partitions. As for features, RabbitMQ has an instant response feature where you can send a queue and get an instant response, but Kafka only has one way to send queues. If that's something they could improve on, it would be great.

Stuart-Cook CEO /Consultant at Version Two Software Solutions Ltd · Answer 21 · 2022-10-06T14:58:58Z

We struggled a bit with the built-in data transformations because it was a challenge to get them up and running the way we wanted. There was a bit of a learning curve. It may be that we didn't fully grasp the information. Also, the documentation covering certain aspects was a bit poor. We had to trawl around different locations to try to find what we needed. When we were able to find documentation on transformation, for example, there wasn't a good set of documentation examples we could use, and the examples we had weren't quite meeting the need. Better examples would've helped us.

Nor EL MALKI Project Manager at Leyton & Associés, SAS · Answer 22 · 2022-09-14T11:07:32Z

Apache Kafka can improve by providing a UI for monitoring. There are third-party tools that can do it, but it would be nice if it was already embedded within Apache Kafka.

Rémy NOLLET Data Exchange Architect MQSeries at Decathlon International · Answer 23 · 2022-07-20T13:35:00Z

I would like to see an improvement in authentication management.

SunilKalva Barista Brewing Espresso at Linkedln · Answer 24 · 2022-07-07T06:41:00Z

Compared to commercial competitors, Kafka doesn't have the ability to scale down; elasticity is lacking in the product. The other issue for us is the delayed queue, which was available to us in the commercial software but not in Kafka. It's something we use in most of our applications for deferred processing, and I know it's available in other solutions. I'd like to see some tooling support and language support in the open source version.
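
As an illustration of what a delayed queue does, the sketch below holds messages until their due time and then releases them in order. It is an in-memory stand-in written for this note (the `DelayBuffer` class and the message names are hypothetical); it would lose its contents on restart and is not a substitute for broker-side support.

```python
import heapq
import itertools


class DelayBuffer:
    """In-memory holding area that releases messages once their due time passes."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order stable

    def add(self, message, due_at):
        heapq.heappush(self._heap, (due_at, next(self._counter), message))

    def pop_due(self, now):
        """Return every message whose due time is <= now, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


buffer = DelayBuffer()
buffer.add("send-reminder", due_at=30)
buffer.add("retry-payment", due_at=10)
buffer.add("expire-cart", due_at=60)

ready_at_35 = buffer.pop_due(now=35)
ready_at_100 = buffer.pop_due(now=100)
print(ready_at_35)
print(ready_at_100)
```

Times are plain numbers here so the example is deterministic; a real consumer would compare against a clock and would need durable storage for the pending messages.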

Reza Sadeghi Software Development Team Lead at asa com · Answer 25 · 2022-06-26T13:43:00Z

The user interface is one weakness. Sometimes, our data isn't as accessible as we'd like. It takes a lot of work to retrieve the data and the index.

Salvatore Campana CEO & Founder at Xautomata · Answer 26 · 2022-04-27T08:18:51Z

To store a large set of analytical data, we use a SQL repository. This type of repository works very well, but it needs specific, intensive maintenance. The user experience is friendly. We are looking for alternative solutions. We tried NoSQL solutions and Confluent-specific features, but the results were not satisfactory in terms of either performance or usability. We are working on automated SQL repository management and maintenance tools in order to increase the democratization of our platform.

ShoaibKhan Technical Specialist at APIZone · Answer 27 · 2022-03-30T12:56:54Z

The management overhead is higher than with a traditional messaging system. There are challenges here and there. For example, over long periods of use, it requires node restarts from time to time.

Rene Haburai Freelance at SÍŤ spol. s.r.o. · Answer 28 · 2022-02-17T19:05:04Z

Kafka has a lot of monitors, but sometimes it's most important to just have a simple monitor. Improvements to Kafka's management would be nice, but it's not so necessary for me. There are a lot of consoles that offer a better view than Kafka. Some are free, and some are paid, but I'm thinking about streaming. For example, if you connect more streams to a component in the same queue, how will it integrate to recognize the flow and the message?

reviewer1128858 Vice President at Anchorage · Answer 29 · 2022-02-16T18:19:00Z

More Windows support, I believe, is one area where it can improve. We need to wrap it as a service, but there isn't one built into Windows. So that's something they could improve. I believe Windows Server is primarily aimed at the Windows shop or those who use Windows.

Andrea Castorino Program Manager at SirfinPA · Answer 30 · 2021-12-15T20:27:56Z

The management tool could be improved.

reviewer1052868 Principal Technology Architect at Infosys · Answer 31 · 2021-11-23T03:09:18Z

We are still on the production aspect, with our service provider or hyper-scalers providing the solutions. I would like to see some improvement on the HA and DR solutions, where everything is happening in real-time. Kafka's interface could also use some work. Some of our products are in C, and we don't have any libraries to use with C. From an interface perspective, we had a library from Redis, and we are streaming some of the products we built to Redis. That is one of the requirements. It would be good to have those libraries available in a future release for our C++ clients, or as public libraries, so we can include them in our product and build on that.

Mario Estrada CTO at Estrada & Consultores · Answer 32 · 2021-08-07T05:24:00Z

Kafka can allow for duplicates, which isn't helpful in some of our scenarios. They need to work on their duplicate management capabilities, but for now developers should ensure idempotent operations for such scenarios. While the solution scales well and easily, you need to understand your future needs and prepare for the peaks.
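
The idempotent-operation advice above can be sketched as follows: the handler remembers which message IDs it has already applied and skips the side effect on a repeat delivery. This is a minimal in-memory sketch that assumes each message carries a unique ID chosen by the producer; the class and field names are hypothetical.

```python
class IdempotentHandler:
    """Applies each message at most once, keyed by a caller-supplied message ID."""

    def __init__(self):
        self.seen_ids = set()
        self.balance = 0

    def handle(self, message_id, amount):
        if message_id in self.seen_ids:
            return False  # duplicate delivery: skip the side effect
        self.seen_ids.add(message_id)
        self.balance += amount
        return True


handler = IdempotentHandler()
deliveries = [("m1", 100), ("m2", 50), ("m1", 100)]  # m1 is delivered twice
results = [handler.handle(message_id, amount) for message_id, amount in deliveries]
print(results, handler.balance)
```

In practice the set of seen IDs would need to live in durable storage and be updated together with the side effect, otherwise a crash between the two steps reintroduces the duplicate.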

score 0 · Answer 33 · 2021-06-26T01:12:49Z

I would like to see real-time, event-based consumption of messages rather than the traditional way through a loop. The traditional messaging system works by listening and looping, with a small wait, to check what the messages are. A push system is where you have something that is ready to receive a message, and when the message comes in and hits the partition, it goes straight to the consumer instead of the consumer having to pull. I believe this consumer approach is something they are working on and may come in an upcoming release. However, that is message consumption versus message listening. Confluent created the KSQL language, but they gave it to the open-source community. I would like to see KSQL be able to be used on raw data versus structured and semi-structured data.
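
The difference between polling and push can be narrowed in application code by wrapping the loop once and exposing a callback. The sketch below shows that shape using Python's standard `queue.Queue` as a stand-in for a partition; it does not use a Kafka client, and `run_poll_loop` with its parameters is a hypothetical helper.

```python
import queue


def run_poll_loop(source, on_message, poll_timeout=0.1, max_idle_polls=3):
    """Poll `source` and hand each message to `on_message`, push-style."""
    idle = 0
    handled = 0
    while idle < max_idle_polls:
        try:
            message = source.get(timeout=poll_timeout)
        except queue.Empty:
            idle += 1  # nothing arrived within the timeout; poll again
            continue
        idle = 0
        on_message(message)
        handled += 1
    return handled


inbox = queue.Queue()  # stand-in for a partition, not a Kafka client
for event in ["order-created", "order-paid"]:
    inbox.put(event)

received = []
count = run_poll_loop(inbox, received.append, poll_timeout=0.01)
print(count, received)
```

The caller only supplies `on_message`, so the code reads like event-driven consumption even though a loop with a short wait is still running underneath. The `max_idle_polls` exit exists only so the example terminates.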

score 0 · Answer 34 · 2021-05-12T12:29:06Z

The graphical user environment is currently lacking in Apache. It's not available within the solution and needs to be built from scratch. Some of the open source products of this solution have limitations.

score 0 · Answer 35 · 2021-02-09T13:56:41Z

The initial setup and deployment could be less complex. Integration is one of the main concerns that we have.

score 0 · Answer 36 · 2020-10-22T16:39:00Z

Some vendors don't offer extra features for monitoring. Some come with Linux for default monitoring. Monitoring is very important. If something is not working properly, then our subscribers won't receive a notification. You then have to trace it back to Kafka and find the glitch or the messaging sequence that hasn't been racked up correctly. It should support Avro — which handles different data formats — as a default data format. It would be much more flexible if it did.

score 0 · Answer 37 · 2020-09-27T04:09:51Z

They need to have a proper portal to do everything because, at this moment, Kafka is lagging in this regard. It could be used to do the preprocessing or the configurations, instead of directly doing it on the queues or the topics. If you look at Solace, for example, they have come up with a portal where you don't need to touch these activities. You don't need to access the platform beyond the portal.

Rene Haburai Freelance at SÍŤ spol. s.r.o. · Answer 38 · 2020-08-19T07:57:37Z

The model where you create the integration or the integration scenario needs improvement. It contains fewer developer words or maintaining words where someone prepared the topics, the connectors, or the streaming platforms. You would first need to have a control center from a third party for managing. If you would like to prepare a more sophisticated integration scenario, where one microservice provides the event and one or several others consume it, then this needs to be modeled elsewhere. Also, compared to a traditional ESB for data mixing, there you can create a scenario that could be deployed with inputs and some outputs. Most businesses like the topics, but for me, it is a problem that messaging platforms have: there is no design tool with an IDE for creating scenarios. It would be helpful to be able to create a more complex solution for several types of styles, and not just for one provider or for one customer. That would be easier, but if you have more than one consumer then it could be a more complex scenario. It would be like events that go to several microservices to create orders, validate orders, and creating words. This would be helpful. In the next release, adding an IDE or development tools for creating better integration scenarios would be helpful, even though it is already a developer-oriented solution. It would also be helpful for auto-deployment. Having a governance style would also be helpful to understand. It would be beneficial to have a repository of all of the topics, data types that exist, or data structures.

reviewer1304505 Senior Consultant at instaclustr · Answer 39 · 2020-06-28T08:51:00Z

Because the solution is open source, it has a ZooKeeper dependency. If I could change anything about the solution, it would be that. The solution could always add a few more features to enhance its usage.

AhmadMasamreh IBMi/MIMIX Administrator at Arab Bank · Answer 40 · 2020-04-19T07:40:31Z

This solution could be made easier to manage. Compatibility with other solutions and integration with other tools can be improved. We cannot apply all of our security requirements because it is hard to upload them.

Mukulit Bhati CTO at InsightGeeks Solutions Pvt. · Answer 41 · 2020-04-05T09:13:00Z

The manageability should be improved. There are lots of things we need to manage and it should have a function that enables us to manage them all cohesively. There should be a default property. It's really hard to manage all these things.

it_user998961 Enterprice Architect · Answer 42 · 2020-03-30T07:58:09Z

More adapters for connecting to different systems need to be available.

Lakshmanan Panneerselvam Owner at Binarylogicworks.com.au · Answer 43 · 2020-03-30T07:58:07Z

Kafka is complex and there is a little bit of a learning curve.

MoulaliNaguri Project Engineer at Wipro Limited · Answer 44 · 2020-02-03T09:10:16Z

We're still going through the solution. Right now, I can't suggest any features that might be missing. I don't see where there can be an improvement in that regard. The speed isn't as fast as RabbitMQ, even though the solution touts itself as very quick. It could be faster. They should work to make it at least as fast as RabbitMQ. The UI is based on command line. It would be helpful if they could come up with a simpler user interface. They should make it easier to configure items on the solution. The solution would benefit from the addition of better monitoring tools.

RABBAHMahmoud Senior Technical Architect at RABBAH SOFT · Answer 45 · 2020-01-29T11:22:00Z

In the next release, I would like for there to be some authorization features and HTL security. We also need bigger software and better monitoring.

OnurTokat Senior Big Data Developer | Cloudera at Dilisim · Answer 46 · 2020-01-19T06:38:00Z

If the graphical user interface was easier for the Kafka administration it would be much better. Right now, you need to use the program with a command-line interface. If the graphical user interface was easier, it could be a better product.

reviewer1247268 Technology Lead at Infosys · Answer 47 · 2020-01-12T07:22:00Z

Kafka does not provide control over the message queue, so we do not know whether we are experiencing lost or duplicate messages. Better control over the message queue would be an improvement. Solutions such as ActiveMQ do afford better control. Because of this, there is sometimes a gap in the results where we have either lost messages, or there are duplicates. We have had problems when there was an imbalance because all of the messages were being sent back.
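
One way to find out whether messages were lost or duplicated is to have the producer stamp each message with an increasing sequence number and audit the numbers on the consuming side. The sketch below assumes such a per-stream counter exists (it is an application-level field invented for this example, not something Kafka adds for you).

```python
def audit_sequence(sequence_numbers):
    """Return (missing, duplicates) for a stream expected to count up by one."""
    seen = set()
    duplicates = []
    for number in sequence_numbers:
        if number in seen:
            duplicates.append(number)
        seen.add(number)
    if not seen:
        return [], duplicates
    expected = range(min(seen), max(seen) + 1)
    missing = [number for number in expected if number not in seen]
    return missing, duplicates


# Hypothetical sequence numbers read from consumed messages.
missing, duplicates = audit_sequence([1, 2, 2, 3, 5, 6, 6])
print(missing, duplicates)
```

An audit like this only reports the gap; closing it still depends on producer acknowledgement settings and on how the consumer commits its position.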

SergeyGoncharov Developer Infrastructure at Outbrain · Answer 48 · 2019-12-30T06:53:00Z

There is a feature that we're currently using called MirrorMaker. We use it to combine the information from different Kafka servers into another server. It's very wide and it gives a very generic scenario. I think it would be great if the possibility would exist out of the box and not as a third party. The third party is not very stable and sometimes you have problems with this component. There are some developments in newer versions and we're about to try them out, but I'm not sure if it closes the gap.

Johnnie Li Technical Consultant at KPMG · Answer 49 · 2018-09-11T08:28:00Z

Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc.