What is our primary use case?
We use Redpanda mainly for three purposes. The first is the standard Kafka-like data ingestion pipeline, where an event arrives and needs to be distributed to multiple applications, not just one. The second, which is our main use case, involves capturing video recording data. When you consider how many video cameras exist even in a single city or metro, the volume becomes substantial. We capture what happened at each second, but we do not push it directly into the database. Instead, we pass it through Redpanda to ensure that everything follows a certain shape. Sometimes the data is corrupted, incomplete, or missing information. For example, an amount cannot be arbitrary; it has to be a valid number. We perform all those cleanups in our code while the data flows through Redpanda with schemas in place. The third use case is backpressure management. If you want water to do the dishes but receive a fire hose instead, that is a problem, so we make sure each consumer gets data at a rate it can handle. We keep this very simple, and after a certain point, it becomes a day-to-day operation.
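A minimal sketch of the kind of cleanup step described above, in plain Python. The field names (`camera_id`, `captured_at`, `amount`) and the `FrameEvent` type are hypothetical and only illustrate the idea of rejecting records that do not match the expected shape; our actual schemas are not shown here.

```python
import math
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FrameEvent:
    """Hypothetical shape of one cleaned event."""
    camera_id: str
    captured_at: int  # Unix timestamp in seconds
    amount: float


def clean_event(raw: dict[str, Any]) -> Optional[FrameEvent]:
    """Return a well-formed event, or None if the record should be dropped."""
    camera_id = raw.get("camera_id")
    captured_at = raw.get("captured_at")
    amount = raw.get("amount")

    # The camera id must be a non-empty string.
    if not isinstance(camera_id, str) or not camera_id.strip():
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(captured_at, bool) or not isinstance(captured_at, int):
        return None
    if captured_at < 0:
        return None
    # The amount must be a real, finite number, not a string or NaN.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        return None
    if not math.isfinite(amount):
        return None

    return FrameEvent(camera_id.strip(), captured_at, float(amount))
```

A function like this would sit between consuming a record and writing it to the database, so that incomplete or corrupted records never reach storage.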
Redpanda can handle real-time financial data streams and support aggregations. For example, with stock data where each tick represents a price point per second (such as Apple stock being 252.3, then 258, then 247), you can compute means, medians, and other averages. All these calculations work perfectly as long as your functions are pure. You get all the data without making additional API calls or database calls, flow it through Redpanda or any Kafka instance, and perform all kinds of calculations very quickly. The database is generally the slower component. Whenever we hit a slowdown, it was almost always the database or the way we normalized data that caused it. Redpanda never gave us any problems. I remember RabbitMQ used to give us a lot of problems, but all those problems are gone with Redpanda.
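As an illustration of what "pure" means here, the sketch below summarizes a window of ticks using only the values passed in, with no API or database calls. The `summarize` helper is hypothetical; the prices are the ones from the example above.

```python
from statistics import mean, median
from typing import Sequence


def summarize(prices: Sequence[float]) -> dict[str, float]:
    """Pure function: the same ticks always produce the same summary."""
    if not prices:
        raise ValueError("need at least one tick")
    return {
        "mean": mean(prices),
        "median": median(prices),
        "low": min(prices),
        "high": max(prices),
    }


ticks = [252.3, 258.0, 247.0]
summary = summarize(ticks)
print(summary["median"])  # 252.3, the middle value once the ticks are sorted
```

Because the function depends only on its input, it can be applied to each batch of records coming off a topic without touching the slower parts of the system.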
The video example I shared involves webcams. If you consider webcams as IoT devices, we use them that way and it just works. It is not real-time in the strictest sense, but it is almost real-time, which is good enough for our case.
What is most valuable?
One thing I really appreciate about Redpanda is that it is simple and available to host yourself. You do not have to pay money upfront to Confluent or sign API contracts. Unless you want very high availability, you can just host it on your own server by running the Docker container, and Redpanda works. This is very good. The other valuable feature is that migrating from one version to another has almost always been a very smooth experience. With Kafka, you have to manage a thousand things. These two are big benefits of Redpanda. The third feature is that the UI is slick. I would not say it is one of the best UIs in the world, but it is much better than what Kafka or Confluent ships with out of the box.
What needs improvement?
One area for improvement is providing more examples. For instance, Redpanda could be more useful as a sink where you get the data and can directly push to S3. While this is possible through the API, there are better and faster ways to do it. You can make a million API calls and accomplish the task in one and a half hours, but the same thing can be done in ten minutes through other methods. These faster approaches are not documented in obvious places. You have to find information scattered across various blogs. Redpanda should collect all the good blogs and best practices and put them in their documentation. This is more about knowledge management and making it easy for users to understand the product for complex use cases. For simple use cases, it is straightforward. We all use the basic pipe functionality. However, providing more examples would be useful. For example, integration with AWS and the AWS ecosystem would be cool.
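The review above does not name the faster methods, so the following is only one common approach, shown as an assumption: group many small records into a few large objects so that the number of upload calls drops sharply. The `batch_records` and `sink` helpers, the object key pattern, and the 8 MiB default are all hypothetical; `upload` is any callable that stores bytes under a key, for example a thin wrapper around an S3 client.

```python
from typing import Callable, Iterable, Iterator


def batch_records(records: Iterable[bytes], max_bytes: int) -> Iterator[bytes]:
    """Group newline-delimited records into chunks of roughly max_bytes.

    A single record larger than max_bytes is emitted as its own chunk.
    """
    buffer: list[bytes] = []
    size = 0
    for record in records:
        line = record + b"\n"
        # Flush the current chunk before it would exceed the size limit.
        if buffer and size + len(line) > max_bytes:
            yield b"".join(buffer)
            buffer, size = [], 0
        buffer.append(line)
        size += len(line)
    if buffer:
        yield b"".join(buffer)


def sink(
    records: Iterable[bytes],
    upload: Callable[[str, bytes], None],
    max_bytes: int = 8 * 1024 * 1024,
) -> int:
    """Upload records as a few large objects; return the number of uploads."""
    count = 0
    for index, chunk in enumerate(batch_records(records, max_bytes)):
        upload(f"events/part-{index:06d}.jsonl", chunk)
        count += 1
    return count
```

With this shape, a million small records become a few hundred uploads instead of a million, which is the kind of difference described above.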
For how long have I used the solution?
We have been using Redpanda for at least the last four years.
What do I think about the stability of the solution?
Stability has been pretty good. I would rate it around eight or nine out of ten. Once in a while we had to restart it, but overall it is pretty stable.
What do I think about the scalability of the solution?
On scalability, we always scaled vertically. When 16 GB of RAM was not enough, we went to 32 GB, then 64 GB. We never scaled horizontally by adding one machine, then two machines, then three machines, and so forth. We do have a three-machine setup, but that is for failover: if one machine fails, there is a backup. It is not there to distribute the load. In a way, it just works. I would rate the scalability between eight and ten. However, I do not know how well it scales horizontally. They obviously make big claims, but I never tried it because we only scale vertically.
Which solution did I use previously and why did I switch?
We previously evaluated Confluent and Kafka itself. I would not dare to go with Kafka because it requires far more orchestration than we wanted to take on. Confluent was good, and Redpanda was also good. However, Redpanda was an order of magnitude cheaper than Confluent, so we went with Redpanda.
How was the initial setup?
The initial setup for Redpanda is very smooth. Compared to Kafka, which feels like it has a lot of knobs and pieces, Redpanda feels like a thirty-minute setup. This was actually one of the reasons we chose it: we wanted something we could just use, rather than something we had to master first and then use. We started using Redpanda and then mastered it along the way because our use case is to flow data through the pipes, and that is done beautifully with Redpanda.
What about the implementation team?
We use Docker Swarm and deployed Redpanda on-premises.
What was our ROI?
The pricing for Redpanda is very good. I would rate it around nine.
Which other solutions did I evaluate?
We evaluated several alternatives and shortlisted a few, including Confluent and Redpanda. However, Redpanda was simple and fast, so we went with Redpanda and it just works. From our applications' point of view, it is just Kafka. They say that since it is written in C++ as a high-performance Kafka-compatible broker, it is fast. All the programming libraries we have are written for Kafka, and our Kafka is Redpanda. Since Redpanda provides compatibility with Kafka SDKs and APIs, we do not use Redpanda-specific SDKs or APIs. I cannot tell you how fast or slow Redpanda is compared to the original Kafka because the only Kafka instance I use is Redpanda. Redpanda is fast enough, but I cannot say that the original Kafka is slow because I never used the original Kafka.
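To make the compatibility point concrete, here is a minimal sketch using an ordinary Kafka client library (`confluent-kafka` for Python) pointed at a Redpanda broker. The broker address, topic, client id, and helper names are assumptions for illustration; the only library calls used are `Producer`, `produce`, and `flush`.

```python
import json


def producer_config(bootstrap_servers: str) -> dict[str, str]:
    """Standard Kafka client settings; nothing here is Redpanda-specific."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "client.id": "review-example",  # hypothetical client id
    }


def encode_event(event: dict) -> bytes:
    """Serialize an event as UTF-8 JSON with stable key order."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def send_event(bootstrap_servers: str, topic: str, key: str, event: dict) -> None:
    """Publish one event and wait for delivery."""
    # Imported here so the helpers above work without the client installed.
    from confluent_kafka import Producer

    producer = Producer(producer_config(bootstrap_servers))
    producer.produce(topic, key=key, value=encode_event(event))
    producer.flush()  # block until outstanding messages are delivered
```

Calling `send_event("localhost:9092", "camera-events", "cam-1", {"amount": 3})` would publish to whichever Kafka-compatible broker listens on that address, assuming one is running there.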
What other advice do I have?
I cannot comment on features for decision-making processes because I do not know what features Redpanda offers for that. We just use Redpanda as a pipe. My overall rating for Redpanda is eight and a half out of ten.