What advice do you have for others considering Apache Flink?

Apache Flink is an open-source batch and stream data processing engine. It can be used for batch, micro-batch, and real-time processing. Flink is a programming model that combines the benefits of batch processing and streaming analytics by providing a unified programming interface for both data sources, allowing users to write programs that seamlessly switch between the two modes. It can also be used for interactive queries. Flink can be used as an alternative to MapReduce for executing...

score 0 · Answer 1 · 2024-05-31T13:44:00Z

The solution is difficult to manage and handle. Apache Flink is one of the main solutions for real-time decision-making and rapid site provision on Azure. Currently, it's predominantly used for data engineering projects rather than AI initiatives, although it can indirectly influence analytics, machine learning, and other products in the stack. The solution is resilient and offers more capabilities and features. Overall, I rate the solution an eight out of ten.

Agustin Calderon CTO at ReNew · Answer 2 · 2024-02-05T16:58:00Z

I rate the product an eight out of ten.

PrashantVaghela Principal Engineer at InnovAccer Inc. · Answer 3 · 2023-11-20T13:31:49Z

Depending on the use case, we use the appropriate framework. For certain use cases, it's an excellent choice. However, other use cases might require a different framework, such as Lambda or Spark Streaming. So, the choice of framework depends on the specific requirements of the task. When it comes to real-time ETL and real-time transformation, I would rate Flink very highly, an eight out of ten.

JAMAL AL MAHAMID CEO at Palmira · Answer 4 · 2021-08-17T10:03:54Z

Today, Flink is the fastest streaming solution. It is the core of an Azure and Google streaming offering today.

Yes, it depends on your use case. So if you can accept a little delay, go for Kafka or Spark because it is easier to find a team that knows Kafka.

ZHIZHENG Product Operations Manager at OKX · Answer 5 · 2023-03-09T22:00:05Z

I would recommend Apache Flink to other users and rate it seven out of ten.

Sunil Morya Consultant at Tata Consultancy · Answer 6 · 2022-11-18T14:49:28Z

I find this solution very handy. Prior to using Flink, I had experience with audio and video data streaming, so I don't know how useful Flink is when you want to do real-time analytics on audio and video data. I think if real-time analytics for that kind of data could be supported by Flink, that would be good. I rate this solution seven out of ten, based on the fact that I haven't used all of Flink's features.

Ilya Afanasyev Senior Software Development Engineer at Yahoo! · Answer 7 · 2022-08-03T05:21:00Z

I rate this solution a nine out of ten. I would recommend Apache Flink to new users. In my opinion, it is possible to move from Spark to Apache Flink. Apache Flink's functionality overlaps Spark's functionality. The solution is good, but the debugging process could be improved.

Ertugrul Akbas Manager at ANET · Answer 8 · 2021-07-29T15:57:58Z

I would recommend Apache Flink to others who are interested in using it. I would rate this solution an eight out of ten.

score 0 · Answer 9 · 2021-03-03T20:13:19Z

My advice to others when using Apache Flink is to hire good people to manage it. When you have the right team, it's very easy to operate and scale big data platforms. I would rate Apache Flink a nine out of ten.

score 0 · Answer 10 · 2021-02-02T17:14:03Z

When choosing this solution, you have to look at your use case to see if this is the best choice for you. If you need super-fast real-time streaming, and you can develop in Scala, then it might make a lot of sense to use it. If you are looking at delays of seconds, and you are working in Python, then PySpark might be a better solution. I rate Apache Flink a six out of ten.

score 0 · Answer 11 · 2020-11-08T16:21:05Z

My advice would be to make sure you understand your requirements, Flink's architecture, how it works, and whether it is the right solution for you. They provide very good documentation, which is useful. The solution isn't suitable for every case, and it may be that Spark or some other framework is more suitable. If you are a major company that cannot afford any downtime, and given that Flink is a relatively new technology, it might be worthwhile investing in monitoring. That would include writing scripts for monitoring and making sure that the throughput of the applications is always steady. Make sure your monitoring, and your SOPs around monitoring, are in place. I would rate this solution a seven out of ten.
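As a rough illustration of the "scripts for monitoring" idea above, here is a minimal sketch in plain Python. It assumes you already collect records-per-second samples from whatever metrics system you use. The function name `throughput_alerts`, the sample values, the window size, and the tolerance are all hypothetical and are not part of Flink.

```python
from statistics import mean


def throughput_alerts(samples, window=3, tolerance=0.5):
    """Return indices of samples that deviate from the trailing average.

    A sample is flagged when it differs from the mean of the previous
    `window` samples by more than `tolerance` (a fraction of that mean).
    """
    alerts = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if baseline > 0 and abs(samples[i] - baseline) / baseline > tolerance:
            alerts.append(i)
    return alerts


# Hypothetical records-per-second readings; the fifth one is a sharp drop.
samples = [1000, 1020, 980, 1010, 400, 990, 1005]
alerts = throughput_alerts(samples)
print("Unsteady throughput at sample indices:", alerts)
```

In a real setup, the flagged indices would feed whatever alerting step your SOPs define; the threshold would need tuning against your own traffic patterns.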

score 0 · Answer 12 · 2020-10-21T04:33:00Z

Flink is really simple to adopt. You can use any backend state management tool, like a DB or something of that sort. It has the ability to integrate with different technologies, which is also very important. It's pretty well suited, I believe, for low latency. The API is pretty well written to support you in that. I would rate Apache Flink an eight out of ten.

score 0 · Answer 13 · 2020-10-19T09:33:00Z

This is general advice for evaluating anything: you have to really understand the problem that you're trying to solve. What is the nature of the problem? The business side is one thing, but you also have to understand how you're solving things. For example, do you want something that is fast and scalable? Every new product is advertised as fast, scalable, highly distributed, etc. But in what context? What kind of use cases is the product built for? You have to understand the principle, and only then choose a product.

Apache Flink makes sense if you want near-real-time metrics that may be useful for your business. In that case, Apache Flink is your friend, because it's built on a streaming architecture. If the nature of your application or your business is streaming, where the data is coming in at a very high rate and you want to do something with it, then Apache Flink is a good option.

Another example: let's say you are the CEO of Twitter. A lot of people are tweeting at the same time all around the world, so a lot of streaming data is coming in. Let's say you're a celebrity and 5,000 people follow you. When you write a tweet, all 5,000 people have to see that tweet as quickly as possible. So when your tweet comes in, a very complex system in Twitter's backend has to take that tweet, work out which people follow you, and display it on their feed timeline. This might sound easy when you only have five people, but with 315 million people tweeting, it's a very complex system, and you have to keep it available, etc. So when you're dealing with streaming data, Apache Flink is a good option.

On a scale of one to ten, I would rate Apache Flink around seven to eight. It's pretty good if you're solving a streaming type of problem. My experience is limited: I have only worked a little with Apache Storm and with Apache Flink. Of these, for streaming, Apache Flink wins hands down, but there are other products, like Apache Pulsar, that I have no experience with. So my perspective is very limited.

score 0 · Answer 14 · 2020-10-13T07:21:29Z

Flink is a good way to get your hands dirty with streaming or big data processing applications, and to understand the basic concepts of big data processing and how complex analytics or computations can be made simple. For example, if you want to analyze tweets or patterns, it's a simple use case: you just use flink-twitter-connector and provide that as your input source to Flink. The stream of random tweets keeps coming, and then you can apply your own grouping, keying, and filtering logic to understand these concepts. An important thing I learned while using Flink is that the basic concepts of windowing, transformations, and the DataStream API should be clear, or you should at least be aware of what is going to be used in your application, or else you might end up increasing the time rather than decreasing it. You should also understand your data, process, pipeline, and flow. Is Flink the right candidate for your architecture, or is it overkill? It is flexible and powerful, is all I can say.
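To make the filtering, keying, and windowing concepts mentioned above a little more concrete, here is a minimal sketch in plain Python. It does not use Flink or flink-twitter-connector; the `EVENTS` list, the hashtag keys, and the `tumbling_window_counts` function are hypothetical stand-ins that only imitate what a keyed, fixed-size (tumbling) window count does conceptually.

```python
from collections import defaultdict

# Hypothetical sample events: (timestamp_seconds, hashtag, text).
EVENTS = [
    (1, "flink", "trying out windows"),
    (3, "spark", "batch job done"),
    (4, "flink", ""),
    (7, "flink", "keyed streams are neat"),
    (12, "flink", "second window"),
    (14, "spark", "structured streaming"),
]


def tumbling_window_counts(events, window_size):
    """Filter, key, and count events per fixed, non-overlapping window."""
    counts = defaultdict(int)
    for timestamp, key, text in events:
        if not text:  # filtering: drop events with empty text
            continue
        # windowing: assign the event to the window containing its timestamp
        window_start = (timestamp // window_size) * window_size
        # keying + aggregation: one counter per (window, key) pair
        counts[(window_start, key)] += 1
    return dict(counts)


result = tumbling_window_counts(EVENTS, window_size=10)
for (window_start, key), count in sorted(result.items()):
    print(f"window [{window_start}, {window_start + 10}) key={key} count={count}")
```

A real Flink job would express the same steps through its own stream operators and would also handle out-of-order events and state, which this sketch ignores.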

Jyala Rahul Jyala Sr Software Engineer at a tech vendor with 10,001+ employees · Answer 15 · 2020-10-13T07:21:29Z

We are very happy with the product, and we have been able to achieve all of the use cases that we are expected to deliver for our customers. Over time, I have seen many improvements including in the documentation. An example is that when we first started using this product, almost two years ago, there was no support available. At this point, we do not have much opt-in but we have some use cases to ensure that our system is not breaking. We have QA who can validate these things based on what is expected versus what we have done. My advice for anybody who is considering Flink is that it has very mature documentation and you can do what you want. It is a very good way to implement streaming pipelines and you won't have any problems. The biggest lesson that I have learned from using Flink is how we can customize the experience for the customer and how important it is to keep up with the industry. We don't want to be left behind. I would rate this solution a seven out of ten.

Sandesh Deshmane Software Architect at a tech vendor with 501-1,000 employees · Answer 16 · 2020-10-07T07:04:00Z

My advice would be to validate your use case. If you are already using a streaming mechanism, I suggest that you validate what your actual use cases are and what the advantages of Flink are. Make sure that the use case you are trying can be done by Flink. If you're doing simple aggregation and you don't need to worry about message order, then it's fine; you can use Storm or whatever you are using. If you see features that are useful for you, then you should go for Flink. Validate your use case, validate your data and pipeline, do a small POC, and see if it is useful. If you think it's useful and worth doing a migration from your existing solution, then go for it. But if you don't already have a solution and Flink will be your first one, then it's always better to use Flink. The biggest lesson I have learned is that the deployment using Kubernetes was a little bit difficult. We did not evaluate this when we started the work, so we migrated the code part, but we did not take on the deployment part. If we had looked at the deployment part initially, we could have chosen Kafka Streams as well, because we were getting a similar result, but on the deployment side, Kafka Streams was easy. You don't need to worry about the cluster. I would rate Apache Flink an eight out of ten. I would have given it a nine or so if the deployment on Kubernetes weren't a little bit complicated.