Partner / Head of Data & Analytics at Intelligence Software Consulting
Real User
Top 5
2024-05-31T13:44:00Z
May 31, 2024
The solution is difficult to manage and handle. Apache Flink is one of the main solutions for real-time decision-making and rapid site provision on Azure. Currently, it's predominantly utilized for data engineering projects rather than AI initiatives, although it can indirectly influence analytics, machine learning, and other products in the stack. The solution is resilient and offers more capabilities and features. Overall, I rate the solution an eight out of ten.
Depending on the use case, we use the appropriate framework. For certain use cases, it's an excellent choice. However, other use cases might require a different framework, such as Lambda or Spark Streaming. So, the choice of framework depends on the specific requirements of the task. When it comes to real-time ETL and real-time transformation, I would rate Flink very highly, an eight out of ten.
I find this solution very handy. Prior to using Flink, I had experience with audio and video data streaming, so I don't know how useful Flink is when you want to do real-time analytics for audio and video data. I think if real-time analytics for that kind of data could be supported by Flink, that would be good. I rate this solution seven out of ten, based on the fact that I haven't used all of Flink's features.
I rate this solution a nine out of ten. I would recommend Apache Flink to new users. In my opinion, it is possible to move from Spark to Apache Flink. Apache Flink's functionality overlaps Spark's functionality. The solution is good, but the debugging process could be improved.
Partner / Head of Data & Analytics at Intelligence Software Consulting
Real User
Top 5
2021-03-03T20:13:19Z
Mar 3, 2021
My advice to others when using Apache Flink is to hire good people to manage it. When you have the right team, it's very easy to operate and scale big data platforms. I would rate Apache Flink a nine out of ten.
Head of Data Science at an energy/utilities company with 10,001+ employees
Real User
2021-02-02T17:14:03Z
Feb 2, 2021
When choosing this solution, you have to look at your use case to see if this is the best choice for you. If you need super-fast real-time streaming, and you can develop in Scala, then it might make a lot of sense to use it. If you are looking at delays of seconds, and you are working in Python, then PySpark might be a better solution. I rate Apache Flink a six out of ten.
Software Development Engineer III at a tech services company with 5,001-10,000 employees
Real User
2020-11-08T16:21:05Z
Nov 8, 2020
My advice would be to make sure you understand your requirements, Flink's architecture, how it works, and whether it is the right solution for you. They provide very good documentation, which is useful. The solution isn't suitable for every case, and it may be that Spark or some other framework is more suitable. If you are a major company that cannot afford any downtime, and given that Flink is a relatively new technology, it might be worthwhile investing in monitoring. That would include writing scripts for monitoring and making sure that the throughput of the applications is always steady. Make sure your monitoring, and your SOPs around monitoring, are in place. I would rate this solution a seven out of ten.
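As a rough illustration of the kind of throughput-monitoring script mentioned above, here is a minimal sketch in plain Python. It does not call any Flink API: the function name `throughput_alerts`, the sample values, and the 50% tolerance are all assumptions made for illustration, and in practice the samples would come from whatever metrics system the job reports to.

```python
from statistics import mean


def throughput_alerts(samples, window=3, tolerance=0.5):
    """Return the indices of samples that drop below (1 - tolerance) times
    the mean of the preceding `window` samples."""
    alerts = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if samples[i] < baseline * (1 - tolerance):
            alerts.append(i)
    return alerts


# Hypothetical records-per-second readings taken at a fixed interval.
records_per_second = [1000, 1020, 980, 990, 300, 1010]
alerts = throughput_alerts(records_per_second)
print(alerts)
```

A rolling baseline is used here rather than a fixed threshold so that the check adapts to the application's normal rate; a real SOP would also define what happens once an alert fires.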
Principal Software Engineer at a tech services company with 1,001-5,000 employees
Real User
2020-10-21T04:33:00Z
Oct 21, 2020
Flink is really simple to use and simple to adopt. You can use any backend state management tool, like a DB or something of that sort. It has the ability to integrate with different technologies, which is also very important. It's pretty well built, I believe, for low latency. The API is pretty well written to support that. I would rate Apache Flink an eight out of ten.
Sr. Software Engineer at a tech services company with 10,001+ employees
Real User
2020-10-19T09:33:00Z
Oct 19, 2020
This is general advice for anything you're trying to do: for any problem you're evaluating, you have to really understand the problem you're trying to solve. What is the nature of the problem? The business side is one thing, but you also have to understand how you're solving things. For example, do you want something that is fast enough and scalable? Every new product is advertised as fast, scalable, highly distributed, and so on. But in what context? What kind of use cases is the product built for? You have to understand the principle, and only then choose a product. Apache Flink makes sense if you want something for near-real-time metrics that may be useful for your business. In that case, Apache Flink is your friend, because it's built on a streaming architecture. If the nature of your application or your business is streaming, where data is coming in at a very high rate and you want to do something with it, then Apache Flink is a good option. Another example I can give you: let's say you are the CEO of Twitter. On Twitter, a lot of people are writing a lot of stuff. Because many people are tweeting at the same time all around the world, a lot of streaming data is coming in. Let's say you're a celebrity and 5,000 people follow you. When you write a tweet, all 5,000 people have to see that tweet as quickly as possible. So when your tweet comes in, a very complex system in Twitter's backend has to take that tweet, work out which people follow you, and display it on their feed timelines. This might sound easy when you only have five people, but if you have 315 million people tweeting, it's a very complex system, and you have to keep it available. So when you're dealing with streaming data, Apache Flink is a good option. On a scale of one to ten, I would rate Apache Flink around seven to eight.
It's pretty good if you're solving a streaming type of problem. My experience is limited: I have only worked with Apache Storm a little bit and with Apache Flink. Between those, when it comes to streaming, Apache Flink wins hands down, but there are other products, like Apache Pulsar, that I know nothing about. So my perspective is very limited.
Lead Software Engineer at a tech services company with 5,001-10,000 employees
Real User
2020-10-13T07:21:29Z
Oct 13, 2020
Use it to get your feet wet with streaming or big data processing applications, and to understand the basic concepts of big data processing and how complex analytics or complications can be made simple. For example, if you want to analyze tweets or patterns, it's a simple use case where you just use flink-twitter-connector and provide that as your input source to Flink. The stream of random tweets keeps coming, and you can then apply your own grouping, keying, and filtering logic to understand those concepts. An important thing I learned while using Flink is that the basic concepts of windowing, transformations, and the DataStream API should be clear, or you should at least be aware of what is going to be used in your application; otherwise you might end up increasing the time rather than decreasing it. You should also understand your data, process, pipeline, and flow: is Flink the right candidate for your architecture, or is it overkill? It is flexible and powerful, is all I can say.
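To make the filtering, keying, and windowing concepts from this review concrete, here is a small sketch in plain Python. It is not the Flink DataStream API and does not use flink-twitter-connector; the function name, the `(timestamp, hashtag)` event shape, and the sample data are assumptions chosen for illustration. It counts events per key within fixed, non-overlapping (tumbling) windows.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds, keep=lambda key: True):
    """Filter events, key them, and count per (window_start, key).

    `events` is an iterable of (timestamp_in_seconds, key) pairs.
    Windows are fixed-size and non-overlapping (tumbling windows).
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        if not keep(key):  # filtering step
            continue
        # Assign the event to the window that contains its timestamp.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1  # keyed aggregation
    return dict(counts)


# Hypothetical (timestamp, hashtag) events standing in for a tweet stream.
events = [
    (0, "flink"), (3, "spark"), (7, "flink"),
    (12, "flink"), (14, "spark"), (19, "spark"), (21, "other"),
]
result = tumbling_window_counts(events, 10, keep=lambda k: k != "other")
for (window_start, key), count in sorted(result.items()):
    print(window_start, key, count)
```

A real stream is unbounded and may arrive out of order, which is where an engine's windowing and time semantics come in; this batch-style loop only shows how events are grouped.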
Sr Software Engineer at a tech vendor with 10,001+ employees
Real User
2020-10-13T07:21:29Z
Oct 13, 2020
We are very happy with the product, and we have been able to achieve all of the use cases that we are expected to deliver for our customers. Over time, I have seen many improvements including in the documentation. An example is that when we first started using this product, almost two years ago, there was no support available. At this point, we do not have much opt-in but we have some use cases to ensure that our system is not breaking. We have QA who can validate these things based on what is expected versus what we have done. My advice for anybody who is considering Flink is that it has very mature documentation and you can do what you want. It is a very good way to implement streaming pipelines and you won't have any problems. The biggest lesson that I have learned from using Flink is how we can customize the experience for the customer and how important it is to keep up with the industry. We don't want to be left behind. I would rate this solution a seven out of ten.
Software Architect at a tech vendor with 501-1,000 employees
Real User
2020-10-07T07:04:00Z
Oct 7, 2020
My advice would be to validate your use case. If you are already using a streaming mechanism, I suggest that you validate what your actual use cases are and what the advantages of Flink are. Make sure that the use case you are attempting can be done with Flink. If you're doing simple aggregation and you don't need to worry about message order, then what you have is fine; you can keep using Storm or whatever you are using. If you see features in Flink that are useful for you, then you should go for it. Validate your use case, validate your data and pipeline, do a small POC, and see if it is useful. If you think it's useful and worth a migration from your existing solution, then go for it. But if you don't already have a solution and Flink will be your first one, then it's always better to use Flink. The biggest lesson I have learned is that deployment using Kubernetes was a little bit difficult. We did not evaluate it when we started the work, so we migrated the code, but we did not take on the deployment part. If we had looked at the deployment part initially, we could have chosen Kafka Streams instead, because we were getting a similar result, and on the deployment side, Kafka Streams was easy. You don't need to worry about the cluster. I would rate Apache Flink an eight out of ten. I would have given it a nine or so if it weren't for the fact that deployment on Kubernetes is a little bit complicated.
Apache Flink is an open-source batch and stream data processing engine. It can be used for batch, micro-batch, and real-time processing. Flink is a programming model that combines the benefits of batch processing and streaming analytics by providing a unified programming interface for both data sources, allowing users to write programs that seamlessly switch between the two modes. It can also be used for interactive queries.
Flink can be used as an alternative to MapReduce for executing...
I rate the product an eight out of ten.
I would recommend Apache Flink to other users and rate it seven out of ten.
Today, Flink is the fastest Streaming solution. It is the core of an Azure and Google streaming offering today.
Yes, it depends on your use case. So if you can accept a little delay, go for Kafka or Spark because it is easier to find a team that knows Kafka.
I would recommend Apache Flink to others who are interested in using it. I would rate this solution an eight out of ten.