There is room for improvement in SSO integration. For example, NiFi does not have any integration with SSO, so if I want to set up role-based access control across the organization, that is not possible. Instead, I have to create a separate username and password and share it with each individual team. That is a pain point at the enterprise level.
Apache NiFi is slow to control, and this needs to be improved. I run many jobs against tables that are already large, which makes it difficult to get NiFi under control in time. Nothing tells me when there is an incident and my server is down. When we start the NiFi process manually, it does not always start correctly. We can write scripts that run when a message arrives from Airflow saying that the firewall is not running. Such a script automatically starts all servers, including the application servers, checks the status of all my NiFi processes, and sends a callback message with the results. I have written down all the processes that are monitored. When we do not get NiFi under control in time, all reports fail for the day. In future releases, there are extra features I'd like to see. For example, NiFi is not suitable for migration, and its replication is really not good: when you process ten years of data, you cannot manage all the transactions. The handling of monthly transactions also falls short because there is no partitioning by date. When we grade a monthly ticket, we must process all the data and then rerun our ETL jobs. If possible, better partitioning in NiFi would be beneficial.
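The recovery script described above (check every monitored process, restart what is down, send a callback with the results) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the process names, and the injected `is_running` and `restart` callables are all hypothetical, and nothing here calls a real NiFi or Airflow API.

```python
import json
from typing import Callable, Dict, Iterable, List


def check_processes(names: Iterable[str],
                    is_running: Callable[[str], bool]) -> Dict[str, bool]:
    """Return a name -> running? map for every monitored process.

    `is_running` is a hypothetical probe supplied by the caller
    (for example, a wrapper around a status command).
    """
    return {name: is_running(name) for name in names}


def restart_failed(statuses: Dict[str, bool],
                   restart: Callable[[str], None]) -> List[str]:
    """Call the hypothetical `restart` hook for each process that is down."""
    restarted = []
    for name in sorted(statuses):
        if not statuses[name]:
            restart(name)
            restarted.append(name)
    return restarted


def build_callback(statuses: Dict[str, bool]) -> str:
    """Build the JSON callback message summarizing the check results."""
    down = sorted(name for name, ok in statuses.items() if not ok)
    return json.dumps(
        {"checked": len(statuses), "down": down, "healthy": not down},
        sort_keys=True,
    )
```

In a setup like the one described, the orchestrator would trigger this script, and the string returned by `build_callback` would be sent back as the result message.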
The use case templates could be better aligned with typical business needs. The available templates and model workflows are very high-level, so they don't really match real needs. It would help to have templates that let us see business opportunities. It would also help to be able to copy a workflow to another device rather than having to ingest it.
The overall stability of this solution could be improved. In a future release, we would like access to more features that can be used in parallel, which would give us more freedom in processing.
Senior Solutions Architect/ Software Architect at a comms service provider with 51-200 employees
Real User
2020-12-22T17:16:36Z
Dec 22, 2020
The challenge with Apache NiFi is that it's not cloud-native, which sets it apart from our workflow. The operations are over-complicated, and when you build a pipeline, it's a nightmare to follow them. As the pipeline or workflow becomes more complex, operating it gets worse. It is not easy to use, and it requires a bigger ramp-up than any other solution we have seen.
There should be a better way to integrate a development environment with local tools. Most of the development is done in the console. In Spark, for example, we can develop on a local desktop and then deploy to another environment. With this solution, that kind of integration is not simple. Eventually, we can set up a local, web-oriented environment on our machines and use a browser console for it. At times, it is not easy to integrate with other components that could be part of the overall solution during development. There are also stability issues related to memory. It would be good to include a lock or an alarm that detects when it needs more resources and alerts you. If the solution were better integrated, it would repair every part of the flow. When we integrate with other technology, we need to plan ahead and get the sizing right, so that we can build an ecosystem that scales with the requirements. In the next release, I would like to see monitoring supported in the interface, as well as an integrated development environment.
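The kind of memory alarm asked for above could look something like the following minimal sketch. The function name, the 85% default threshold, and the message wording are assumptions made for illustration. How the usage figures are obtained is left to the caller, and this is not an existing NiFi feature.

```python
from typing import Optional


def memory_alert(used_bytes: int, max_bytes: int,
                 threshold: float = 0.85) -> Optional[str]:
    """Return an alert message when usage crosses the threshold, else None.

    The threshold is a fraction of the configured limit; 0.85 is an
    arbitrary illustrative default, not a recommended value.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    ratio = used_bytes / max_bytes
    if ratio >= threshold:
        return f"Memory at {ratio:.0%} of limit; consider adding resources"
    return None
```

A scheduler could call this periodically and forward any non-empty result to whatever alerting channel the team already uses.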
Apache NiFi is an easy to use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
The tool should include more tutorials for advanced use cases. At present, it has tutorials only for simple use cases.
More features must be added to the product. Compared to Kafka, the tool needs to be improved.