I believe Apache NiFi could be improved with easier, out-of-the-box provided monitoring solutions. While Apache NiFi has an API that generates logs, it would be beneficial to have simpler access to that data saved historically. It would assist in easily retrieving data for historical analysis and storing it elsewhere without the hassle of setting up APIs and delving into documentation. Just having a more streamlined approach to collecting this data would be greatly advantageous. I would suggest continuous improvements regarding the custom developer-built processors, as many times the errors that arise are not useful. We often seem to struggle with a combination of implementing our own error handling or analyzing logs, as the information does not always align or proves unhelpful. Continuous enhancement in this area would be wonderful, so we do not need to decipher which error is more accurate or which report gets us nearer to the actual problem. For instance, I encountered a situation where flow files would not process; they were retried but returned to the queue before the Python processor due to ambiguous errors. It eventually turned out that the issue was the flow files' size being too large for the Python processor, which we only discovered by splitting the flow files, at which point the issue resolved. The initial error did not indicate it was related to memory or size limitations but appeared as a parsing error or something similar.
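As a rough illustration of the historical collection described above, the sketch below polls NiFi's REST status endpoint and appends each snapshot to a JSON Lines file. It assumes an unsecured instance at `http://localhost:8080`. The output path and the helper names `fetch_controller_status` and `append_snapshot` are hypothetical, and a secured cluster would also need authentication that is not shown here.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

# Assumption: an unsecured NiFi instance at this address; adjust for your setup.
NIFI_STATUS_URL = "http://localhost:8080/nifi-api/flow/status"


def fetch_controller_status(url: str = NIFI_STATUS_URL) -> Dict[str, Any]:
    """GET the flow status and return its 'controllerStatus' object."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["controllerStatus"]


def append_snapshot(
    path: str,
    fetch: Callable[[], Dict[str, Any]] = fetch_controller_status,
    now: Optional[float] = None,
) -> Dict[str, Any]:
    """Append one timestamped status snapshot to a JSON Lines file."""
    record = {"ts": now if now is not None else time.time(), **fetch()}
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Calling `append_snapshot("nifi_status.jsonl")` from a scheduler such as cron would build up a history that can be loaded later for analysis. Passing a custom `fetch` callable keeps the storage step independent of how the status is obtained.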
Regarding improvements in Apache NiFi, there is scope; it should gear more towards an AI mechanism, especially for metadata generation. If I want to integrate Python code or embed Python code within the Apache NiFi parameters or workflows, it should come with AI integration that can assist in generating code based on user requirements. Gearing more towards a no-code mechanism will really enhance Apache NiFi's productivity, as customers would want that. As for other needed improvements, I think integration with other tools would really help in the age of AI. Apache NiFi should have APIs or connectors that can connect seamlessly to other external entities, whether in the cloud or on-premises, creating a plug-and-play mechanism.
Head of Data Engineering and AI Engineering at Coraline
Real User
Top 10
Apr 2, 2025
The logging system of Apache NiFi needs improvement. It is difficult to debug compared to Airflow, where task details and issues are clear. With Apache NiFi, I have encountered processes that die without any traceable error, which might relate to the inadequate logging system.
There are some areas for improvement, particularly with record-level tasks that take a bit of time. The quality of JSON data processing could be improved, as JSON workloads require manual conversions without a specific process. Enhancing features related to alerting would be helpful, including mobile alerts for pipeline issues. Integration with mobile devices for error alerts would simplify information delivery.
Engineering Lead - Cloud and Platform Architecture at a financial services firm with 1,001-5,000 employees
Real User
Oct 25, 2023
There is room for improvement in integration with SSO. For example, NiFi does not have any integration with SSO. If I want to set up some kind of role-based access control across the organization, that is not possible. So I have to create a separate username and password and then share it with each individual team. That is the pain point at the enterprise level.
Apache NiFi is slow to control and needs to be improved. I have to run many jobs and there are already large tables, which can make it difficult to control NiFi on time. There is no one to tell me when there is an incident and my server is down. When we manually start the NiFi process, it does not always start correctly. We can write scripts that run when a message is received from Airflow saying that the firewall is not running. Such a script automatically starts all servers, including the application servers, checks the status of all my NiFi processes, and sends a callback message with the results. I have written down all the processes that are monitored. When we do not control NiFi on time, all reports fail for the day. There are also extra features I'd like to see in future releases. For example, NiFi is not suitable for migration, and replication in NiFi is really not good: when you process ten years of data, you can't manage all the transactions. Moreover, the handling of monthly transactions is inadequate due to a lack of partitioning by date. When we grade a monthly ticket, we must process all the data and then rerun our ETL jobs. If possible, enhancing partitioning in NiFi would be beneficial.
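The status check and callback described in this review could look roughly like the sketch below. It is only an illustration: the function name `build_callback`, the shape of the `statuses` mapping (process name to run status), and the `"Running"` label are assumptions, and how the statuses are gathered and where the message is sent are left out.

```python
from typing import Dict, Iterable, List, Union


def build_callback(
    monitored: Iterable[str], statuses: Dict[str, str]
) -> Dict[str, Union[bool, List[str]]]:
    """Summarise which monitored processes are not running.

    `monitored` is the written-down list of process names to watch.
    `statuses` maps a process name to its reported run status; a name
    missing from the mapping is treated as not running.
    """
    not_running = sorted(
        name for name in monitored if statuses.get(name, "Missing") != "Running"
    )
    return {"healthy": not not_running, "not_running": not_running}
```

The returned dictionary is small enough to send back as the callback payload, so the caller can decide whether to trigger a restart or raise an alert.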
The use case templates could be more precise to typical business needs. The available templates and model workflows are very high-level, so they don't really match real needs. It would help to have templates that allow us to see business opportunities. It would also help to be able to copy a workflow to another device rather than having to ingest it.
The overall stability of this solution could be improved. In a future release, we would like to have access to more features that could be used in a parallel way. This would provide more freedom with processing.
Senior Solutions Architect / Software Architect at a comms service provider with 51-200 employees
Real User
Dec 22, 2020
The challenge with Apache NiFi is that it's not cloud-native. This makes it different from our workflow. The operations are over-complicated and when you build your pipeline, it's a nightmare to follow them. Then, as your pipeline or workflow becomes more complex, the operation of it gets worse. It is not easy to use and it requires a bigger ramp-up than any other solution that we have seen.
There should be a better way to integrate a development environment with local tools. Most of the development is done on the console. In Spark, for example, we can develop on our local desktop and then deploy to another environment. With this solution, that integration is not simple. Eventually, we can implement a local environment on our machines that is web-oriented, with a browser console to work in. At times, it is not easy to integrate with other components that could be part of the overall solution during development. There are issues with stability due to memory. It would be good to include a lock or an alarm that alerts you when it needs more resources. If the solution could be integrated more it would repair every part of the flow. Eventually, if we are integrated with other technology, we need to plan ahead and get the sizing right. We could implement an ecosystem that scales with the requirements. In the next release, I would like to see support for monitoring in the interface, as well as an integrated development environment.
Apache NiFi is an easy to use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
Apache NiFi is a very good tool, but there is room for improvement.
Apache NiFi is a good product as it is currently.
The tool should incorporate more tutorials for advanced use cases. It has tutorials for simple use cases.
More features must be added to the product. As compared to Kafka, the tool must be improved.