As a DevOps engineer, my day-to-day task is to move files from one location to another, doing some transformation along the way. For example, I might pull messages from Kafka and put them into S3 buckets, or move data from a GCS bucket to another location. NiFi is a great fit for this because of its monitoring and metrics capabilities. When I design a pipeline in NiFi, I can see how much data is being processed, where it is at each stage, and what the total throughput is. I can see all the metrics for the complete pipeline, so I personally like it very much.
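To illustrate the kind of movement described above, here is a minimal Python sketch of the Kafka-to-S3 pattern outside of NiFi (inside NiFi this is typically wired as a ConsumeKafka processor feeding PutS3Object). The broker address, topic, and bucket names are placeholder assumptions, not values from this review.

```python
# Minimal sketch of the Kafka -> transform -> S3 pattern, using
# confluent-kafka and boto3. Broker, topic, and bucket are placeholders.
import json
import boto3
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # assumed broker address
    "group.id": "kafka-to-s3-mover",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events"])                # assumed topic name

s3 = boto3.client("s3")

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        record = json.loads(msg.value())      # light transformation step
        record["processed"] = True
        key = f"events/{msg.topic()}/{msg.offset()}.json"
        s3.put_object(Bucket="my-landing-bucket",   # assumed bucket name
                      Key=key,
                      Body=json.dumps(record).encode("utf-8"))
finally:
    consumer.close()
```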
One example is how Apache NiFi has helped us create data pipelines to migrate data from Oracle to Postgres, Oracle to Oracle, or Oracle to MinIO, as well as to other targets such as relational databases, NoSQL databases, and object storage. We create templates for these pipelines so that we can easily reuse them across data migration projects. For example, we have a template for migrating data from Oracle to Postgres that uses an incremental load process. The template also checks the source and destination databases for compatibility and applies any necessary data transformations. If the data volume is no more than ten terabytes, we mostly use NiFi, but for heavier table setups I don't use it for customer or enterprise solutions.
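To show the incremental-load idea behind such a template, here is a rough Python sketch of watermark-based loading from Oracle to Postgres (in NiFi this is usually handled by a processor such as QueryDatabaseTable tracking a maximum-value column). The table names, columns, and connection details below are illustrative assumptions.

```python
# Sketch of the incremental-load pattern: pull only rows newer than the
# last high-water mark, upsert them into Postgres, then advance the mark.
# Table, column, and connection values are assumptions for illustration.
import oracledb
import psycopg2

def incremental_load(last_watermark):
    # last_watermark is a datetime persisted from the previous run
    src = oracledb.connect(user="etl", password="secret", dsn="oracle-host/ORCLPDB1")
    dst = psycopg2.connect("dbname=target user=etl password=secret host=pg-host")

    with src.cursor() as read_cur, dst.cursor() as write_cur:
        read_cur.execute(
            "SELECT id, payload, updated_at FROM source_table WHERE updated_at > :wm",
            wm=last_watermark,
        )
        new_watermark = last_watermark
        for row_id, payload, updated_at in read_cur:
            write_cur.execute(
                """INSERT INTO target_table (id, payload, updated_at)
                   VALUES (%s, %s, %s)
                   ON CONFLICT (id) DO UPDATE
                   SET payload = EXCLUDED.payload, updated_at = EXCLUDED.updated_at""",
                (row_id, payload, updated_at),
            )
            new_watermark = max(new_watermark, updated_at)
        dst.commit()

    src.close()
    dst.close()
    return new_watermark  # persist this for the next run
```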
Our company uses the solution to ingest raw data. We have five repositories with a huge amount of data. We normalize the data into previously defined structured files, prepare it, and ingest it into devices. The size of any project team depends on the workflow and management activities, but it typically includes two to five users.
Senior Technology Architect at a tech services company with 10,001+ employees, Mar 18, 2021
I use Apache NiFi to build workflows. It handles events used for distributed messaging: messages come into a Kafka broker topic, you pick them up from the topic, transform them, and either send them to other systems for storage or return them to Kafka for delivery to consumers.
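As a sketch of that consume-transform-republish flow, the following Python snippet shows the same pattern with a plain Kafka client; the topic names and broker address are assumptions for illustration only.

```python
# Sketch of the consume -> transform -> republish pattern described above.
# Broker address and topic names are placeholder assumptions.
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "workflow-transformer",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["incoming-events"])
producer = Producer({"bootstrap.servers": "localhost:9092"})

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        event = json.loads(msg.value())
        event["normalized"] = True            # the "transform" step
        # hand the transformed message back to Kafka for downstream consumers
        producer.produce("outgoing-events", value=json.dumps(event).encode("utf-8"))
        producer.poll(0)                      # serve delivery callbacks
finally:
    producer.flush()
    consumer.close()
```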
The primary use case is to collect data from different source systems. This includes different types of files, such as text files, binary files, and CSV files, to name a few. It is also used for API training. It works for large amounts of data and is oriented toward endpoint solutions with both high- and low-frequency delivery of small packets of data, such as files. It also works well when integrated with Spark; the two are complementary in some use cases. At times we work only with NiFi, at times only with Spark, and at other times with the two integrated.
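To illustrate how the NiFi and Spark roles can complement each other, here is a small PySpark sketch that processes files a NiFi flow might have landed in a staging directory; the directory path, column name, and output location are assumptions.

```python
# Sketch of the NiFi + Spark split described above: NiFi lands raw files
# (text, binary, CSV) in a staging directory, and Spark does the heavier
# processing. Paths and the column name are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("landing-zone-batch").getOrCreate()

# Read everything dropped into the (assumed) landing directory.
df = spark.read.csv("/data/landing/csv/*.csv", header=True)

# Typical downstream work Spark is better suited for: aggregation at scale.
summary = (
    df.groupBy("source_system")            # assumed column name
      .agg(F.count("*").alias("row_count"))
)
summary.write.mode("overwrite").parquet("/data/curated/ingest_summary")

spark.stop()
```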
Apache NiFi is an easy-to-use, powerful, and reliable system to process and distribute data. It supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.
We use the tool to transfer data from one service to another. It helps us to migrate data from one department to another.
We use the solution for data streaming.