Informatica IDMC and Apache Airflow are leading players in the data management and workflow orchestration market. Informatica IDMC has the upper hand in MDM capabilities and breadth of features, while Apache Airflow excels in flexibility due to its open-source nature.
Features: Informatica IDMC stands out with its robust data integration, quality management, and flexible task-based workflow engine. Key features include data cleansing and validation, integration with PowerCenter, and support for flexible architectures. Apache Airflow is recognized for its flexibility in defining workflows programmatically using Python. It provides powerful orchestration of data pipelines and is highly adaptable due to its open-source nature.
Room for Improvement: Informatica IDMC could enhance integration, particularly with SAP, introduce preconfigured business rules, and improve user stewardship and reporting interfaces. Apache Airflow faces challenges with handling cyclic workflows and state management, requiring external state repositories, and needs better documentation and support for more programming languages.
Ease of Deployment and Customer Service: Informatica IDMC provides versatile deployment options across on-premises, hybrid, and public cloud environments, with customer service that is generally responsive, though replies are sometimes delayed. Apache Airflow, while easy to deploy, relies heavily on user expertise and community-driven support and lacks the formal customer service that Informatica offers.
Pricing and ROI: Informatica IDMC is noted for higher pricing due to its extensive features and scalability, targeted at large enterprises, potentially limiting small business adoption. Apache Airflow benefits from a cost advantage as an open-source solution, eliminating licensing fees and appealing to budget-conscious organizations, though ROI depends on effective open-source utilization.
Forums and community resources like Stack Overflow are helpful.
There is enough documentation available, and the community support is good.
Due to the tool's maturity limitations, solutions are not always simple and often require workarounds.
The solution is very scalable.
Apache Airflow scales well, especially when deployed in Kubernetes environments.
As a SaaS platform, IDMC is quite scalable and provides complete flexibility.
I would rate the stability of the solution as ten out of ten.
Apache Airflow is stable and I have not experienced significant issues.
Stability is crucial because IDMC holds business-critical data, and it needs to be available all the time for business users.
It is not suitable for real-time ETL tasks.
There is no dashboard for us to check all the Directed Acyclic Graphs (DAGs); a dashboard would help us analyze the work better.
The tool needs to mature in terms of category-specific attributes or dynamic attributes.
I prefer using the open-source version rather than the enterprise version, which helps manage costs.
Apache Airflow is a community-based platform and is not a licensed product.
IDMC is often described as the 'Ferrari of Master Data Solutions,' implying that while expensive, it is business-critical and, therefore, justified.
Apache Airflow is an open-source platform that allows easy integration with AWS, Azure, and Google Cloud Platform.
Reliability is good, and when integrated with Kubernetes, it performs better compared to on-premises environments.
The platform's ability to pull in data from other platforms without the need for an additional integration tool enhances its appeal.
Apache Airflow is an open-source workflow management system (WMS) used primarily to programmatically author, schedule, orchestrate, and monitor data pipelines and other workflows. You manage your data pipelines by authoring workflows as directed acyclic graphs (DAGs) of tasks. With Apache Airflow, you can orchestrate data pipelines over object stores and data warehouses, run workflows that are not data-related, and create and manage data pipelines as Python code.
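To make the idea of a workflow as a directed acyclic graph of tasks more concrete, here is a minimal sketch in plain Python. It does not use Airflow's own API; it relies only on the standard library's graphlib module (Python 3.9+) to order tasks so that each one runs after its upstream tasks. The task names (extract, transform, load) and the run_pipeline helper are hypothetical and chosen purely for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run every task once, after all of its upstream tasks have finished.

    tasks:        maps a task name to a callable taking a dict of upstream results
    dependencies: maps a task name to the set of task names it depends on
    """
    sorter = TopologicalSorter(dependencies)
    # list() forces the ordering to be computed; a cyclic graph raises CycleError.
    order = list(sorter.static_order())
    results = {}
    for name in order:
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
    return order, results


# Hypothetical three-step pipeline, for illustration only.
def extract(upstream):
    return [1, 2, 3]


def transform(upstream):
    return [value * 10 for value in upstream["extract"]]


def load(upstream):
    return sum(upstream["transform"])


TASKS = {"extract": extract, "transform": transform, "load": load}
DEPENDENCIES = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

if __name__ == "__main__":
    order, results = run_pipeline(TASKS, DEPENDENCIES)
    print(order)
    print(results["load"])

    # A dependency loop is not a valid DAG, so the ordering step rejects it.
    cyclic = {"extract": {"load"}, "transform": {"extract"}, "load": {"transform"}}
    try:
        run_pipeline(TASKS, cyclic)
    except CycleError:
        print("cycle detected")
```

The sketch only covers ordering and cycle rejection. A real orchestrator adds scheduling, retries, monitoring, and persistent state on top of this basic structure.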
Apache Airflow Features
Apache Airflow has many valuable key features. Some of the most useful ones include:
Apache Airflow Benefits
There are many benefits to implementing Apache Airflow. Some of the biggest advantages the solution offers include:
Reviews from Real Users
Below are some reviews and helpful feedback written by PeerSpot users currently using the Apache Airflow solution.
A Senior Solutions Architect/Software Architect says, “The product integrates well with other pipelines and solutions. The ease of building different processes is very valuable to us. The difference between Kafka and Airflow, is that it's better for dealing with the specific flows that we want to do some transformation. It's very easy to create flows.”
An Assistant Manager at a comms service provider mentions, “The best part of Airflow is its direct support for Python, especially because Python is so important for data science, engineering, and design. This makes the programmatic aspect of our work easy for us, and it means we can automate a lot.”
A Senior Software Engineer at a pharma/biotech company comments that he likes Apache Airflow because it is “Feature rich, open-source, and good for building data pipelines.”
Informatica Intelligent Data Management Cloud (IDMC) is a robust platform used by banks, financial institutions, and health sector organizations for data management, governance, and compliance.
IDMC provides comprehensive tools for data discovery, profiling, masking, and transformation. It supports Salesforce integration, real-time data streaming, and scalable data management solutions. Health organizations manage national product catalogs while financial entities focus on data protection and regulatory compliance. Its intuitive interface, flexible features, and robust tools make it valuable across sectors, though enhancements in data integration and human workflow are being sought.
What are the most important features?
What benefits and ROI should be considered?
Banks and financial institutions use IDMC for data masking, transformation, and compliance, while health sector organizations leverage it for national product catalogs. Industry applications focus on automating business processes, centralizing data, and managing data catalogs to meet regulatory demands and ensure data protection.