Lead of Monitoring Tech at an educational organization with 1,001-5,000 employees
Real User
Top 20
A good tool for managing data pipelines
Pros and Cons
  • "Since Apache works very well on Python, we can manage everything and create pipelines there."
  • "Adding more automated components in Apache Airflow for basic things like exporting the data would be helpful."

What is our primary use case?

We use Apache Airflow to send our data to a third-party system.

What is most valuable?

We are already on Python. Since Apache works very well on Python, we can manage everything and create pipelines there.

What needs improvement?

Adding more automated components in Apache Airflow for basic things like exporting the data would be helpful. Apache Airflow is not that easy to use, but we have gotten used to it.

For how long have I used the solution?

I have been using Apache Airflow for three years.

Buyer's Guide
Apache Airflow
March 2025
Learn what your peers think about Apache Airflow. Get advice and tips from experienced pros sharing their opinions. Updated: March 2025.
842,767 professionals have used our research since 2012.

What do I think about the stability of the solution?

Apache Airflow is a stable solution.

What do I think about the scalability of the solution?

Apache Airflow is not a scalable solution for our use cases. We have a very large list of use cases. Over 10 developers use Apache Airflow in our organization.

How are customer service and support?

Apache Airflow's technical support team is good and provides assistance almost 90% of the time.

How was the initial setup?

Apache Airflow's initial setup is easy. It's not that difficult, but it has a learning curve.

What's my experience with pricing, setup cost, and licensing?

Apache Airflow is a cheap solution.

What other advice do I have?

Depending on your use case, if you are looking for a quick solution to work on and know Python, you should go ahead with Apache Airflow.

Apache Airflow is a good enough tool for managing data pipelines. However, the solution falls short as you scale up and need higher performance. Apache Airflow has introduced the DAG connector for managing data pipelines.

Overall, I rate Apache Airflow an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
reviewer2108010 - PeerSpot reviewer
Associate Data Engineer at an outsourcing company with 201-500 employees
MSP
Connects to everything we need, but doesn't support development through the UI
Pros and Cons
  • "Development on Apache Airflow is really fast, and it's easy to use with the newer updates. Everything is in Python, so it's not hard to understand. They also have a graphical view, so if you are not a programmer and you are just an administrator, you can easily track everything and see if everything is working or not."
  • "Programmatically, it's very good, and it doesn't have any competitors, but you cannot develop anything in Airflow UI. You need to develop everything within the program. In the market, other tools have come up recently as competitors to Airflow, and they also give graphical programming options, whereas Airflow doesn't provide that feature currently. All the DAGs you want to build need to be coded in Python."

What is our primary use case?

We were using Apache Airflow for our orchestration needs. We used it for all the jobs that we had created in Databricks, Fivetran, or dbt. These were the three primary tools that we were using. There were a few others, but these were the three primary tools. So, Apache Airflow was for the job orchestration and connecting them to each other for building our entire data pipeline. We were also using Apache Airflow for dbt CI/CD purposes.

What is most valuable?

The most valuable feature is that it's the most popular data orchestration tool in the market right now. It connects to everything you need.

It's open-source. You have a lot of documentation and a lot of people helping out. It has large communities, so if you need something or you want to ask something, you can. Often, someone else would have already asked that question, and they would have already got the answer, and you can just look it up.

Development on Apache Airflow is really fast, and it's easy to use with the newer updates. Everything is in Python, so it's not hard to understand. They also have a graphical view, so if you are not a programmer and you are just an administrator, you can easily track everything and see if everything is working or not. For notifications, it can connect with different messaging tools such as Slack and Teams, as well as with webhooks. It's very easy to use, and it has a lot of features that you would expect from any of the data orchestration tools.
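
As a rough illustration of the webhook-style notifications mentioned above, the following sketch builds (but does not send) a JSON POST request using only Python's standard library. It is not Airflow's own notification API. The URL, the `build_alert` helper, and the DAG and task names are all hypothetical placeholders, and the `{"text": ...}` body follows the shape commonly accepted by incoming webhooks in chat tools.

```python
import json
import urllib.request

# Placeholder URL (assumption): replace with the incoming-webhook URL
# issued by your messaging tool.
WEBHOOK_URL = "https://example.com/hooks/placeholder"


def build_alert(dag_id, task_id, state):
    """Build (but do not send) a JSON POST request describing a task's state."""
    message = f"DAG {dag_id}: task {task_id} is {state}"
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical DAG and task names, used only for illustration.
request = build_alert("daily_etl", "load_warehouse", "failed")

# Sending is left commented out so the sketch runs without network access:
# urllib.request.urlopen(request, timeout=10)
```

In a real deployment, a function like this would typically be called from a failure callback, so that an administrator is alerted without having to watch the UI.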

What needs improvement?

Programmatically, it's very good, and it doesn't have any competitors, but you cannot develop anything in Airflow UI. You need to develop everything within the program. In the market, other tools have come up recently as competitors to Airflow, and they also give graphical programming options, whereas Airflow doesn't provide that feature currently. All the DAGs you want to build need to be coded in Python. It doesn't provide features for graphical programming. You cannot drag and drop something, build a pipeline out of that, or orchestrate that with a drag and drop. They have a graphical feature but only for administration purposes, not for development. They don't have a UI for development.

It doesn't support the Windows system. That's a big drawback because a lot of people are using Windows. 

For how long have I used the solution?

I used Apache Airflow on my previous project. We had planned to use it in our current project, but due to time issues, we were not able to deploy it. In my previous project, I used it for around eight or nine months.

What do I think about the stability of the solution?

It's a very stable product.

What do I think about the scalability of the solution?

It's highly scalable. You can scale it as much as you want. It depends on the size, and you need to scale up your instance. We had over 3,000 DAGs in our previous project, and we didn't face any issue with even 8 GB memory in our EC2 instance. If you have a lot of DAGs, you might need to scale up, but it's quite lightweight, so you don't need to worry much about that.

How are customer service and support?

It's open source. It was my first project, and I had a few doubts, but everything I needed was available on the internet, so I never had to contact their support. I might have been able to post my questions on their GitHub, but I didn't need that. Airflow has a very large community, so any questions you ask get answered there.

How was the initial setup?

Its setup wasn't done by us. It was done by the Astronomer team on Azure Community Services. So, it was deployed and set up on Azure Community Service. Everything was taken care of by the Astronomer team.

What about the implementation team?

Apache Airflow has two large and popular distributors. There might be others, but the two popular ones are Bitnami and Astronomer. For us, everything was set up by Astronomer.

What's my experience with pricing, setup cost, and licensing?

It's open source. You can install it locally on your own system. If you are deploying it in the production system, you normally deploy it on some cloud, such as EC2 service, which would have some cost. If you are setting up a Docker container or something for Apache Airflow yourself, which is quite easy, you can find pretty much everything you need online. I have set it up on my local system, and it doesn't take a long time. You can customize it for your project, for example by selecting a different repository database or different Celery or web services, which is good.

If you are going with a service provider such as Astronomer or Bitnami, they will charge you because they are a distributor of Airflow. They have some of their own features and their own support. They will charge you if you are going with them.

What other advice do I have?

If you are on a Mac or Linux system, it's very easy to install. You can just go to the Apache website to install it, and you can start working. However, Apache Airflow doesn't support installation through a Windows .exe, so some knowledge of Docker containers or WSL will be useful.

Other than that, Astronomer has an instructor called Marc Lamberti who is very popular in the Airflow community. He has YouTube videos. In five minutes, he can teach you how to set up Airflow or what DAGs are. He has five or six videos, and he gets into the details with his videos. So, if you have no idea about Apache Airflow and you don't want to go through all the documentation, you can start with those videos, but if you have a Mac or Linux system, you can directly install it on your system.

I'd rate it a seven out of ten because it doesn't support Windows, and it doesn't support graphical designing, so we cannot create DAGs in the UI. We can administer and look at DAGs through the UI, but we cannot create DAGs through the UI. Other orchestration tools that are available in the market provide that feature.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
VenugopalKathirvel - PeerSpot reviewer
Senior Member Of Technical Staff, Engineering Operations at VMware
Real User
Flexible open-source solution
Pros and Cons
  • "Apache Airflow's best feature is its flexibility."
  • "Apache Airflow could be improved with the addition of more frameworks."

What is most valuable?

Apache Airflow's best feature is its flexibility.

What needs improvement?

Apache Airflow could be improved with the addition of more frameworks.

For how long have I used the solution?

I've been using Apache Airflow for four years.

What do I think about the stability of the solution?

Apache Airflow is stable.

What do I think about the scalability of the solution?

Apache Airflow is scalable.

How was the initial setup?

The initial setup was very easy.

What about the implementation team?

We used an in-house team.

What's my experience with pricing, setup cost, and licensing?

Apache Airflow is open-source and free of charge.

What other advice do I have?

I would rate Apache Airflow eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Solution Architect at EPAM Systems
Real User
Top 5 Leaderboard
Simple to automate using Python, but code does not cover all data warehousing tasks
Pros and Cons
  • "This is a simple tool to automate using Python."
  • "We need to develop our workflow description and notations because out of the box, Apache Airflow does not provide some features that are needed."

What is our primary use case?

The primary use case for this solution is to automate the ETL process for a data warehouse.

What is most valuable?

The most valuable feature is the UI for automation. One can monitor all ETL processes on a single screen. Complex workflows are shown as SVG images of DAGs.

This is a simple tool to automate using Python.

What needs improvement?

There are some drawbacks to this solution. The code does not cover all tasks in the data warehouse automation process. Currently, in production, we have a large installation with a complex workflow that includes hundreds of tasks. Most of them are dispatched by the existing engine, but not all.
For example, sometimes we need to create cycles in our workflow, but we are not able to because Airflow supports only Directed Acyclic Graphs (DAGs).
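
To illustrate why a scheduler built on directed acyclic graphs cannot express a loop, here is a minimal sketch that uses Python's standard-library `graphlib` module rather than Airflow itself. The task names and the `execution_order` helper are hypothetical. A run order exists only while the dependency graph has no cycle.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical ETL tasks: each key maps to the set of tasks it depends on.
acyclic = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(dependencies):
    """Return a valid run order, or None if the graph contains a cycle."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None


order = execution_order(acyclic)

# Making "extract" depend on "load" closes a loop, so no valid order exists.
cyclic = {**acyclic, "extract": {"load"}}
cyclic_order = execution_order(cyclic)
```

Repeating work therefore has to be modeled differently, for instance by rerunning the whole graph on a schedule rather than by adding an edge that points backwards.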

We need to develop our workflow description and notations because out of the box, Apache Airflow does not provide some features that are needed. It is our understanding that it is limited by design.

We will wait for the upcoming 2.0 version, as it is expected to be much more mature than the 1.8-1.10 versions. We believe that it will be better.

There should be some improvement made to the Doc Management features from within the UI. They should think about Outlook integration, which should be out of the box, and the object model should be expanded to support cyclic graphs inside the workflow.

For how long have I used the solution?

We have been using this solution for eighteen months.

What do I think about the stability of the solution?

This solution is not very stable. There are a number of configuration issues.

What do I think about the scalability of the solution?

This solution is scalable. We use this solution in a single node, but it is possible to have a  cluster of workers.

It can be used for one or two thousand related tasks and should be done in a cluster configuration. 

We don't use a cluster, rather we only use single nodes. It is sufficient for our tasks. Tasks are long and the parallelism is limited by the database engine, and not by the workflow engine. 

We would like to evaluate clusters in the future.

We are using the Cron Task scheduling feature for Apache Airflow. Users can configure the Apache Airflow themselves. There are up to ten users that can configure Apache Airflow.

This is a part of the wage solution, and it is the initial point of the wage slot process. The wage solution has hundreds of users.

How are customer service and technical support?

We don't use any paid technical support, as it is an open-source solution. We have used Stack Overflow and other open information sources, but we know that some companies provide technical support. 

Having studied the solutions these companies publish on the internet, my understanding is that we are at a fairly high level ourselves and could provide commercial support.

We don't use any support from commercial companies, but we have found some very useful recent solutions in, for example, the Apache Airflow GitHub repository.

Which solution did I use previously and why did I switch?

Previously, we used Control-M for a short period. It was a solution used by our customers, and we needed to understand their difficulties and the results. 

For small to medium-scale tasks, Apache Airflow could be a substitute for Control-M.

How was the initial setup?

The deployment model we used was through a private cloud. It was a private installation on Google Cloud.

What about the implementation team?

In-house team.

What was our ROI?

It is being measured just now. Precise data is expected in three to four months. The first conclusion is a positive ROI.

What's my experience with pricing, setup cost, and licensing?

There are no costs associated with this solution. Apache Airflow is a free solution that can be downloaded and ready for use at any moment.

Which other solutions did I evaluate?

Our tasks can be automated by simple Jenkins, but our customer wanted to implement it on Apache Airflow. This was a solution used by our customer.

Apache Airflow is mainstream and everyone wants to use it. Google provides Apache Airflow as part of the Cloud services.

What other advice do I have?

My advice would be to use this solution for simple tasks. 

They should have a Python expert for features that are not available out of the box, as it is not enough. 

It could be a good solution for enterprise workflow automation and solutions like Control-M within the next two to three years.

We are happy and satisfied with this solution, but not fully satisfied, as this solution has some positive and negative aspects.

I would rate this solution a seven out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Luiz Cesar Gosi - PeerSpot reviewer
Senior Analytics Engineer at TalkDesk
Real User
Top 5 Leaderboard
A useful tool for data orchestration and collecting information
Pros and Cons
  • "The solution's UI allows me to collect all the information and see the code lines."
  • "I have some issues with the solution's communication."

What is our primary use case?

We use Apache Airflow for data orchestration.

What is most valuable?

Apache Airflow is a pretty useful tool for collecting information. Apache Airflow is a pretty easy solution that can be used with Python. The solution's UI allows me to collect all the information and see the code lines.

What needs improvement?

I have some issues with the solution's communication. The solution uses the same database or data set. Sometimes, we consume the same data and send it to a different place when doing a different DAG. When using the UI, I want to see that we use the same data set more than once.

For how long have I used the solution?

I have been using Apache Airflow for five years.

What do I think about the stability of the solution?

I rate Apache Airflow a seven out of ten for stability.

What do I think about the scalability of the solution?

I rate Apache Airflow an eight out of ten for scalability. Around 400 users are using the solution in our organization.

Which solution did I use previously and why did I switch?

I previously used Control-M and some AWS and Google Cloud Platform tools.

How was the initial setup?

Apache Airflow's initial setup is pretty straightforward. Apache Airflow is quite intuitive to set up and create DAGs.

What about the implementation team?

It takes around two days to deploy Apache Airflow. A DAG can be created in just a few hours.

What other advice do I have?

Apache Airflow is deployed on-cloud in our organization.

Overall, I rate Apache Airflow a nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Fadi Bathish - PeerSpot reviewer
Project Manager at Siren Analytics
Real User
Very stable, easy to learn, and quite configurable
Pros and Cons
  • "The solution is quite configurable so it is easy to code within a configuration kind of environment."
  • "The dashboards could be enhanced."

What is our primary use case?

We use this solution to monitor BD tasks.

What is most valuable?

The solution is quite configurable so it is easy to code within a configuration kind of environment. 

The ease of learning and using the solution is quite good. The learning curve is low so new users can learn in a short period of time in comparison to other products. 

What needs improvement?

The following should be improved:

  • Dashboards
  • Security
  • Telemetry for logging, monitoring, and alerting purposes
  • Documentation 

For how long have I used the solution?

I have used the solution for six months. 

What do I think about the stability of the solution?

The solution is 99% stable. We have a few glitches here and there but have been able to fix them. 

What do I think about the scalability of the solution?

The solution is quite scalable. You can grow in terms of users and environment. You can grow to multi-server applications. You can use the solution on desktops, mobile, or other devices. 

How are customer service and support?

We have an internal tech support team so have not needed support from the vendor. 

How was the initial setup?

The setup is straightforward. The time for deployment depends on the environment and user base.

What about the implementation team?

We implement the solution in-house. We have one implementation with 60 users and another with 75 users. 

We have a tech support team that consists of ten engineers who support implementations. They follow up on issues that might arise during the process automation or implementation of the workflow itself. 

For example, our tech support team will resolve a workflow that gets stuck during the MDM workflow engine. The tech team has the knowledge base to resolve any of these issues. 

What's my experience with pricing, setup cost, and licensing?

The solution is open source.

What other advice do I have?

I do not have exposure to use cases for large organizations with a huge user environment, so I cannot speak to the solution's effectiveness in these scenarios. 

I rate the solution an eight out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Anandhavelu Arumugam - PeerSpot reviewer
Technical Lead at a media company with 5,001-10,000 employees
Real User
Useful for scheduling purposes but should include no-code capabilities
Pros and Cons
  • "It's stable."
  • "I would like to see some no-code capabilities and drag and drop abilities in Airflow."

What is our primary use case?

I use this solution for scheduling purposes. We have our own Python framework to run jobs, do the extractions, and for transformation loading.

We have 20 people who are using Airflow. It's being used on a daily basis. We don't have any plans to increase usage because we have low data sets.

The solution is deployed on cloud. The cloud provider is Azure.

What needs improvement?

Everything is in the Python framework now. I would like to see some no-code capabilities and drag and drop abilities in Airflow.

We're expecting a few more improvements in the log generator. Currently, it's very clumsy.

For how long have I used the solution?

I have used Apache Airflow for three years.

What do I think about the stability of the solution?

It's stable.

What do I think about the scalability of the solution?

It's scalable. So far, we haven't needed more scalability because it's totally controlled by administrators.

Which solution did I use previously and why did I switch?

The only difference between Apache Airflow and BPM software is the pricing.

How was the initial setup?

Setup is about medium difficulty. You need to have some prior knowledge and experience with docker containers and AKS.

What's my experience with pricing, setup cost, and licensing?

It's open-source.

What other advice do I have?

I would rate this solution as seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Mahendra Prajapati - PeerSpot reviewer
Senior Data Analytics at a media company with 1,001-5,000 employees
Real User
A customizable solution, but the integration process could be simplified
Pros and Cons
  • "The best feature is the customization."
  • "The solution could be improved by simplifying the integration process."

What is our primary use case?

Our primary use case for this solution is scheduling task rates. We capture the data from the SQL Server location and migrate it to the central data warehouse.

What is most valuable?

The best feature is the customization that can be done using Python. For example, there are use cases where we have to tweak the algorithm and with Apache Script Rate, we have extra functionality that helps to change the underlying process. We can define our algorithms and processes using Python.

What needs improvement?

The solution could be improved by simplifying the integration process and providing access to its support team to guide integration.

For how long have I used the solution?

We have been using this solution for two months and it is deployed on-premises.

What do I think about the stability of the solution?

The solution is stable but primarily depends on the support team and how they manage it.

What do I think about the scalability of the solution?

Apache Airflow is scalable. Approximately 20 people use this solution on my team.

How are customer service and support?

We haven't had any experience with customer service and support.

Which solution did I use previously and why did I switch?

Previously, we were using SQL Server integration tools and SQL Server Integration Services (SSIS) packages. We had project orders and wanted to migrate everything, as Apache Airflow was open source and no license was required. We switched to Apache Airflow because we are trying to migrate all the projects developed in SSIS to Python.

How was the initial setup?

The initial setup was straightforward. However, if a script is written, it takes four to five minutes to set up.

What's my experience with pricing, setup cost, and licensing?

Apache Airflow is open source, so I cannot comment on licensing costs.

Which other solutions did I evaluate?

We chose this solution because it was suitable for our business needs.

What other advice do I have?

I rate this solution a seven out of ten. My advice to new users is to be proficient in Python. The solution is good but can be improved by simplifying its integration process.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Apache Airflow Report and get advice and tips from experienced pros sharing their opinions.
Updated: March 2025