We use the Azure Data Lake to store raw customer files exported from their databases. Our pipelines then pick up this data and process it in various ways. For instance, we use Databricks to handle the data processing, transformation, and ETL tasks. The processed data is then stored in SQL Server or converted into other file formats.
VP - Global Delivery Head at Enhops
Has security access policies and helps to store customer data
Pros and Cons
- "The tool's most valuable feature is security access policies."
- "The solution needs to improve APIs and make them more accessible."
What is our primary use case?
What is most valuable?
The tool's most valuable feature is security access policies.
What needs improvement?
The solution needs to improve APIs and make them more accessible.
What do I think about the scalability of the solution?
The tool is scalable. We currently manage around 200 GB of data, and we recently optimized and synchronized approximately 500 GB of additional data. My company has 1,000 to 2,000 users who use it weekly.
Buyer's Guide
Azure Data Lake Storage
December 2024
Learn what your peers think about Azure Data Lake Storage. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
824,067 professionals have used our research since 2012.
How are customer service and support?
We've had to seek help several times, particularly with Power BI and Databricks integrations. However, the initial level of support hasn't always been as knowledgeable as we'd like. They usually gather information and then schedule follow-up appointments, which can take a few days. Improving the expertise and responsiveness of the first-level support team could speed up the resolution process.
How was the initial setup?
We use templates for deployment and automate the process.
What's my experience with pricing, setup cost, and licensing?
The tool's licensing model is pay-as-you-go. Regarding pricing, there's always competition between Azure Data Lake Storage and AWS. They're quite similar, but to attract more customers, Azure Data Lake Storage could consider adjusting its pricing to be more competitive with AWS. This might make Azure Data Lake Storage a more appealing option for users.
What other advice do I have?
The tool is the best platform for storing all kinds of data; I've never experienced any downtime with it. Plus, it offers secure access and security features, which I appreciate. I rate the product a nine out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Apr 17, 2024
Cloud Architect at Devies Cloud & Engineering
Offers good safety to users and is impossible to hack into
Pros and Cons
- "The tool is very safe to use. It is impossible to hack the product."
- "If tools like Azure Data Lake Storage are enabled within the tool named Azure Storage Explorer, then it would be of tremendous help, but it can be really tricky."
What is our primary use case?
From my perspective, it is the most secure transfer storage out there.
As for how customers use Azure Data Lake Storage for data analytics and processing workflows, my role is to convince customers that it is the safest tool for storing data. With Azure Storage Explorer, you can securely connect to remote regions and safely transfer data from your existing on-premises storage to the Azure cloud.
What is most valuable?
The solution's most valuable features are the tools and functions, which are primarily hosted in Azure Storage Explorer. However, you can also facilitate them from within the backup. The tool is very safe to use. It is impossible to hack the product.
What needs improvement?
Some customers residing in former Eastern European countries operate in an independent and very weak IT environment. If tools like Azure Data Lake Storage are enabled within the tool named Azure Storage Explorer, then it would be of tremendous help, but it can be really tricky. It would be great if some of the aforementioned features could be enabled, but I fully understand the complexities involved.
I liked Azure Data Lake Storage before, and I like that it is now improving rapidly. Microsoft is investing heavily in assets, resources, and top personnel, along with a lot of money, to make storage a bigger part of Azure. Azure Data Lake Storage could become a threat to the large storage products.
For how long have I used the solution?
I have been using Azure Data Lake Storage for a couple of years. I am an Azure solution architect. I work with Azure Data Lake Storage Gen2.
What do I think about the stability of the solution?
I am very confident that it is a stable solution. Stability-wise, I rate the solution a ten out of ten.
There was a major issue during mid-October, which affected many global businesses just for a few hours. It was the biggest issue with the tool I had been involved in for many years.
What do I think about the scalability of the solution?
Scalability-wise, I rate the solution a ten out of ten.
Azure Data Lake Storage's scalability has greatly influenced our customers' data storage strategy, since it offers a choice of disk, scale-out, and DRC options, making everything fantastic.
All customers I have worked with over the last year are using the tool and assigning me to look after it, so it could be seven or eight businesses over the last three years, some of which are global leaders in the market, having over 12,000 employees globally.
How was the initial setup?
On a scale of one to ten, where one is difficult and ten is easy, I rate the product's initial setup phase as ten. You can be lucky and complete the setup by clicking a few buttons, but you really need to understand what you are doing. Knowing the setup phase well can make the product cost-effective; if you don't, it can be costly. Combining the tool with Azure Cost Management can make things much easier. The upcoming edition of Microsoft Copilot should make everything in the tool much easier and make other things less expensive.
I was not directly involved in the product's deployment process, but I am subscribed to all channels associated with the deployment part, and I have many friends in Microsoft in Northern Europe, and in Sweden, where I live. Storage is one of my top skills, and my friends want to help me become a champion.
The solution is deployed on the cloud and in the on-premises version.
What's my experience with pricing, setup cost, and licensing?
From one to ten, where one is cheap and ten is expensive, I rate the product price as five. It costs money, but it is cheaper by at least thirty percent if you consider the other equivalent solutions from AWS. Considering the aforementioned perspective, the tool is cheap, but you have to pay a certain amount.
There are options to choose from depending on the subscription you have and the amount of features you consume from Microsoft, so it can vary quite a bit.
What other advice do I have?
For my organization, the most valuable part of the tool stems from a variety of features within SQL tables and also data, which is a combination of Azure Data Lake Storage and Azure Blob Storage. Azure Data Lake Storage Gen2 is the best, and I think it is fantastic. From my point of view, the tool is considered to be very competitive here.
I have been working with storage tools for over twenty years, so I am developing my skill sets related to cloud solution providers, mainly on Azure since I began with that in 2008.
Take a deep dive into all the possibilities and options you get because you won't be disappointed. You need to do comparisons with other CSPs and other storage vendors, like NetApp and Dell EMC.
I rate the tool a ten out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
Disclosure: My company has a business relationship with this vendor other than being a customer: consultant
Last updated: Jun 23, 2024
Senior Manager at IT Squad
A cost-effective solution to store data and allows flexible capacity management
Pros and Cons
- "The version was a bit outdated compared to the newer Microsoft Data Fabric offerings."
What is our primary use case?
We use the solution for storing data but don’t use Synapse to store data directly in it. Instead, Azure Synapse Analytics is utilized to analyze and process data in Data Lake Storage. Data Lake Storage is a large, scalable solution that handles extensive volumes of structured and unstructured data rather than a direct disk storage system.
What needs improvement?
In Azure Data Lake Storage, the tool we're using, Spark, handles the management, storage, retrieval, and organization of data. Spark employs its algorithms to abstract the underlying complexities. We don’t work with a large amount of data. If we were to handle larger datasets, we would need to focus more on optimizing storage and retrieval processes, as the efficiency of these operations would become more critical.
The version was a bit outdated compared to the newer Microsoft Data Fabric offerings. For instance, the directory services are already available in Data Fabric, so I don't think adding them to Azure Data Lake Storage would be necessary. For example, Snowflake, a cloud data analytics platform, adds its capabilities and optimizations to Azure Data Lake Storage, such as improved performance or easier integration with SQL. Compared to other similar services, Azure Data Lake Storage remains very competitive.
For how long have I used the solution?
I have been using Azure Data Lake Storage for over a year.
What do I think about the stability of the solution?
Azure is a stable platform. Interruptions are relatively rare and usually last only a few minutes. It is good for data-oriented applications that don’t require continuous online processing.
These brief outages do not significantly impact the quality of service. We haven’t experienced major stability issues with Azure Storage.
What do I think about the scalability of the solution?
It is scalable.
How are customer service and support?
Any issues are handled by the team responsible for managing the platform.
Which solution did I use previously and why did I switch?
We primarily use Azure Synapse, which integrates with Azure Data Lake Storage. Synapse leverages the storage provided by Data Lake Storage, so both are part of the Azure ecosystem but remain distinct services.
Another integration involves SQL Server, which serves data to various consumers as an SQL database. The main consumer is Power BI, which provides extensive reporting capabilities. Additionally, Azure Functions integrates with internal systems at the client’s end.
What's my experience with pricing, setup cost, and licensing?
It is a cost-effective solution.
What other advice do I have?
Using a cloud platform generally allows for flexible capacity management, meaning you can use and pay for resources only when needed. This is particularly useful for our customers, who can run Spark clusters in serverless mode. They only pay for the time they use the service, which is cost-effective since they don’t need constant access to high power and typically run jobs for shorter periods, like half an hour.
It is available continuously and supports data archiving. However, since the current volume of data is not large, the client doesn’t need to focus on archiving or optimization. As their data grows and becomes more historical, they may need to optimize storage and archiving practices.
The other team manages the integration tasks. The process is straightforward as long as the systems, functions, or other components interact with external systems. The ease of integration can depend on the intensity of the integration requirements.
Overall, I rate the solution an eight out of ten.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Aug 28, 2024
Product Manager at AfroUrembo
Simple to configure and set up
Pros and Cons
- "The solution's most valuable feature is its simplicity of configuration and setup."
- "Simple migrations from one data lake to another take too much time and could be improved."
What is our primary use case?
We use the solution for the default data platform situation.
What is most valuable?
The solution's most valuable feature is its simplicity of configuration and setup. It also has good scalability, allowing you to add more data.
What needs improvement?
Simple migrations from one data lake to another take too much time and could be improved.
For how long have I used the solution?
I have been using Azure Data Lake Storage for a couple of months.
What do I think about the stability of the solution?
The solution’s stability is very good.
I rate the solution’s stability a nine out of ten.
What do I think about the scalability of the solution?
Our clients for Azure Data Lake Storage are usually enterprise businesses.
I rate the solution’s scalability a nine out of ten.
How was the initial setup?
The solution’s deployment takes a few weeks.
On a scale from one to ten, where one is difficult and ten is easy, I rate the solution's initial setup an eight out of ten.
What's my experience with pricing, setup cost, and licensing?
On a scale from one to ten, where one is cheap and ten is expensive, I rate the solution's pricing a seven out of ten.
What other advice do I have?
Azure Data Lake Storage has slightly impacted the speed of data access. The solution's integration capability is very good, and I rate it an eight out of ten. All AI runs on data, which has to be stored somewhere, and it is usually stored in an environment like this. I would recommend the solution to other users because it offers good value for the price.
Overall, I rate the solution an eight out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
Last updated: Aug 28, 2024
Senior Solutions Architect at Think Power Solutions
Offers effective data management with scalable cloud storage but needs more AI features
Pros and Cons
- "The most valuable feature is the scalability due to the blob, which allows us to scale our data effectively."
- "It would be beneficial if some AI features were added."
What is our primary use case?
We are using the solution for call monitoring and connecting Data Lakes. We have different data at various locations, both structured and unstructured, which we use for analytics.
What is most valuable?
The most valuable feature is the scalability due to the blob, which allows us to scale our data effectively. Data is stored in Blob Storage and the Data Lake, with connectivity maintained through Data Factory. We can write scripts in Data Factory to handle any type of data and store it in the cloud. This has saved us time.
What needs improvement?
We have not explored the AI features. It would be beneficial if some AI features were added.
For how long have I used the solution?
We have been using the solution for a while now.
What do I think about the stability of the solution?
We have not encountered any stability issues so far.
What do I think about the scalability of the solution?
The solution is scalable, which allows us to manage our data effectively. Due to the blob, we can scale our data.
How are customer service and support?
We handle problems internally and do not need to contact customer service.
Which solution did I use previously and why did I switch?
We created our solution independently and did not use others.
How was the initial setup?
The setup was straightforward, and we were able to deploy our solution on Azure without any problems.
What about the implementation team?
We implemented the solution ourselves.
What other advice do I have?
Exploring more AI features might enhance the solution.
I'd rate the solution seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Nov 26, 2024
Enterprise Architect at a non-profit with 501-1,000 employees
Able to partition data into various datasets using a directory hierarchy
Pros and Cons
- "The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily."
- "One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity."
What is most valuable?
The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily.
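A directory hierarchy of this kind can be sketched as a small path-building helper. This is only an illustration, not the reviewer's actual layout: the zone names (`raw`, `curated`), the `year=/month=/day=` folder convention, the `donations` dataset name, and both function names are assumptions made for the example.

```python
from datetime import date
from pathlib import PurePosixPath


def partition_path(zone: str, dataset: str, day: date, filename: str) -> str:
    """Build a lake path: zone/dataset/year=YYYY/month=MM/day=DD/filename.

    The zone names and the key=value folder convention are illustrative
    assumptions; the review does not specify a layout.
    """
    return str(
        PurePosixPath(zone)
        / dataset
        / f"year={day.year:04d}"
        / f"month={day.month:02d}"
        / f"day={day.day:02d}"
        / filename
    )


def curated_path_for(raw_path: str) -> str:
    """Map a raw file path to the matching Parquet path in the curated zone."""
    p = PurePosixPath(raw_path)
    # Swap the first path component (the zone) and the file extension.
    return str(PurePosixPath("curated", *p.parts[1:]).with_suffix(".parquet"))


# Hypothetical dataset name and date, used only to show the resulting paths.
raw = partition_path("raw", "donations", date(2024, 9, 2), "export.csv")
print(raw)
print(curated_path_for(raw))
```

Keeping the raw and converted files under parallel folder trees makes the two-step process easy to trace: each Parquet file has an obvious raw source.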
What needs improvement?
One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity.
For how long have I used the solution?
I have been using the product for a year.
What do I think about the stability of the solution?
Stability is good if you build your Azure Data Lake Storage well in the first place.
What do I think about the scalability of the solution?
Scalability depends on process complexity: it is high for simple processes and low for complex ones. This is due to the architecture of a data lake, but once converted to a data lakehouse, scalability is high across the board. I think Azure Data Lake Storage would suit medium to large enterprises.
How are customer service and support?
Microsoft's documentation is superb, and support is good, especially if you have a relevant intermediate supplier.
Which solution did I use previously and why did I switch?
We haven't compared Azure Data Lake Storage with products from other vendors because we're an Azure shop. We did check that the Azure product was good enough for our needs, and it was, so we didn't explore alternatives like AWS, Google, or Snowflake.
How was the initial setup?
The initial setup is fairly complex, but if you get your data architecture right from the start, it's not a problem. We're using a totally cloud-based deployment with Azure.
What other advice do I have?
Integration capabilities are fairly smooth and comparable to AWS in terms of cloud integration. Some might say it's slightly better, others slightly worse, but I think it's good. I'd rate Azure Data Lake Storage an eight out of ten. However, it's important to note that it's only eventually consistent, so don't expect immediate consistency when changes are made. It works well as a data storage bucket for future use, but it's unsuitable for transactional work. You need to use a data lakehouse like Databricks for transactional processes, which can handle transactional work once the data is in the correct format (like Parquet). The tool is great for storing data you want to put into a data lakehouse, but not for frequent transactions. It's suitable for daily archiving, but anything more frequent than that might cause issues.
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Sep 2, 2024
Data Architect /Data Engineer at Regional Council
Efficient data integration enabling modern data platforms
Pros and Cons
- "Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks."
- "Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format."
What is our primary use case?
The main use cases are for people who don't want their data to be siloed. They want to integrate it into one place, which is a data lake, where they can have data coming from different sources and formats. They want to be able to report on the data from one place, have different personnel work in the same place, and utilize the modern data platform.
What is most valuable?
Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks. Most recently, Microsoft Fabric has been the main tool I have recommended.
What needs improvement?
Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format. Additionally, improvements in supporting various formats would be beneficial.
For how long have I used the solution?
I've been working with Azure Data Lake Storage for about three years now.
What do I think about the stability of the solution?
I would rate stability between nine and ten. It is very stable, however, it depends on settings like geographical redundancy. It's also dependent on expertise and the company's willingness to pay for specific features.
What do I think about the scalability of the solution?
It is very scalable, especially the Gen 2 version. Azure Data Lake Gen 2 is very flexible and uses hierarchical file structures. The newest version, Gen 3, also supports a lake house approach and integrates with other cloud storage like Amazon S3, Google Storage, and Snowflake.
How are customer service and support?
Community support typically responds within one to two weeks. Organization-paid technical support gets an almost immediate response. Overall, I would rate their service an eight.
How would you rate customer service and support?
Positive
How was the initial setup?
Microsoft Fabric is easier to use than Databricks. Fabric is software as a service (SaaS), while Databricks is a platform as a service (PaaS), which makes Fabric easier for beginners. Setting up Azure Data Lake Storage can be quick if done manually, but using code ensures scalability.
What's my experience with pricing, setup cost, and licensing?
It's very cheap to store many terabytes of data. It costs just a few dollars per terabyte per month. Computational costs vary with usage and are generally more expensive than storage.
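The split between storage and compute costs can be made concrete with a rough estimate. The sketch below uses placeholder rates chosen only to make the arithmetic visible; they are not actual Azure prices, and the function name `monthly_cost` is hypothetical.

```python
def monthly_cost(
    stored_tb: float,
    storage_rate_per_tb: float,
    compute_hours: float,
    compute_rate_per_hour: float,
) -> dict:
    """Return a rough monthly estimate split into storage and compute."""
    storage = stored_tb * storage_rate_per_tb
    compute = compute_hours * compute_rate_per_hour
    return {"storage": storage, "compute": compute, "total": storage + compute}


# Placeholder figures (assumptions, not real prices):
# 10 TB stored at 5.0 per TB, and 30 half-hour jobs at 4.0 per compute hour.
estimate = monthly_cost(
    stored_tb=10,
    storage_rate_per_tb=5.0,
    compute_hours=30 * 0.5,
    compute_rate_per_hour=4.0,
)
print(estimate)
```

Even with these made-up numbers, a modest amount of compute outweighs the storage bill, which matches the reviewer's point that compute is generally the larger cost.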
What other advice do I have?
For startups, I recommend using Microsoft Fabric as it integrates well with other tools and is cost-efficient. The pricing model is easy to understand. I'd rate the solution nine out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Last updated: Sep 18, 2024
Associate Software Engineer at Systech Solutions
Can store different types of data (structured, unstructured, and semi-structured) in one lake
Pros and Cons
- "The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design."
- "I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data."
What is our primary use case?
We use Azure Data Lake Storage for sources like HubSpot (CRM software) and Xero (invoice software). We call their APIs, get the data, and store it in the product. From there, we use it to get the responses and load them into Azure SQL DB.
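The flow described here (call an API, land the raw response in the lake, then load it into a SQL database) can be sketched with the standard library alone. Everything in this example is a stand-in: a temporary local directory plays the role of the lake, an in-memory SQLite database plays the role of Azure SQL DB, and the payload shape with a `results` list is a hypothetical response, not the real HubSpot or Xero format. A real pipeline would use an HTTP client and the Azure SDKs instead.

```python
import json
import sqlite3
import tempfile
from datetime import date
from pathlib import Path


def land_response(lake_root: Path, source: str, day: date, payload: dict) -> Path:
    """Write one raw API response as JSON under <root>/<source>/<YYYY-MM-DD>/."""
    folder = lake_root / source / day.isoformat()
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / "response.json"
    target.write_text(json.dumps(payload), encoding="utf-8")
    return target


def load_into_sql(conn: sqlite3.Connection, landed_file: Path) -> int:
    """Read a landed response and upsert its records; return the record count."""
    records = json.loads(landed_file.read_text(encoding="utf-8"))["results"]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts (id TEXT PRIMARY KEY, email TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO contacts (id, email) VALUES (:id, :email)",
        records,
    )
    conn.commit()
    return len(records)


# Hypothetical payload standing in for a CRM API response.
sample = {
    "results": [
        {"id": "1", "email": "a@example.com"},
        {"id": "2", "email": "b@example.com"},
    ]
}

with tempfile.TemporaryDirectory() as tmp:
    landed = land_response(Path(tmp), "crm", date(2024, 8, 19), sample)
    conn = sqlite3.connect(":memory:")
    loaded = load_into_sql(conn, landed)
    row_count = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
    conn.close()

print(loaded, row_count)
```

Landing the untouched response first means the load step can be rerun from the stored file without calling the source API again.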
What is most valuable?
The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design.
What needs improvement?
I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data.
For how long have I used the solution?
I have been using the product for a year.
What do I think about the stability of the solution?
The tool is a stable product; they've been improving it recently.
What do I think about the scalability of the solution?
About 400 to 500 people (40 to 50 percent of employees) use Azure Data Lake Storage for ETL or development. It's scalable - we can add or remove users and change permissions within minutes.
How are customer service and support?
I haven't talked directly to Microsoft support, but I use blogs and forums to find answers.
How was the initial setup?
Setting up and deploying Azure Data Lake Storage is easy. We use Azure DevOps to connect and deploy all our services.
What's my experience with pricing, setup cost, and licensing?
The tool is cheap, depending on the services and requirements you need.
What other advice do I have?
I use Azure Data Lake Storage in a cloud-only setup, not on-premises. We receive API calls and store the responses in the product. Then, we process these files using the tool.
For first-time users, I recommend learning from Microsoft materials or YouTube videos before using the tool. It is better to gain some knowledge before using it. It's easy for beginners to learn and use, especially compared to AWS and other services.
I'd rate Azure Data Lake Storage eight out of ten. I find it user-friendly as a fresher with about 2.8 years in my tech career. I started my career with this tool, gained much knowledge, and now I can lead a team.
Disclosure: My company has a business relationship with this vendor other than being a customer: customer/partner
Last updated: Aug 19, 2024