NarendraBelkar - PeerSpot reviewer
Manager, Project at a consultancy with 1,001-5,000 employees
Real User
Top 5
Enables operational reporting and machine learning implementations and helps users to create reusable frameworks
Pros and Cons
  • "We can implement one framework and utilize it for multiple use cases."
  • "Microsoft Fabric costs more since we are charged for all the components."

What is our primary use case?

We are using Azure Data Lake Storage as a secondary storage solution. We use it for operational reporting and machine learning implementations. We also use the security and encryption features.

What is most valuable?

We can use the solution to implement a data warehouse. I am working on machine learning use cases. I have implemented a reusable framework. We can implement one framework and utilize it for multiple use cases.

What needs improvement?

Microsoft Fabric costs more since we are charged for all the components.

What do I think about the scalability of the solution?

I rate the tool’s scalability an eight out of ten. There is a lot of customizable code. I have created multiple frameworks. We have more than 300 users. After we start with our ML use cases, there will be more users.

Buyer's Guide
Azure Data Lake Storage
January 2025
Learn what your peers think about Azure Data Lake Storage. Get advice and tips from experienced pros sharing their opinions. Updated: January 2025.
831,265 professionals have used our research since 2012.

How are customer service and support?

The support team provides quick support. It provides quick solutions when we create tickets.

How would you rate customer service and support?

Positive

How was the initial setup?

I rate the ease of setup a nine out of ten. It is pretty straightforward. We need three people to deploy the product. There are two developers on our team. Everyone on our team has more than 12 years of experience.

What's my experience with pricing, setup cost, and licensing?

The pricing is dependent on what we use and how we use it. In my previous project, we paid a lot of money for the tool. Now, we only use the product for storage purposes. The price will increase if the machine learning use cases are productionised.

The product is bundled with Microsoft. The vendor charges us based on services. Recently, they introduced Microsoft Fabric, which includes all components. It costs more because we are charged for all the services available. We will not choose it for the initial stages of implementation.

Which other solutions did I evaluate?

Whenever we provide solutions to our customers, we often compare the product to Amazon Web Services. I prefer Microsoft for mid-sized companies.

What other advice do I have?

I have been using Azure Synapse for the last seven years. In my current organization, we are exploring more options because SAP does not have machine-learning capabilities. We are planning to use Microsoft Purview to share data.

We don't have any maintenance needs. We do a little bit of monitoring. We have automated everything. Maintenance is mostly done by Microsoft. I started my career in Microsoft. I gradually moved from Microsoft Stack on-premises to the cloud. Overall, I rate the product a nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Mukesh Kumar - PeerSpot reviewer
Senior Software Engineer at a tech consulting company with 10,001+ employees
Real User
Top 5
Easily integrates with a company's current workflow
Pros and Cons
  • "The response time and quality offered by the support team are good."
  • "The high price of the product is an area of concern where improvements are required."

What is our primary use case?

I use the solution in my company as per our project requirements. We only move data from an on-premises RDBMS into Azure Data Lake Storage Gen2, where the files are stored in Parquet format. After that, another data engineering team in my company reads the data for further processing.

What is most valuable?

For writing data to the data lake, my company uses Oracle GoldenGate for Big Data. With Oracle GoldenGate for Big Data, my company had to use Handler (Java Platform SE 8), but now we use the HDFS Handler. Using it, we have to configure some files and open some ports between a bank's private network and Azure Data Lake Storage Gen2. Once all of that is in place, my company is able to push the data to Azure Data Lake Storage Gen2.

What needs improvement?

In my company, we are not facing any slowness or other kinds of issues with the product. Each day in my company, we create new directories and put the current files into them, so there is the segregation part that is taken care of, and because of this, there are no issues with the tool.
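
The daily segregation described above can be sketched as a small path builder. This is only an illustration: the `root/table/yyyy/mm/dd` layout and the names `daily_directory` and `daily_file_path` are assumptions, not the reviewer's actual convention.

```python
from datetime import date
from pathlib import PurePosixPath


def daily_directory(root: str, table: str, day: date) -> PurePosixPath:
    """Return the directory that holds one day's files for one source table.

    The year/month/day nesting is a hypothetical layout chosen for the example.
    """
    return PurePosixPath(root) / table / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}"


def daily_file_path(root: str, table: str, day: date, file_name: str) -> PurePosixPath:
    """Return the full path of a file inside that day's directory."""
    return daily_directory(root, table, day) / file_name


if __name__ == "__main__":
    # Hypothetical table and file names, for illustration only.
    print(daily_file_path("raw", "accounts", date(2025, 1, 7), "part-0001.parquet"))
```

Because each day gets its own directory, a downstream reader can list or load a single day without scanning older files.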

In our company, one of the teams uses Azure Databricks to read data from the Azure Data Lake Storage account and, as per the business use case, moves or processes the data further. The project I am currently working on has only a limited scope, so I haven't explored everything associated with the tool.

The high price of the product is an area of concern where improvements are required.

For how long have I used the solution?

I have been using Azure Data Lake Storage for a year.

What do I think about the stability of the solution?

The product's stability is good.

What do I think about the scalability of the solution?

The scalability part of the product is very good, and my company has not faced any issues with it.

At present, 15 to 16 percent of the company uses the tool, but it will increase by a percent in the future.

How are customer service and support?

The response time and quality offered by the support team are good. I rate the technical support as nine out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The product's initial setup phase is not too simple, and it can be described as a moderate process.

The solution is deployed on the cloud.

What's my experience with pricing, setup cost, and licensing?

I rate the product price as two or three, where one is high, and ten is low. The product's price is really high.

What other advice do I have?

Integrating Azure Data Lake Storage into my company's current workflow was easy.

I recommend the product to those who plan to use it.

I rate the tool an eight to nine out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Manish Purohit - PeerSpot reviewer
Sr. Cloud Solution Architect at Green Point Technology Services (I) Pvt. Ltd
Real User
Top 5 Leaderboard
Hierarchical storage structure to store pre-generated static data in JSON format
Pros and Cons
  • "I also like its speed. It's basically built on Azure Blob Storage, which was already fast. But Azure Data Lake Storage adds the hierarchical structure for even better performance."
  • "Pricing is always a factor. It could be more affordable."

What is our primary use case?

We leverage its hierarchical storage structure to store pre-generated static data in JSON format. This improves performance significantly. Instead of pulling data from on-premises sources, we pull the pre-processed JSON files from Azure Data Lake Storage.

Our primary use case is storing processed data for direct access by end users. It's a good solution for that.

How has it helped my organization?

We have both an education product and several financial products. With our education product, we store pre-generated JSON files in Azure Data Lake Storage, representing different student tests in a hierarchical structure. This allows us to serve pre-made tests to thousands of students without hitting performance bottlenecks.
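
A minimal sketch of the pre-generated JSON pattern follows. It uses a local directory as a stand-in for the data lake, and the `tests/<subject>/grade-<nn>/<test_id>.json` layout and all function names are assumptions made for the example, not the reviewer's actual structure.

```python
import json
from pathlib import Path


def pregenerated_test_path(subject: str, grade: int, test_id: str) -> str:
    """Build the hierarchical relative path for one pre-generated test (hypothetical layout)."""
    return f"tests/{subject}/grade-{grade:02d}/{test_id}.json"


def publish_test(root: Path, subject: str, grade: int, test_id: str, test: dict) -> Path:
    """Write the finished test once, so later requests only read a static file."""
    target = root / pregenerated_test_path(subject, grade, test_id)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(test, sort_keys=True), encoding="utf-8")
    return target


def load_test(root: Path, subject: str, grade: int, test_id: str) -> dict:
    """Serve a test by reading the static file; no database query is involved."""
    source = root / pregenerated_test_path(subject, grade, test_id)
    return json.loads(source.read_text(encoding="utf-8"))
```

The point of the design is that the path alone identifies the file, so serving a test is a direct read rather than a query against the core database.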

Similarly, for our financial products, we store the final calculated output in Azure Data Lake Storage for use with Power BI Embedded. Users get their Power BI data directly from the data lake. We’ve offloaded a lot of load from our core Azure SQL Server by using Azure Data Lake Storage.

Overall, we primarily use Azure Data Lake Storage to serve data to end-users, not for complex calculations or analytics.

What is most valuable?

The security features are great, especially the ability to use SAS tokens.

I also like its speed. It's basically built on Azure Blob Storage, which was already fast. But Azure Data Lake Storage adds the hierarchical structure for even better performance.

That's the biggest benefit of Azure Data Lake Storage – the hierarchical namespace. That structure is what makes it truly suitable for data lake scenarios.

What needs improvement?

Pricing is always a factor. It could be more affordable.

For how long have I used the solution?

We've been using Azure Data Lake Storage for about two years now.

What do I think about the stability of the solution?

The stability is excellent. I'd rate the stability a ten out of ten. 

What do I think about the scalability of the solution?

It is a highly scalable solution. 

How are customer service and support?

We haven't needed to contact support. Everything has worked smoothly.

Which solution did I use previously and why did I switch?

We used to use Databricks, but our license expired.

How was the initial setup?

The initial setup is straightforward. 

It took less than an hour. The most time-consuming part is deciding your folder structure and how you want to organize data. 

The actual creation of the hierarchical structure and data storage is simple; it shouldn't take more than an hour or two.

What about the implementation team?

Azure Data Lake Storage is cloud-based. We handled the implementation in-house.

What was our ROI?

The performance of Azure Data Lake Storage has had a significant positive impact on the solution's cost management. 

By using Azure Data Lake Storage, we've been able to reduce our reliance on Azure SQL Database, which was running on higher tiers. This has led to cost savings.

We've seen benefits in both cost and performance.

What's my experience with pricing, setup cost, and licensing?

It's a pay-as-you-go model. Your charges are based on the amount of data you store.

There are no extra costs.

What other advice do I have?

It's great for improving the speed and accessibility of your static data.

For my use case, it is a good solution. So, I would rate it a ten out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Architect /Data Engineer at Regional Council
Real User
Efficient data integration enabling modern data platforms
Pros and Cons
  • "Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks."
  • "Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format."

What is our primary use case?

The main use cases are for people who don't want their data to be siloed. They want to integrate it into one place, which is a data lake, where they can have data coming from different sources and formats. They want to be able to report on the data from one place, have different personnel work in the same place, and utilize the modern data platform.

What is most valuable?

Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks. Most recently, Microsoft Fabric has been the main tool I have recommended.

What needs improvement?

Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format. Additionally, improvements in supporting various formats would be beneficial.

For how long have I used the solution?

I've been working with Azure Data Lake Storage for about three years now.

What do I think about the stability of the solution?

I would rate stability between nine and ten. It is very stable; however, it depends on settings like geographical redundancy. It's also dependent on expertise and the company's willingness to pay for specific features.

What do I think about the scalability of the solution?

It is very scalable, especially the Gen 2 version. Azure Data Lake Gen 2 is very flexible and uses hierarchical file structures. The newest version, Gen 3, also supports a lake house approach and integrates with other cloud storage like Amazon S3, Google Storage, and Snowflake.

How are customer service and support?

In terms of community support, responses take between one and two weeks. Organization-paid technical support gets an almost immediate response. Overall, I rate their service an eight.

How would you rate customer service and support?

Positive

How was the initial setup?

Microsoft Fabric is easier to use than Databricks. Fabric is software as a service (SaaS), while Databricks is a platform as a service (PaaS), making Fabric easier for beginners. Setting up Azure Data Lake Storage can be quick if done manually, but using code ensures scalability.

What's my experience with pricing, setup cost, and licensing?

It's very cheap to store large terabytes of data. It costs just a few dollars per terabyte per month. Computational costs could vary based on usage and are generally more expensive than storage.
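
The split between storage and compute costs can be sketched as simple arithmetic. The function name and every rate passed to it are hypothetical placeholders, not Azure's published prices; real rates depend on region, tier, and redundancy.

```python
def estimate_monthly_cost(
    stored_tb: float,
    storage_price_per_tb: float,
    compute_hours: float,
    compute_price_per_hour: float,
) -> dict:
    """Estimate one month's bill as storage plus compute.

    All prices are caller-supplied assumptions; nothing here is an official rate.
    """
    inputs = (stored_tb, storage_price_per_tb, compute_hours, compute_price_per_hour)
    if any(value < 0 for value in inputs):
        raise ValueError("quantities and prices must be non-negative")
    storage = stored_tb * storage_price_per_tb
    compute = compute_hours * compute_price_per_hour
    return {"storage": storage, "compute": compute, "total": storage + compute}


if __name__ == "__main__":
    # Placeholder numbers only: 50 TB stored, 20 compute hours in the month.
    print(estimate_monthly_cost(50, 3.0, 20, 1.5))
```

Keeping the two terms separate makes it easy to see which one dominates as usage changes.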

What other advice do I have?

For startups, I recommend using Microsoft Fabric as it integrates well with other tools and is cost-efficient. The pricing model is easy to understand. I'd rate the solution nine out of ten.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Martijn Imrich - PeerSpot reviewer
Product Manager at AfroUrembo
Real User
Top 20
Simple to configure and set up
Pros and Cons
  • "The solution's most valuable feature is its simplicity of configuration and setup."
  • "Simple migrations from one data lake to another take too much time and could be improved."

What is our primary use case?

We use the solution for the default data platform situation.

What is most valuable?

The solution's most valuable feature is its simplicity of configuration and setup. It also has good scalability, allowing you to add more data.

What needs improvement?

Simple migrations from one data lake to another take too much time and could be improved.

For how long have I used the solution?

I have been using Azure Data Lake Storage for a couple of months.

What do I think about the stability of the solution?

The solution’s stability is very good.

I rate the solution’s stability a nine out of ten.

What do I think about the scalability of the solution?

Our clients for Azure Data Lake Storage are usually enterprise businesses.

I rate the solution’s scalability a nine out of ten.

How was the initial setup?

The solution’s deployment takes a few weeks.

On a scale from one to ten, where one is difficult and ten is easy, I rate the solution's initial setup an eight out of ten.

What's my experience with pricing, setup cost, and licensing?

On a scale from one to ten, where one is cheap and ten is expensive, I rate the solution's pricing a seven out of ten.

What other advice do I have?

Azure Data Lake Storage has slightly impacted the speed of data access. The solution's integration capability is very good, and I rate it an eight out of ten. All AI runs on data, and it has to be stored. It is usually stored in such an environment. I would recommend the solution to other users because it offers good value for the price.

Overall, I rate the solution an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
Venkatesh Kollana - PeerSpot reviewer
Associate Software Engineer at Systech Solutions
Real User
Top 10
Can store different types of data (structured, unstructured, and semi-structured) in one lake
Pros and Cons
  • "The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design."
  • "I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data."

What is our primary use case?

We use Azure Data Lake Storage for sources like HubSpot (CRM software) and Xero (invoice software). We call their APIs, get the data, and store it in the product. From there, we use it to get the responses and load them into Azure SQL DB.
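
The land-then-load flow described above can be sketched with the standard library only. Everything here is a stand-in: a local folder plays the role of the data lake, `sqlite3` plays the role of Azure SQL DB, and the `results`, `id`, and `email` fields and the `contacts` table are hypothetical, not the real HubSpot or Xero response shape.

```python
import json
import sqlite3
from pathlib import Path


def land_response(lake_root: Path, source: str, run_id: str, response: dict) -> Path:
    """Store the raw API response unchanged, one file per source and run."""
    target = lake_root / source / f"{run_id}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(response), encoding="utf-8")
    return target


def load_landed_file(conn: sqlite3.Connection, landed: Path) -> int:
    """Read a landed file and upsert its records into a SQL table."""
    records = json.loads(landed.read_text(encoding="utf-8"))["results"]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts (id TEXT PRIMARY KEY, email TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO contacts (id, email) VALUES (?, ?)",
        [(record["id"], record.get("email")) for record in records],
    )
    conn.commit()
    return len(records)
```

Landing the raw response first means a failed or changed load step can be rerun from the stored file without calling the source API again.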

What is most valuable?

The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design.

What needs improvement?

I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data.

For how long have I used the solution?

I have been using the product for a year. 

What do I think about the stability of the solution?

The tool is a stable product; they've been improving it recently.

What do I think about the scalability of the solution?

About 400 to 500 people (40 to 50 percent of employees) use Azure Data Lake Storage for ETL or development. It's scalable - we can add or remove users and change permissions within minutes.

How are customer service and support?

I haven't talked directly to Microsoft support, but I use blogs and forums to find answers. 

How was the initial setup?

Setting up and deploying Azure Data Lake Storage is easy. We use Azure DevOps to connect and deploy all our services.

What's my experience with pricing, setup cost, and licensing?

The tool is cheap, depending on the services and requirements you need.

What other advice do I have?

I use Azure Data Lake Storage in a cloud-only setup, not on-premises. We receive API calls and store the responses in the product. Then, we process these files using the tool.

For first-time users, I recommend learning from Microsoft materials or YouTube videos before using the tool. It is better to gain some knowledge before using it. It's easy for beginners to learn and use, especially compared to AWS and other services.

I'd rate Azure Data Lake Storage eight out of ten. I find it user-friendly as a fresher with about two point eight years in my tech career. I started my career with this tool, gained much knowledge, and now I can lead a team.

Disclosure: My company has a business relationship with this vendor other than being a customer: customer/partner
MACIEJPOLAKOWSKI - PeerSpot reviewer
Senior Manager at IT Squad
Real User
A cost-effective solution to store data and allows flexible capacity management
Pros and Cons
    • "The version was a bit outdated compared to the newer Microsoft Data Fabric offerings."

    What is our primary use case?

    We use the solution for storing data but don’t use Synapse to store data directly in it. Instead, Azure Synapse Analytics is utilized to analyze and process data in Data Lake Storage. Data Lake Storage is a large, scalable solution that handles extensive volumes of structured and unstructured data rather than a direct disk storage system.

    What needs improvement?

    In Azure Data Lake Storage, the tool we're using, Spark, handles the management, storage, retrieval, and organization of data. Spark employs its algorithms to abstract the underlying complexities. We don’t work with a large amount of data. If we were to handle larger datasets, we would need to focus more on optimizing storage and retrieval processes, as the efficiency of these operations would become more critical.

    The version was a bit outdated compared to the newer Microsoft Data Fabric offerings. For instance, the directory services are already available in Data Fabric, so I don't think adding them to Azure Data Lake Storage would be necessary. For example, Snowflake, a cloud data analytics platform, adds its capabilities and optimizations to Azure Data Lake Storage, such as improved performance or easier integration with SQL. Compared to other similar services, Azure Data Lake Storage remains very competitive.

    For how long have I used the solution?

    I have been using Azure Data Lake Storage for over a year.

    What do I think about the stability of the solution?

    Azure is a stable platform. Interruptions are relatively rare and usually last only a few minutes. It is good for data-oriented applications that don’t require continuous online processing.

    These brief outages do not significantly impact the quality of service. We haven’t experienced major stability issues with Azure Storage. 

    What do I think about the scalability of the solution?

    It is scalable.

    How are customer service and support?

    Any issues are handled by the team responsible for managing the platform.

    Which solution did I use previously and why did I switch?

    We primarily use Azure Synapse, which integrates with Azure Data Lake Storage. Synapse leverages the storage provided by Data Lake Storage, so both are part of the Azure ecosystem but remain distinct services.

    Another integration involves SQL Server, which serves data to various consumers as an SQL database. The main consumer is Power BI, which provides extensive reporting capabilities. Additionally, Azure Functions integrates with internal systems at the client’s end.

    What's my experience with pricing, setup cost, and licensing?

    It is a cost-effective solution.

    What other advice do I have?

    Using a cloud platform generally allows for flexible capacity management, meaning you can use and pay for resources only when needed. This is particularly useful for our customers, who can run Spark clusters in serverless mode. They only pay for the time they use the service, which is cost-effective since they don’t need constant access to high power and typically run jobs for shorter periods, like half an hour.

    It is available continuously and supports data archiving. However, since the current volume of data is not large, the client doesn’t need to focus on archiving or optimization. As their data grows and becomes more historical, they may need to optimize storage and archiving practices.

    The other team manages the integration tasks. The process is straightforward as long as the systems, functions, or other components interact with external systems. The ease of integration can depend on the intensity of the integration requirements.

    Overall, I rate the solution an eight out of ten.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Data Architecture and Engineering Specialist at coprocenva
    User
    Top 5
    Manages large data volumes and has user-friendly automation
    Pros and Cons
    • "Azure Data Lake Storage is user-friendly and easy to use."
    • "Maybe the solution could be a bit more user-friendly."
    • "The scalability is limited. However, it's easy to set up."

    What is our primary use case?

    We use Azure Data Lake Storage for managing large data volumes in our big data projects.

    How has it helped my organization?

    I have configured the tool to automate the deletion of data and transfer data from one repository to another automatically.
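
An age-based cleanup rule of the kind described above can be sketched as follows. The review does not say how the automation was configured, so the retention window, the file names, and the function `split_by_retention` are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def split_by_retention(
    files: dict[str, datetime], now: datetime, retention_days: int
) -> tuple[list[str], list[str]]:
    """Split file paths into (expired, kept) by last-modified time.

    A file is expired when it was last modified before now minus the retention window.
    """
    cutoff = now - timedelta(days=retention_days)
    expired = sorted(path for path, modified in files.items() if modified < cutoff)
    kept = sorted(path for path, modified in files.items() if modified >= cutoff)
    return expired, kept


if __name__ == "__main__":
    now = datetime(2025, 1, 31, tzinfo=timezone.utc)
    # Hypothetical file names and timestamps.
    files = {
        "raw/2024/10/01/a.parquet": datetime(2024, 10, 1, tzinfo=timezone.utc),
        "raw/2025/01/30/b.parquet": datetime(2025, 1, 30, tzinfo=timezone.utc),
    }
    print(split_by_retention(files, now, retention_days=90))
```

The expired list could then feed a delete step or a transfer to another repository, depending on the rule being automated.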

    What is most valuable?

    Azure Data Lake Storage is user-friendly and easy to use. It effectively manages large data volumes and allows for automated configuration of data operations such as deletion and transfer between repositories.

    What needs improvement?

    Maybe the solution could be a bit more user-friendly.

    What do I think about the stability of the solution?

    It is very stable and reliable. It is a good solution that doesn't crash.

    What do I think about the scalability of the solution?

    The scalability is limited. However, it's easy to set up.

    How are customer service and support?

    The support from Microsoft for Azure products is good. It's timely.

    How would you rate customer service and support?

    Positive

    How was the initial setup?

    The initial setup is very easy with this tool.

    What's my experience with pricing, setup cost, and licensing?

    I am not familiar with the pricing.

    What other advice do I have?

    Overall, I would rate the Azure Data Lake Storage as nine out of ten.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.