Richard Mottershead - PeerSpot reviewer
Enterprise Architect at a non-profit with 501-1,000 employees
Real User
Top 5 Leaderboard
Able to partition data into various datasets using a directory hierarchy
Pros and Cons
  • "The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily."
  • "One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity."

What is most valuable?

The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily.
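
As a rough illustration of the folder-based partitioning and the raw-to-Parquet step described above, here is a minimal sketch. The zone names (`raw`, `curated`), the `year=/month=/day=` layout, and the helper names are assumptions made for the example, not the reviewer's actual structure, and the code only builds paths; it does not call any Azure or Databricks API.

```python
from datetime import date
from pathlib import PurePosixPath


def partition_path(zone: str, dataset: str, day: date, filename: str) -> str:
    """Build a hierarchical path: zone/dataset/year=YYYY/month=MM/day=DD/filename."""
    return str(
        PurePosixPath(zone)
        / dataset
        / f"year={day.year:04d}"
        / f"month={day.month:02d}"
        / f"day={day.day:02d}"
        / filename
    )


def to_curated(raw_path: str) -> str:
    """Map a raw-zone file to the Parquet file that would replace it in the curated zone."""
    parts = PurePosixPath(raw_path).parts
    if parts[0] != "raw":
        raise ValueError("expected a path under the raw zone")
    # Keep the same dataset/date folders, swap the zone and the file extension.
    return str(PurePosixPath("curated", *parts[1:]).with_suffix(".parquet"))


raw = partition_path("raw", "sales", date(2025, 1, 15), "orders.csv")
curated = to_curated(raw)
print(raw)
print(curated)
```

Keeping the same folder layout in both zones means a later conversion job can find its input and output locations from the path alone.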

What needs improvement?

One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity.

For how long have I used the solution?

I have been using the product for a year. 

What do I think about the stability of the solution?

Stability is good if you build your Azure Data Lake Storage well in the first place.

Buyer's Guide
Azure Data Lake Storage
January 2025
Learn what your peers think about Azure Data Lake Storage. Get advice and tips from experienced pros sharing their opinions. Updated: January 2025.
831,265 professionals have used our research since 2012.

What do I think about the scalability of the solution?

Scalability depends on process complexity: it is high for simple processes and low for complex ones. This is due to the architecture of a data lake, but once converted to a data lakehouse, scalability is high across the board. I think Azure Data Lake Storage would suit medium to large enterprises.

How are customer service and support?

Microsoft's documentation is superb, and support is good, especially if you have a relevant intermediate supplier.

Which solution did I use previously and why did I switch?

We haven't compared Azure Data Lake Storage with products from other vendors because we're an Azure shop. We did check that the Azure product was good enough for our needs, and it was, so we didn't explore alternatives like AWS, Google, or Snowflake.

How was the initial setup?

The initial setup is fairly complex, but if you get your data architecture right from the start, it's not a problem. We're using a totally cloud-based deployment with Azure.

What other advice do I have?

Integration capabilities are fairly smooth and comparable to AWS in terms of cloud integration. Some might say it's slightly better, others slightly worse, but I think it's good. I'd rate Azure Data Lake Storage an eight out of ten. However, it's important to note that it's only eventually consistent, so don't expect immediate consistency when changes are made. It works well as a data storage bucket for future use, but it's unsuitable for transactional work. You need to use a data lakehouse like Databricks for transactional processes, which can handle transactional work once the data is in the correct format (like Parquet). The tool is great for storing data you want to put into a data lakehouse, but not for frequent transactions. It's suitable for daily archiving, but anything more frequent than that might cause issues.

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2402067 - PeerSpot reviewer
Technical Manager at a consultancy with 1,001-5,000 employees
Real User
Offers high scalability for data storage and SLAs for stability at an affordable price
Pros and Cons
  • "Offers high scalability for data storage"
  • "Lack of clarity in migration processes"

What is our primary use case?

At our company, we strategize and design solutions that have an impact on our clients' businesses. Our company focuses mainly on technology development projects driven by business strategy. Azure Data Lake Storage is one of the tools that helps our company develop business solutions for our customers.

The solution is used primarily for storing Big Data and not for any analytics tasks. If the client's system is already on AWS, then we don't recommend Azure Data Lake Storage. 

What is most valuable?

Azure Data Lake Storage offers high scalability for data storage, whereas some storage systems in the market offer limited data capacity. Using the product, multiple documents can be read simultaneously. In products like Azure Data Lake Storage, storage scalability is a basic requirement to handle Big Data use cases. 

Almost any storage issue with the solution can be solved if you click on specific properties of the configuration.  

What needs improvement?

The Azure Data Lake Storage should not only be compatible with the Azure platforms but also with other vendor solutions. I have faced a lack of clarity in the migration process when Azure Synapse is being used with the solution to migrate data to Microsoft Fabric.

Traditionally, when our organization uses Azure Data Lake Storage, we also need to use Azure Synapse. But when data needs to be imported to Microsoft Fabric from the solution using Synapse, Microsoft does not provide a clear migration passage.

Thus, integrators face difficulty migrating data from Azure Data Lake Storage to other Microsoft products. Microsoft focuses on launching new products every few years but not on how customers will migrate to its latest product from other data storage systems. Once it launches a new solution, Microsoft stops developing the older products and provides only basic support for them.

For how long have I used the solution?

I am a user of Azure Data Lake Storage. 

What do I think about the stability of the solution?

Azure Data Lake Storage is one of the rare services with SLAs that show major stability. On the other hand, if I consider GPT and OpenAI services in Azure, half of the SLAs remain locked. 

What do I think about the scalability of the solution?

I would rate the scalability a ten out of ten. I haven't encountered any data use cases so large that Azure Data Lake Storage wasn't able to manage and scale. For instance, when it comes to storing global transactions of Visa, even for such big use cases, Azure Data Lake Storage will be able to handle the data volume. 

At our company, we work with Azure Data Lake Storage for top-level enterprise companies. 

How are customer service and support?

Instead of reaching out to customer support every time an issue occurs with the product, our organization members use the documentation to resolve the problem. The technical specialists of our company are capable of solving most of the issues on their own, leveraging the documentation around Azure Data Lake Storage. 

How was the initial setup?

The initial setup of the product was super easy. There is a very straightforward process that involves visiting the Azure portal, creating accounts, and configuring the needed ratio of Azure Data Lake Storage. The security configuration of the solution is a bit complex and has multiple technicalities compared to the rest of the setup process. 

The solution's setup process can probably be improved by embedding a feature that guides users through the selections necessary to achieve a secure configuration.

Most of the networking configurations are easily available, and to avail of advanced security options, a user needs to visit the advanced setup section. If Azure Data Lake Storage needs to be integrated with a pre-built network and there is a requirement for authenticated user access and the accessible medium, all such sections of security need to be better guided by the solution provider.  

There needs to be additional assistance provided by the solution during the setup process for a highly secure configuration, especially for individuals who are not networking experts.

What's my experience with pricing, setup cost, and licensing?

Azure Data Lake Storage is one of the most affordable products available from the vendor. When the storage capacity offered by the solution is compared with the computation, Azure Data Lake Storage turns out to be a more affordable option than other databases. 

What other advice do I have?

Azure Data Lake Storage can be used not only for Big Data but also for normal-sized data because it's more cost-effective than other database solutions. The solution offers almost the same features as Snowflake but at a lower price. The cost-effectiveness is the major reason why in our company, we use Azure Data Lake Storage instead of Snowflake and Synapse. 

In our organization, we use Azure Data Lake Storage in integration with Spark and Azure Databricks. When you are working with an extremely large volume of data, you should prefer Azure Data Lake Storage over other data warehouses; other solutions may process data and queries faster but wouldn't be able to manage the data size. I would overall rate Azure Data Lake Storage a ten out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Consultant SAP BODS at GyanSys Inc.
Real User
Top 10
Integrates well with Microsoft ecosystem but lacks robust SAP connectivity
Pros and Cons
  • "It's a cloud-based tool within the Microsoft ecosystem, offering many benefits for data handling."
  • "From an SAP perspective, direct connectivity to SAP systems is an area that could be enhanced."

What is our primary use case?

It is for data warehousing. We currently work on using Databricks and Hadoop for data warehousing – figuring out initial data migrations is our focus right now.

What is most valuable?

It's a cloud-based tool within the Microsoft ecosystem, offering many benefits for data handling.

Our focus is on standardization. We're currently analyzing how it could work with Databricks and haven't explored Azure Data Lake Storage for storage extensively.

We've been actively working with big data analytics for the past three years. Initially, we used Microsoft APS (Analytics Platform System).

What needs improvement?

From an SAP perspective, direct connectivity to SAP systems is an area that could be enhanced. Our landscape heavily relies on SAP, and we find solutions like Snowflake and Databricks more integrated. Azure Data Lake Storage could improve by providing stronger connectivity options for SAP databases.

For how long have I used the solution?

My company is in the initialization stage. We've only completed the initial setup phase.

So, we're in the analysis phase, working with a sandbox environment at the moment. It has been six months now. 

What do I think about the stability of the solution?

I would rate the stability an eight out of ten. 

What do I think about the scalability of the solution?

I would rate the scalability a six out of ten. We've primarily encountered issues during the initial migration from Hadoop to Azure Data Lake Storage.

There are more than 30 end users using it. It's more of a real-time kind of setup. We're focusing on continuous data replication.

How are customer service and support?

We had some troubles. 

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We used different solutions. We switched to Azure because of the increased data volume we needed to handle.

How was the initial setup?

I would rate the experience with the initial setup a five out of ten, with ten being easy to set up. 

It took us a few weeks to set up. We haven't started that integration process yet. We're currently in the sandbox testing phase.

What about the implementation team?

Our landscape administrator team handles initial setups. We have a separate infra team for the deployment processes. 

There are around ten members in the team. 

What's my experience with pricing, setup cost, and licensing?

It's quite expensive. Compared to other options we've explored, I would rate the pricing a seven out of ten, with ten being expensive. 

What other advice do I have?

Overall, I would rate the solution a seven out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Mohammad-Huda - PeerSpot reviewer
Data & Analytics Practitioner (BIDW, Big Data) at a tech vendor with 10,001+ employees
Real User
Enhanced data integration with cost-effective storage, though better documentation is needed
Pros and Cons
  • "Storage within Azure Data Lake is cheaper, which is one of the reasons we moved to it."
  • "The documentation could be more user-friendly with better tutorials."

What is our primary use case?

We are using Databricks along with some other tools to have an automated process. The data from different sources gets loaded into Data Lake. My use case for Data Lake Storage is as an integration for various sources of data that are processed and loaded into the lake for subsequent analysis.

How has it helped my organization?

Using Azure Data Lake Storage has provided us with a cost-effective solution for data storage, which allows us to manage large volumes of data efficiently.

What is most valuable?

Storage within Azure Data Lake is cheaper, which is one of the reasons we moved to it. Another valuable feature is the flexibility to scale storage up or down as needed.

What needs improvement?

The documentation could be more user-friendly with better tutorials. While the initial setup is not too complex, it requires understanding various options and their implications. Improving this can help users understand the configuration process better.

For how long have I used the solution?

I have been using this solution for one to two years.

What do I think about the stability of the solution?

I would rate the stability an eight out of ten. It is quite stable.

What do I think about the scalability of the solution?

From a scalability point of view, it is easy to scale. The flexibility to expand or reduce capacity according to requirements is well-handled.

How are customer service and support?

I haven't engaged much with the technical support. I cannot provide an accurate mark for it.

How would you rate customer service and support?

Neutral

How was the initial setup?

The initial setup process is not very complex but requires detailed knowledge of the various configuration options. One needs to understand the implications of each option, such as cost and performance, which can make the setup process somewhat challenging without adequate documentation support.

What's my experience with pricing, setup cost, and licensing?

The pricing is average, not too high and not too cheap.

What other advice do I have?

Depending on your use case, Azure Data Lake Storage can benefit your organization. It is suitable for medium to large scale companies.

I'd rate the solution seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Anupam Mishra - PeerSpot reviewer
Data Analyst at a tech vendor with 10,001+ employees
Real User
Top 20
Enhanced data management with hierarchical storage and great support
Pros and Cons
  • "The hierarchical structure allows us to create multiple hierarchies inside, such as storage containers, directories, and subdirectories."
  • "Version control would be a great improvement."

What is our primary use case?

We are storing external tables and data in Databricks, accessing those tables to read and write data using Azure Data Lake Storage. We use it for data quality purposes and store data to form external tables.

What is most valuable?

The hierarchical structure allows us to create multiple hierarchies inside, such as storage containers, directories, and subdirectories. It provides multiple edges and access control. We can define who can access which directory and restrict read and write operations.
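
To make the idea of per-directory access control concrete, here is a simplified toy model of a POSIX-style check: a principal needs execute (`x`) on every parent directory plus the requested permission on the target itself. The permission table, principal names, and paths are hypothetical, and real Azure Data Lake Storage evaluation also involves owners, groups, masks, and role assignments, so treat this as an illustration rather than the service's actual algorithm.

```python
from pathlib import PurePosixPath

# Hypothetical permission table: path -> {principal: "rwx"-style string}.
ACLS = {
    "/": {"analyst": "--x", "engineer": "--x"},
    "/quality": {"analyst": "r-x", "engineer": "rwx"},
    "/quality/tables": {"analyst": "r-x", "engineer": "rwx"},
    "/quality/tables/orders.parquet": {"analyst": "r--", "engineer": "rw-"},
    "/finance": {"engineer": "rwx"},
    "/finance/ledger.parquet": {"analyst": "r--", "engineer": "rw-"},
}


def has(path: str, principal: str, bit: str) -> bool:
    """True if the principal's entry on this exact path contains the permission bit."""
    return bit in ACLS.get(path, {}).get(principal, "---")


def can_access(path: str, principal: str, bit: str) -> bool:
    """Require traverse (x) on every ancestor directory and `bit` on the target."""
    ancestors = [str(p) for p in PurePosixPath(path).parents]
    return all(has(a, principal, "x") for a in ancestors) and has(path, principal, bit)


print(can_access("/quality/tables/orders.parquet", "analyst", "r"))
print(can_access("/quality/tables/orders.parquet", "analyst", "w"))
print(can_access("/finance/ledger.parquet", "analyst", "r"))
```

In this model the analyst cannot read the finance file even though the file itself grants read, because the `/finance` directory does not let the analyst traverse it.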

What needs improvement?

Version control would be a great improvement. Currently, there is no version control, and if something is deleted, it's permanently gone. The addition of a trash item would help in recovering data deleted by mistake.

For how long have I used the solution?

We have been using the solution for about three to four years.

What do I think about the stability of the solution?

There is no downtime, and everything is superior. The SLA is 99.99%.

What do I think about the scalability of the solution?

It's good, so I would rate it as eight or nine out of ten. It handles large amounts of data efficiently.

How are customer service and support?

I would rate technical support ten out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The setup is too complex since we handle a large amount of data.

What about the implementation team?

Our team handles everything. We don't have any consultants or third-party integrators.

Which other solutions did I evaluate?

We did evaluate AWS as it has S3 buckets.

What other advice do I have?

I'd rate the solution eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Independent consultant at a hospitality company with 1-10 employees
Real User
Top 20
Offers good storage layer and security
Pros and Cons
  • "The tool offers a big storage layer. The security aspect is quite good. In Azure, there's an option for soft deletes and policy management. This allows us to store only the most up-to-date data while everything else can be policy-managed. This makes handling and management easier."
  • "If I had to nitpick, maybe the throughput could be faster - how quickly you can access data and how fast data can be written onto the Azure Data Lake Storage."

What is our primary use case?

We use the tool for multiple processes. We use it as a storage layer for files coming in from relational systems and data from real-time streaming systems. We also use it as a staging area for data scientists to consume.

What is most valuable?

The tool offers a big storage layer. The security aspect is quite good. In Azure, there's an option for soft deletes and policy management. This allows us to store only the most up-to-date data while everything else can be policy-managed. This makes handling and management easier.
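
The policy management mentioned above can be pictured as a lifecycle rule that ages out older data. The sketch below builds such a rule as a Python dictionary and prints it as JSON. The field names follow the Azure Storage lifecycle management policy schema as I understand it and should be checked against the current documentation; the rule name, path prefix, and day thresholds are placeholders, not the reviewer's settings.

```python
import json


def lifecycle_rule(name: str, prefix: str, cool_after_days: int, delete_after_days: int) -> dict:
    """Build one rule: move matching blobs to the cool tier, then delete them later."""
    if cool_after_days >= delete_after_days:
        raise ValueError("cooling must happen before deletion")
    return {
        "enabled": True,
        "name": name,
        "type": "Lifecycle",
        "definition": {
            # Only block blobs under the given container/path prefix are affected.
            "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
            "actions": {
                "baseBlob": {
                    "tierToCool": {"daysAfterModificationGreaterThan": cool_after_days},
                    "delete": {"daysAfterModificationGreaterThan": delete_after_days},
                }
            },
        },
    }


# Placeholder values: a hypothetical staging area that is cooled after 30 days
# and removed after a year.
policy = {"rules": [lifecycle_rule("age-out-staging", "staging/streaming", 30, 365)]}
print(json.dumps(policy, indent=2))
```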

What needs improvement?

If I had to nitpick, maybe the throughput could be faster - how quickly you can access data and how fast data can be written onto the Azure Data Lake Storage.

For how long have I used the solution?

I have been working with the product for five years. 

What do I think about the stability of the solution?

I rate the tool's stability a nine out of ten. 

What do I think about the scalability of the solution?

The tool is scalable as long as you pay more. 

How are customer service and support?

Support depends on what agreements you have with Microsoft. Many consultants and companies outside Microsoft can also provide expertise in maintaining and managing the Azure environment, especially the Data Lake environment. It doesn't have to be Microsoft. But if you're raising tickets with Microsoft to fix issues, they're pretty reasonable.

How would you rate customer service and support?

Neutral

How was the initial setup?

The tool's deployment is simple. I've worked with Azure Data Lake Storage in different scenarios. It can be on-premises, in the cloud, or a hybrid solution—it depends on the design. I've worked with it in both on-premises and cloud-based scenarios. For the last few years, as everyone's been transitioning to the cloud, we've mainly worked with cloud-based solutions.

What's my experience with pricing, setup cost, and licensing?

Pricing is tricky because it depends on the solution you're building and the type of Data Lake storage you use—hot or cold.
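
A small worked example shows why the tier choice makes pricing hard to state in general. The rates below are made-up placeholders, not Azure prices; the only assumption carried over from real tiered storage is that the colder tier is cheaper to store in but charges for reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    storage_per_gb: float    # placeholder monthly rate, not a real price
    retrieval_per_gb: float  # placeholder per-GB read charge, not a real price


TIERS = {
    "hot": Tier(storage_per_gb=0.020, retrieval_per_gb=0.000),
    "cold": Tier(storage_per_gb=0.010, retrieval_per_gb=0.010),
}


def monthly_cost(tier: str, stored_gb: float, read_gb: float) -> float:
    """Storage charge plus read charge for one month in the given tier."""
    t = TIERS[tier]
    return stored_gb * t.storage_per_gb + read_gb * t.retrieval_per_gb


def cheaper_tier(stored_gb: float, read_gb: float) -> str:
    """Return the tier name with the lowest monthly cost for this workload."""
    return min(TIERS, key=lambda name: monthly_cost(name, stored_gb, read_gb))


# Same 1,000 GB stored; only the read volume differs.
print(cheaper_tier(1000, 100))
print(cheaper_tier(1000, 2000))
```

With these placeholder rates, rarely read data is cheaper in the cold tier, while data that is read heavily ends up cheaper in the hot tier, so the answer depends on the access pattern of the solution being built.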

What other advice do I have?

The tool can be used by small and large companies. It's not restricted by price, so it's not just for high-end companies. Especially with cloud options available now, any company can potentially use it. 

For competitors, from a cloud-based provider perspective, you have Amazon, Google, and other cloud providers. If you are building your custom solution, you can use traditional SAN drives on-premise for data lake storage, which becomes expensive. I'd say the main competitors of the cloud options are Microsoft, AWS, and Google. There are potentially other providers like Alibaba, but I haven't used them, so I can't provide more information about them.

I have experience integrating AI solutions with Azure Data Lake Storage and helped design some of them. AI solutions access data similarly to downstream systems like ETL tools. For cloud providers, the connections to AI tools are typically built into their products.

I rate the overall solution a nine out of ten. I definitely recommend Azure Data Lake Storage. I have recommended it for all the solutions I've designed and built for my clients. I would recommend it to anybody considering entering the data space or looking at building warehouses, AI solutions, etc.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
reviewer2405217 - PeerSpot reviewer
Strategy Consultant at a computer software company with 201-500 employees
Consultant
Top 20
Hierarchical namespace for data and offers security measures like RLS
Pros and Cons
  • "Microsoft is quite good when it comes to integration. They have multiple connectors available and different ingestion and integration processes available."
  • "Microsoft, in general, needs to simplify its licensing model. That's one of the biggest issues with Microsoft. The licensing model is either quite difficult to understand or is constantly evolving."

What is our primary use case?

It is typically used in my data analytics workflows. I use it as a data lake. 

What is most valuable?

The hierarchical namespace is okay. It's not the kind of product I would have any issue with.

RLS (Row-Level Security) and these kinds of things have been effective for protecting data. 

Microsoft is quite good when it comes to integration. They have multiple connectors available and different ingestion and integration processes available.

What needs improvement?

There is room for improvement in Microsoft support. I didn't have a good experience with it. 

Microsoft, in general, needs to simplify its licensing model. That's one of the biggest issues with Microsoft. The licensing model is either quite difficult to understand or is constantly evolving.

I like the move from Data Lake to Lakehouse. I think it's more up to Microsoft, regarding the trends of the market and what organizations need.

For how long have I used the solution?

I use it, but it's integrated into Microsoft Fabric. 

What do I think about the stability of the solution?

I don't have any performance issues. It's a reliable kind of product.

I would rate the stability an eight out of ten. 

How are customer service and support?

I don't use technical support much because I had a poor experience with it in the past, although not with these products specifically. I'm not using Microsoft support much currently.

So, I don't really need it anymore because I've benefited from colleagues and the community to help, and it helps a lot.

Which solution did I use previously and why did I switch?

I work with Microsoft products in general, but mostly Fabric. There is also Data Factory, Databricks, and Synapse. My company is a Microsoft partner only.

What's my experience with pricing, setup cost, and licensing?

Microsoft, in general, needs to simplify its licensing model. That's one of the biggest issues with Microsoft. The licensing model is either quite difficult to understand or is constantly evolving. I think there's a will to simplify it because one of the biggest client complaints is that the licensing model is always messy and evolving. It's quite difficult to understand.

However, I'm quite satisfied with the pricing. It's quite good.

What other advice do I have?

I would recommend it to other users. I would not recommend it to the smallest companies because you have an entry ticket while using this kind of tool. It's based on usage, but it's mostly beneficial for companies that have big architecture to build, that are looking for a big, complex architecture. It's quite relevant for that. 

If you, as a small company, would like to simplify and rationalize, you may have some packaged products that would not only provide storage or transformation, but a different kind of end-to-end experience that would be better.

Overall, I would rate the solution an eight out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner