MACIEJPOLAKOWSKI - PeerSpot reviewer
Senior Manager at IT Squad
Real User
A cost-effective solution to store data and allows flexible capacity management
Pros and Cons
    • "The version was a bit outdated compared to the newer Microsoft Data Fabric offerings."

    What is our primary use case?

    We use the solution for storing data but don’t use Synapse to store data directly in it. Instead, Azure Synapse Analytics is utilized to analyze and process data in Data Lake Storage. Data Lake Storage is a large, scalable solution that handles extensive volumes of structured and unstructured data rather than a direct disk storage system.

    What needs improvement?

    In Azure Data Lake Storage, the tool we're using, Spark, handles the management, storage, retrieval, and organization of data. Spark employs its algorithms to abstract the underlying complexities. We don’t work with a large amount of data. If we were to handle larger datasets, we would need to focus more on optimizing storage and retrieval processes, as the efficiency of these operations would become more critical.

    The version was a bit outdated compared to the newer Microsoft Data Fabric offerings. For instance, the directory services are already available in Data Fabric, so I don't think adding them to Azure Data Lake Storage would be necessary. For example, Snowflake, a cloud data analytics platform, adds its capabilities and optimizations to Azure Data Lake Storage, such as improved performance or easier integration with SQL. Compared to other similar services, Azure Data Lake Storage remains very competitive.

    For how long have I used the solution?

    I have been using Azure Data Lake Storage for over a year.

    What do I think about the stability of the solution?

    Azure is a stable platform. Interruptions are relatively rare and usually last only a few minutes. It is good for data-oriented applications that don’t require continuous online processing.

    These brief outages do not significantly impact the quality of service. We haven’t experienced major stability issues with Azure Storage. 

    Buyer's Guide
    Azure Data Lake Storage
    October 2024
    Learn what your peers think about Azure Data Lake Storage. Get advice and tips from experienced pros sharing their opinions. Updated: October 2024.
    813,161 professionals have used our research since 2012.

    What do I think about the scalability of the solution?

    It is scalable.

    How are customer service and support?

    Any issues are handled by the team responsible for managing the platform.

    Which solution did I use previously and why did I switch?

    We primarily use Azure Synapse, which integrates with Azure Data Lake Storage. Synapse leverages the storage provided by Data Lake Storage, so both are part of the Azure ecosystem but remain distinct services.

    Another integration involves SQL Server, which serves data to various consumers as an SQL database. The main consumer is Power BI, which provides extensive reporting capabilities. Additionally, Azure Functions integrates with internal systems at the client’s end.

    What's my experience with pricing, setup cost, and licensing?

    It is a cost-effective solution.

    What other advice do I have?

    Using a cloud platform generally allows for flexible capacity management, meaning you can use and pay for resources only when needed. This is particularly useful for our customers, who can run Spark clusters in serverless mode. They only pay for the time they use the service, which is cost-effective since they don’t need constant access to high power and typically run jobs for shorter periods, like half an hour.
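    The pay-only-while-running point above can be made concrete with a little arithmetic. The sketch below is illustrative only: the hourly rate and the job schedule are hypothetical assumptions, not figures from this review or from Azure's price list. Only the half-hour job length comes from the text.

```python
# Hypothetical figures: neither the rate nor the schedule comes from the
# review or from Azure pricing. Only the 30-minute job length is from the text.
HOURLY_RATE_USD = 4.0   # assumed cost of one cluster-hour
JOB_MINUTES = 30        # "half an hour", as in the review
JOBS_PER_DAY = 2        # assumed schedule
DAYS_PER_MONTH = 30


def pay_per_use_cost(rate_per_hour, job_minutes, jobs_per_day, days):
    """Cost when the cluster is billed only while jobs run."""
    hours_used = (job_minutes / 60) * jobs_per_day * days
    return rate_per_hour * hours_used


def always_on_cost(rate_per_hour, days):
    """Cost when the same cluster runs around the clock."""
    return rate_per_hour * 24 * days


serverless = pay_per_use_cost(HOURLY_RATE_USD, JOB_MINUTES, JOBS_PER_DAY, DAYS_PER_MONTH)
dedicated = always_on_cost(HOURLY_RATE_USD, DAYS_PER_MONTH)
print(f"pay-per-use: ${serverless:.2f}/month")
print(f"always-on:   ${dedicated:.2f}/month")
```

    Under these assumed numbers the gap comes entirely from the hours billed: 30 cluster-hours a month versus 720.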

    It is available continuously and supports data archiving. However, since the current volume of data is not large, the client doesn’t need to focus on archiving or optimization. As their data grows and becomes more historical, they may need to optimize storage and archiving practices.

    The other team manages the integration tasks. The process is straightforward as long as the systems, functions, or other components interact with external systems. The ease of integration can depend on the intensity of the integration requirements.

    Overall, I rate the solution an eight out of ten.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Martijn Imrich - PeerSpot reviewer
    Product Manager at AfroUrembo
    Real User
    Top 20
    Simple to configure and set up
    Pros and Cons
    • "The solution's most valuable feature is its simplicity of configuration and setup."
    • "Simple migrations from one data lake to another take too much time and could be improved."

    What is our primary use case?

    We use the solution for the default data platform situation.

    What is most valuable?

    The solution's most valuable feature is its simplicity of configuration and setup. It also has good scalability, allowing you to add more data.

    What needs improvement?

    Simple migrations from one data lake to another take too much time and could be improved.

    For how long have I used the solution?

    I have been using Azure Data Lake Storage for a couple of months.

    What do I think about the stability of the solution?

    The solution’s stability is very good.

    I rate the solution’s stability a nine out of ten.

    What do I think about the scalability of the solution?

    Our clients for Azure Data Lake Storage are usually enterprise businesses.

    I rate the solution’s scalability a nine out of ten.

    How was the initial setup?

    The solution’s deployment takes a few weeks.

    On a scale from one to ten, where one is difficult and ten is easy, I rate the solution's initial setup an eight out of ten.

    What's my experience with pricing, setup cost, and licensing?

    On a scale from one to ten, where one is cheap and ten is expensive, I rate the solution's pricing a seven out of ten.

    What other advice do I have?

    Azure Data Lake Storage has slightly impacted the speed of data access. The solution's integration capability is very good, and I rate it an eight out of ten. All AI runs on data, and it has to be stored. It is usually stored in such an environment. I would recommend the solution to other users because it offers good value for the price.

    Overall, I rate the solution an eight out of ten.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Microsoft Azure
    Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
    Richard Mottershead - PeerSpot reviewer
    Enterprise Architect at a non-profit with 501-1,000 employees
    Real User
    Top 5 Leaderboard
    Able to partition data into various datasets using a directory hierarchy
    Pros and Cons
    • "The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily."
    • "One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity."

    What is most valuable?

    The most valuable feature of Azure Data Lake Storage is the ability to partition data into various datasets using a directory hierarchy. This folder structure is key for any delivery. Currently, we're not doing much with the data in the tool, but when Databricks comes along, we'll convert it to Parquet format. It's a two-step process: raw data is moved to Parquet, which Databricks can manipulate easily.
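    As a rough illustration of the directory-hierarchy partitioning described above, the sketch below builds folder paths for a raw zone and a matching Parquet zone. The zone names, the dataset name, and the `year=/month=/day=` layout are assumptions chosen for the example, not details from the review, and the code only constructs paths; it does not perform the actual conversion to Parquet.

```python
from datetime import date
from pathlib import PurePosixPath


def partition_path(zone, dataset, day, filename):
    """Build a folder path such as raw/sales/year=2024/month=10/day=05/orders.csv."""
    return PurePosixPath(
        zone,
        dataset,
        f"year={day.year:04d}",
        f"month={day.month:02d}",
        f"day={day.day:02d}",
        filename,
    )


# Step one: where a raw file lands (all names here are hypothetical).
raw = partition_path("raw", "sales", date(2024, 10, 5), "orders.csv")

# Step two: the same partition in a separate zone, with a Parquet file name.
curated = PurePosixPath("parquet", *raw.parts[1:]).with_suffix(".parquet")

print(raw)
print(curated)
```

    Keeping the partition folders identical across both zones makes it easy to trace a Parquet file back to the raw file it came from.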

    What needs improvement?

    One improvement I'd suggest is the out-of-the-box conversion of input data, like spreadsheet or table data, to various formats. We'll be using Parquet, which enables transactional integrity.

    For how long have I used the solution?

    I have been using the product for a year. 

    What do I think about the stability of the solution?

    Stability is good if you build your Azure Data Lake Storage well in the first place.

    What do I think about the scalability of the solution?

    Scalability depends on process complexity: it is high for simple processes and low for complex ones. This is due to the architecture of a data lake, but once converted to a data lakehouse, scalability is high across the board. I think Azure Data Lake Storage would suit medium to large enterprises.

    How are customer service and support?

    Microsoft's documentation is superb, and support is good, especially if you have a relevant intermediate supplier.

    Which solution did I use previously and why did I switch?

    We haven't compared Azure Data Lake Storage with products from other vendors because we're an Azure shop. We did check that the Azure product was good enough for our needs, and it was, so we didn't explore alternatives like AWS, Google, or Snowflake.

    How was the initial setup?

    The initial setup is fairly complex, but if you get your data architecture right from the start, it's not a problem. We're using a totally cloud-based deployment with Azure.

    What other advice do I have?

    Integration capabilities are fairly smooth and comparable to AWS in terms of cloud integration. Some might say it's slightly better, others slightly worse, but I think it's good. I'd rate Azure Data Lake Storage an eight out of ten. However, it's important to note that it's only eventually consistent, so don't expect immediate consistency when changes are made. It works well as a data storage bucket for future use, but it's unsuitable for transactional work. You need to use a data lakehouse like Databricks for transactional processes, which can handle transactional work once the data is in the correct format (like Parquet). The tool is great for storing data you want to put into a data lakehouse, but not for frequent transactions. It's suitable for daily archiving, but anything more frequent than that might cause issues.

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Microsoft Azure
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    Data Architect / Data Engineer at Regional Council
    Real User
    Efficient data integration enabling modern data platforms
    Pros and Cons
    • "Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks."
    • "Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format."

    What is our primary use case?

    The main use cases are for people who don't want their data to be siloed. They want to integrate it into one place, which is a data lake, where they can have data coming from different sources and formats. They want to be able to report on the data from one place, have different personnel work in the same place, and utilize the modern data platform.

    What is most valuable?

    Azure Data Lake itself doesn't have built-in features for handling data on its own. It's used as a collection of data. Tools you can use with it include Azure Data Factory and Databricks. Most recently, Microsoft Fabric has been the main tool I have recommended.

    What needs improvement?

    Azure Data Lake Storage should support other formats apart from the Data Lake format, such as the Iceberg format. Additionally, improvements in supporting various formats would be beneficial.

    For how long have I used the solution?

    I've been working with Azure Data Lake Storage for about three years now.

    What do I think about the stability of the solution?

    I would rate stability between nine and ten. It is very stable; however, it depends on settings like geographical redundancy. It's also dependent on expertise and the company's willingness to pay for specific features.

    What do I think about the scalability of the solution?

    It is very scalable, especially the Gen 2 version. Azure Data Lake Gen 2 is very flexible and uses hierarchical file structures. The newest version, Gen 3, also supports a lake house approach and integrates with other cloud storage like Amazon S3, Google Storage, and Snowflake.

    How are customer service and support?

    In terms of community support, they respond within one to two weeks. Organization-paid technical support gets an almost immediate response. Overall, I rate their service an eight.

    How would you rate customer service and support?

    Positive

    How was the initial setup?

    Microsoft Fabric is easier to use than Databricks. Fabric is a software as a service (SaaS), while Databricks is a platform as a service (PaaS), making Fabric easier for starters. Setting up Azure Data Lake Storage can be quick if done manually, but using code ensures scalability.

    What's my experience with pricing, setup cost, and licensing?

    It's very cheap to store many terabytes of data. It costs just a few dollars per terabyte per month. Computational costs vary based on usage and are generally more expensive than storage.

    What other advice do I have?

    For startups, I recommend using Microsoft Fabric as it integrates well with other tools and is cost-efficient. The pricing model is easy to understand. I'd rate the solution nine out of ten.

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Microsoft Azure
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Venkatesh Kollana - PeerSpot reviewer
    Associate Software Engineer at Systech Solutions
    Real User
    Top 20
    Can store different types of data (structured, unstructured, and semi-structured) in one lake
    Pros and Cons
    • "The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design."
    • "I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data."

    What is our primary use case?

    We use Azure Data Lake Storage for sources like HubSpot (CRM software) and Xero (invoice software). We call their APIs, get the data, and store it in the product. From there, we use it to get the responses and load them into Azure SQL DB.
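    The flow described above (call an API, store the response, then load selected fields into a SQL database) might look roughly like the sketch below. Everything in it is a stand-in: the `lake` dictionary replaces real storage, the payload is made up and does not reflect the actual HubSpot or Xero response formats, and no network or database calls are made.

```python
import json
from datetime import date

# Stand-in for the lake: maps a file path to its raw contents (hypothetical).
lake = {}


def land_response(source, day, payload):
    """Store the raw API response unchanged under a dated path."""
    path = f"raw/{source}/{day.isoformat()}/response.json"
    lake[path] = json.dumps(payload)
    return path


def to_rows(path, fields):
    """Read a landed response back and keep only the columns the SQL table needs."""
    records = json.loads(lake[path])["results"]
    return [tuple(record.get(field) for field in fields) for record in records]


# Made-up payload; real HubSpot or Xero responses have a different shape.
payload = {
    "results": [
        {"id": 1, "name": "Acme", "owner": "a"},
        {"id": 2, "name": "Globex"},
    ]
}
path = land_response("crm", date(2024, 10, 5), payload)
rows = to_rows(path, ["id", "name"])
print(path)
print(rows)
```

    Landing the response untouched before extracting columns means the full payload stays available if more fields are needed later.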

    What is most valuable?

    The tool's best feature is that it can store different types of data (structured, unstructured, and semi-structured) in one lake. We can use the required data for analytics and dashboard design.

    What needs improvement?

    I suggest enhancing the connectors for improvement. Some software, like HubSpot and Xero, has connectors, but they're limited to a few fields. That's why we use REST API calls instead. It would be better if the connectors could retrieve all data.

    For how long have I used the solution?

    I have been using the product for a year. 

    What do I think about the stability of the solution?

    The tool is a stable product; they've been improving it recently.

    What do I think about the scalability of the solution?

    About 400 to 500 people (40 to 50 percent of employees) use Azure Data Lake Storage for ETL or development. It's scalable - we can add or remove users and change permissions within minutes.

    How are customer service and support?

    I haven't talked directly to Microsoft support, but I use blogs and forums to find answers. 

    How was the initial setup?

    Setting up and deploying Azure Data Lake Storage is easy. We use Azure DevOps to connect and deploy all our services.

    What's my experience with pricing, setup cost, and licensing?

    The tool is cheap, depending on the services and requirements you need.

    What other advice do I have?

    I use Azure Data Lake Storage in a cloud-only setup, not on-premises. We receive API calls and store the responses in the product. Then, we process these files using the tool.

    For first-time users, I recommend learning from Microsoft materials or YouTube videos before using the tool. It is better to gain some knowledge before using it. It's easy for beginners to learn and use, especially compared to AWS and other services.

    I'd rate Azure Data Lake Storage eight out of ten. I find it user-friendly as a fresher with about 2.8 years in my tech career. I started my career with this tool, gained much knowledge, and now I can lead a team.

    Disclosure: My company has a business relationship with this vendor other than being a customer: customer/partner
    PeerSpot user
    Technical Manager at a consultancy with 1,001-5,000 employees
    Real User
    Offers high scalability for data storage and SLAs for stability at an affordable price
    Pros and Cons
    • "Offers high scalability for data storage"
    • "Lack of clarity in migration processes"

    What is our primary use case?

    At our company, we strategize and design solutions that have an impact on our clients' business. Our company focuses mainly on technology projects developed through business strategies. Azure Data Lake Storage is one of the tools that helps our company develop business solutions for our customers.

    The solution is used primarily for storing Big Data and not for any analytics tasks. If the client's system is already on AWS, then we don't recommend Azure Data Lake Storage. 

    What is most valuable?

    Azure Data Lake Storage offers high scalability for data storage, whereas some storage systems in the market offer limited data capacity. Using the product, multiple documents can be read simultaneously. In products like Azure Data Lake Storage, storage scalability is a basic requirement to handle Big Data use cases. 

    Almost any storage issue with the solution can be solved if you click on specific properties of the configuration.  

    What needs improvement?

    The Azure Data Lake Storage should not only be compatible with the Azure platforms but also with other vendor solutions. I have faced a lack of clarity in the migration process when Azure Synapse is being used with the solution to migrate data to Microsoft Fabric.

    Traditionally, when our organization uses Azure Data Lake Storage, we also need to use Azure Synapse. But when data needs to be imported to Microsoft Fabric from the solution using Synapse, Microsoft does not provide a clear migration passage.

    Thus integrators face difficulty in migrating data from Azure Data Lake Storage to other products from Microsoft. The parent owner of Azure products is focused on launching new products in a span of a few years but is not focusing on how customers will migrate to their latest product from other data storage systems. Microsoft is not developing the older products once they launch new solutions; they are just providing basic support for the former products. 

    For how long have I used the solution?

    I am a user of Azure Data Lake Storage. 

    What do I think about the stability of the solution?

    Azure Data Lake Storage is one of the rare services with SLAs that show major stability. On the other hand, if I consider GPT and OpenAI services in Azure, half of the SLAs remain locked. 

    What do I think about the scalability of the solution?

    I would rate the scalability a ten out of ten. I haven't encountered any data use cases so large that Azure Data Lake Storage wasn't able to manage and scale. For instance, when it comes to storing global transactions of Visa, even for such big use cases, Azure Data Lake Storage will be able to handle the data volume. 

    At our company, we work with Azure Data Lake Storage for top-level enterprise companies. 

    How are customer service and support?

    Instead of reaching out to customer support every time an issue occurs with the product, our organization members use the documentation to resolve the problem. The technical specialists of our company are capable of solving most of the issues on their own, leveraging the documentation around Azure Data Lake Storage. 

    How was the initial setup?

    The initial setup of the product was super easy. There is a very straightforward process that involves visiting the Azure portal, creating accounts, and configuring the needed ratio of Azure Data Lake Storage. The security configuration of the solution is a bit complex and has multiple technicalities compared to the rest of the setup process. 

    The solution's setup process can probably be improved by embedding a feature that guides users through the selections necessary to achieve a secure configuration.

    Most of the networking configurations are easily available, and to avail of advanced security options, a user needs to visit the advanced setup section. If Azure Data Lake Storage needs to be integrated with a pre-built network and there is a requirement for authenticated user access and the accessible medium, all such sections of security need to be better guided by the solution provider.  

    There needs to be additional assistance provided by the solution during the setup process for a highly secure configuration, especially for individuals who are not networking experts.

    What's my experience with pricing, setup cost, and licensing?

    Azure Data Lake Storage is one of the most affordable products available from the vendor. When the storage capacity offered by the solution is compared with the computation, Azure Data Lake Storage turns out to be a more affordable option than other databases. 

    What other advice do I have?

    Azure Data Lake Storage can be used not only for Big Data but also for normal-sized data because it's more cost-effective than other database solutions. The solution offers almost the same features as Snowflake but at a lower price. The cost-effectiveness is the major reason why in our company, we use Azure Data Lake Storage instead of Snowflake and Synapse. 

    In our organization, we use Azure Data Lake Storage in integration with Spark and Azure Databricks. When you are working with an extremely large volume of data, you should prefer Azure Data Lake Storage over other data warehouses; other solutions may process data and queries faster but wouldn't be able to manage the data size. I would overall rate Azure Data Lake Storage a ten out of ten. 

    Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
    PeerSpot user
    Consultant SAP BODS at GyanSys Inc.
    Real User
    Top 10
    Integrates well with Microsoft ecosystem but lacks robust SAP connectivity
    Pros and Cons
    • "It's a cloud-based tool within the Microsoft ecosystem, offering many benefits for data handling."
    • "From an SAP perspective, direct connectivity to SAP systems is an area that could be enhanced."

    What is our primary use case?

    It is for data warehousing. We currently work on using Databricks and Hadoop for data warehousing – figuring out initial data migrations is our focus right now.

    What is most valuable?

    It's a cloud-based tool within the Microsoft ecosystem, offering many benefits for data handling.

    Our focus is on standardization. We're currently analyzing how it could work with Databricks and haven't explored Azure Data Lake Storage for storage extensively.

    We've been actively working with big data analytics for the past three years. Initially, we used Microsoft APS (Analytics Platform System).

    What needs improvement?

    From an SAP perspective, direct connectivity to SAP systems is an area that could be enhanced. Our landscape heavily relies on SAP, and we find solutions like Snowflake and Databricks more integrated. Azure Data Lake Storage could improve by providing stronger connectivity options for SAP databases.

    For how long have I used the solution?

    My company is in the initialization stage. We've only completed the initial setup phase.

    So, we're in the analysis phase, working with a sandbox environment at the moment. It has been six months now. 

    What do I think about the stability of the solution?

    I would rate the stability an eight out of ten. 

    What do I think about the scalability of the solution?

    I would rate the scalability a six out of ten. We've primarily encountered issues during the initial migration from Hadoop to Azure Data Lake Storage.

    There are more than 30 end users using it. It's more of a real-time kind of setup. We're focusing on continuous data replication.

    How are customer service and support?

    We had some troubles. 

    How would you rate customer service and support?

    Neutral

    Which solution did I use previously and why did I switch?

    We used different solutions. We switched to Azure because of the increased data volume we needed to handle.

    How was the initial setup?

    I would rate the experience with the initial setup a five out of ten, with ten being easy to set up. 

    It took us a few weeks to set up. We haven't started that integration process yet. We're currently in the sandbox testing phase.

    What about the implementation team?

    Our landscape administrator team handles initial setups. We have a separate infra team for the deployment processes. 

    There are around ten members in the team. 

    What's my experience with pricing, setup cost, and licensing?

    It's quite expensive. Compared to other options we've explored, I would rate the pricing a seven out of ten, with ten being expensive. 

    What other advice do I have?

    Overall, I would rate the solution a seven out of ten. 

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    Mohammad-Huda - PeerSpot reviewer
    Data & Analytics Practitioner (BIDW, Big Data) at a tech vendor with 10,001+ employees
    Real User
    Enhanced data integration with cost-effective storage, though better documentation is needed
    Pros and Cons
    • "Storage within Azure Data Lake is cheaper, which is one of the reasons we moved to it."
    • "The documentation could be more user-friendly with better tutorials."

    What is our primary use case?

    We are using Databricks along with some other tools to have an automated process. The data from different sources gets loaded into Data Lake. My use case for Data Lake Storage is as an integration for various sources of data that are processed and loaded into the lake for subsequent analysis.

    How has it helped my organization?

    Using Azure Data Lake Storage has provided us with a cost-effective solution for data storage, which allows us to manage large volumes of data efficiently.

    What is most valuable?

    Storage within Azure Data Lake is cheaper, which is one of the reasons we moved to it. Another valuable feature is the flexibility to scale storage up or down as needed.

    What needs improvement?

    The documentation could be more user-friendly with better tutorials. While the initial setup is not too complex, it requires understanding various options and their implications. Improving this can help users understand the configuration process better.

    For how long have I used the solution?

    I have been using this solution for one to two years.

    What do I think about the stability of the solution?

    I would rate the stability an eight out of ten. It is quite stable.

    What do I think about the scalability of the solution?

    From a scalability point of view, it is easy to scale. The flexibility to expand or reduce capacity according to requirements is well-handled.

    How are customer service and support?

    I haven't engaged much with the technical support. I cannot provide an accurate mark for it.

    How would you rate customer service and support?

    Neutral

    Which solution did I use previously and why did I switch?

    How was the initial setup?

    The initial setup process is not very complex but requires detailed knowledge of the various configuration options. One needs to understand the implications of each option, such as cost and performance, which can make the setup process somewhat challenging without adequate documentation support.

    What's my experience with pricing, setup cost, and licensing?

    The pricing is average, not too high and not too cheap.

    What other advice do I have?

    Depending on your use case, Azure Data Lake Storage can benefit your organization. It is suitable for medium to large scale companies.

    I'd rate the solution seven out of ten.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Microsoft Azure
    Disclosure: My company has a business relationship with this vendor other than being a customer: Partner