Senior Data Engineer at a computer software company with 1,001-5,000 employees
Real User
Top 10
2024-11-06T10:14:00Z
Nov 6, 2024
I work in a project where I build data pipelines using Azure Data Factory. I ingest data from on-premises to Azure Data Lake. After that, I perform transformations using Databricks notebooks and Spark, building the Databricks bronze, silver, and gold layers. We export reports from the gold layer.
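The bronze, silver, and gold layering described in this review can be illustrated with a minimal sketch in plain Python. This is not the reviewer's pipeline: the record fields, the `_source` tag, and the function names `to_bronze`, `to_silver`, and `to_gold` are hypothetical, and a real implementation would more likely use Spark DataFrames in Databricks notebooks.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from an on-premises source.
raw_rows = [
    {"order_id": "1", "region": "EU", "amount": "120.50"},
    {"order_id": "2", "region": "eu", "amount": "80.00"},
    {"order_id": "2", "region": "eu", "amount": "80.00"},          # duplicate row
    {"order_id": "3", "region": "US", "amount": "not-a-number"},   # bad value
]


def to_bronze(rows):
    """Bronze: keep the data as ingested, only tagging where it came from."""
    return [dict(row, _source="on_prem") for row in rows]


def to_silver(bronze_rows):
    """Silver: cast types, normalize values, drop unparseable rows and duplicates."""
    seen_ids = set()
    cleaned = []
    for row in bronze_rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if row["order_id"] in seen_ids:
            continue  # skip repeated order ids
        seen_ids.add(row["order_id"])
        cleaned.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].upper(),
            "amount": amount,
        })
    return cleaned


def to_gold(silver_rows):
    """Gold: aggregate to report-ready totals per region."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


bronze = to_bronze(raw_rows)
silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Each layer only reads from the one before it, so reports built on the gold layer never touch raw, unvalidated records.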
Financial Analyst 4 (Supply Chain & Financial Analytics) at Juniper Networks
MSP
Top 5
2024-03-28T09:56:00Z
Mar 28, 2024
We use the solution for reliability engineering, where we apply ML and deep learning models to identify failure patterns across different geographies and products.
My company uses Databricks to process real-time and batch data, relying on its streaming analytics capabilities. We use Databricks' Unified Data Analytics Platform on Azure, which gives us a unified architecture that handles the streaming load for our platform.
Our primary use case is a project dealing with geospatial data, which needs a lot of computing resources. A traditional warehouse cannot handle the amount of data we are using, and this is where Databricks comes into the picture.
We mainly use Databricks to ingest data and run the ELT processes that get it ready for analytics, and to serve the data to ThoughtSpot, which queries Databricks to get the data.
Our primary use case for the solution is data analysis. It provides a Spark cluster environment with a driver for analyzing huge amounts of data, and we can create notebooks in Databricks. We can write SQL commands, Python code, Scala, or Spark with Python. With Databricks, we get a cluster hosted in the public cloud, and we adjust it based on how much we use it.
Principal at a computer software company with 5,001-10,000 employees
Real User
2022-12-16T18:28:24Z
Dec 16, 2022
I've worked with Databricks primarily in the pharmaceuticals and life sciences space, which means a lot of work on patient-level data and the predictive analytics around that. Another use case for Databricks is in the manufacturing industry. I'm a consultant, so the use cases for the product vary, but my primary use case for it is in the pharma space.
Our company uses the solution's Spark module as a processing engine for big data analytics. We do not use the module as a streaming engine. The historic perception is that Spark is for batches, machine learning, analytics, and big data processing but not for streaming, and that is exactly how we use it.
Vice President at a tech services company with 51-200 employees
Real User
2022-09-06T08:03:58Z
Sep 6, 2022
Our primary use case of this product is for our customers who are running large systems and looking for an API -- a quick, easy integration with their own system. We use Databricks to create a secure API interface. I'm vice president of data science and we are customers of Databricks.
Associate Principal - Data Engineering at LTI - Larsen & Toubro Infotech
Real User
2022-07-17T09:50:00Z
Jul 17, 2022
We build data solutions for the banking industry. Previously, we worked with AWS, but now we are on Azure. My role is to assess the current legacy applications and provide cloud alternatives based on the customers' requirements and expectations. Databricks is a unified platform that provides features like streaming and batch processing. All the data scientists, analysts, and engineers can collaborate on a single platform. It has all the features you need, so you don't need to go for any other tool.
We use Databricks to receive data from the Data Lake, where we process, transform, and cleanse it. Once it is processed, we send the data to the Azure SQL database.
We use this solution for finding anomalies and applying the rules to the streaming data. There are around 50 people using this solution in my organization, including data scientists.
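As a rough illustration of applying rules to streaming data to find anomalies, the sketch below checks each incoming event against a set of named rules. It is plain Python rather than Spark Structured Streaming, and the event fields, thresholds, and the `flag_anomalies` helper are hypothetical, not taken from the review.

```python
def flag_anomalies(events, rules):
    """Yield (event, violated_rule_names) for every event that breaks at least one rule."""
    for event in events:
        violated = [name for name, check in rules.items() if check(event)]
        if violated:
            yield event, violated


# Hypothetical rules: each maps a rule name to a predicate over one event.
rules = {
    "temperature_too_high": lambda e: e["temperature"] > 90.0,
    "negative_pressure": lambda e: e["pressure"] < 0.0,
}

# An iterator stands in for the stream; events are consumed one at a time.
stream = iter([
    {"sensor": "a", "temperature": 71.2, "pressure": 1.1},
    {"sensor": "b", "temperature": 95.4, "pressure": 1.0},
    {"sensor": "c", "temperature": 99.0, "pressure": -0.2},
])

alerts = list(flag_anomalies(stream, rules))
for event, violated in alerts:
    print(event["sensor"], violated)
```

Because `flag_anomalies` is a generator, it processes events as they arrive instead of waiting for the whole dataset, which mirrors how rules are applied to a stream.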
Director - Data Engineering expert at Sankir Technologies
Real User
2022-03-18T16:14:27Z
Mar 18, 2022
I use Databricks to explore new features and provide the industry visibility and scalability of Databricks to the companies that I work with. I create proof of concepts for companies. As a consultant, I also create training courses on Databricks. If a company wants to leverage a service provided by Databricks and needs to train people, they use our courses.
I believe we are using the new version. Our company makes comprehensive use of the solution to consolidate data and do a certain amount of reporting and analytics. All the data consumers use Databricks to develop the information.
Machine Learning Engineer at a mining and metals company with 10,001+ employees
Real User
2021-11-03T23:41:00Z
Nov 3, 2021
We were using Databricks to build an AI solution. We were only evaluating it, and approximately three people tried it out. Later we chose another solution, so we did not fully deploy Databricks.
Lead Data Architect at a government with 1,001-5,000 employees
Real User
2021-04-21T14:10:02Z
Apr 21, 2021
We used Databricks in AWS on top of S3 buckets as a data lake. The primary use case was providing consistent, ACID-compliant data sets with full history and time series that could be used for analytics.
The primary use is for data management and managing workloads of data pipelines. Databricks can also be used for data visualization, as well as to implement machine learning models. Machine learning development can be done using R, Python, and Spark programming.
Cloud & Infra Security, Group Manager at Avanade
MSP
2021-01-10T08:08:17Z
Jan 10, 2021
We are working with Databricks and SMLS in the financial sector for big data and analytics. There are a number of business cases for analysis related to debt there. Several clients are working with it, analyzing data collected over a period of time and planning the next steps in multiple business divisions. My organization is a professional consulting service. We provide services for the other organizations, which implement and use them in a production environment. We manage, implement, and upgrade those services, but we don't use them.
Head of Data & Analytics at a tech services company with 11-50 employees
Real User
2020-12-08T10:26:21Z
Dec 8, 2020
We are a consulting house and we employ solutions based on our customers' needs. We don't generally use products internally. I am a certified data engineer from Microsoft and have worked on the Azure platform, which is why I have experience with Databricks. Now that Microsoft has launched Synapse, I think that there will be more use cases.
We work with clients in the insurance space mostly. Insurance companies need to process claims. Their claim systems run under Databricks, where we do multiple transformations of the data.
Pre-sale Leader, Big Data Enterprise Solutions at Ness Technologies
Consultant
2020-04-13T06:27:36Z
Apr 13, 2020
My division works with Big Data and Data Science, and Databricks is one of the tools for Big Data that we work with. We are partners with Microsoft and we began working with this solution for one specific project in the financial industry.
Data Scientist at an energy/utilities company with 10,001+ employees
Real User
2020-02-09T08:17:00Z
Feb 9, 2020
I am a data scientist here and that is my official role. I own the company. Our team is quite small at this point. We have around five people on the team and we are working with about five different businesses. The projects we get from them are massive undertakings. Each of us on the team takes multiple roles in our company and we use multiple tools to help best serve our clients. We are trying to look at creative ways that different solutions can be integrated and we try to understand what products we can use to create solutions for client companies that will be effective in meeting their needs. We are personally using Databricks for certain projects where we want to consider creating intelligent solutions. I have been working on Databricks as part of my role in this company, trying to see if there are any kind of standard products that we can use with it to create solutions. We know that Databricks integrates with Airflow, so that is something that we are exploring right now as a potential solution for enabling a creative response. We are exploring the cloud as an option. Databricks is available in Azure and we are currently figuring out the viability of using that as a cloud platform. So we are exploring the way Databricks and Azure integrate at the same time to give us this type of flexibility. What we use it for right now is more like asset management. If we have a lot of assets and we get a lot of real-time data, we certainly want to do some processing on some of this data, but you do not want to have to work on all of it in real-time. That is why we use Databricks. We push the data from Azure through Databricks and work on the data algorithm in Databricks and execute it from Azure with probably an RPA (Robotic Process Automation) or something of that sort. It intelligently offloads real-time processing.
Vice President, Business Intelligence and Analytics at NTT Data India Enterprise Application Services Pri
Real User
2020-02-05T08:05:00Z
Feb 5, 2020
We are still exploring the solution. It serves our clients much better than the star schema models they are trying to replace. We bring in Databricks and then see how clients can leverage the additional analytical functionality of the Databricks cloud. Our use is mostly exploratory. We recommend Databricks, especially with the Azure cloud frameworks.
We are building internal tools and custom models for predictive analysis. We are currently building a platform where we can integrate multiple data sources, such as data that is coming from Azure, AWS, or any SQL database. We integrate the data and run our models on top of that. We primarily use Databricks for data processing and for SQL databases.
Business Intelligence and Analytics Consultant at a tech services company with 201-500 employees
Consultant
2019-12-09T10:58:00Z
Dec 9, 2019
I am a developer and I do a lot of consulting using Databricks. We have been primarily using this solution for ETL purposes. We also do some migration of on-premises data to the cloud.
Our primary use case is really DevOps, for integration and continuous development. We've combined our database with some components from Azure to deploy elements in Sandbox for our data scientists and for our data engineers.
Data Scientist at a computer software company with 501-1,000 employees
Real User
Top 10
2019-10-14T12:39:00Z
Oct 14, 2019
We are using this solution to run large analytics queries and prepare datasets for SparkML and ML using PySpark. We ran on multiple clusters set up for a minimum of three and a maximum of nine nodes having 16GB RAM each. For one ad hoc requirement, a 32-node cluster was required. Databricks clusters were set for autoscaling and to time out after forty minutes of inactivity. Multiple users attached their notebooks to a cluster. When some workloads required different libraries, a dedicated cluster was spun up for that user.
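The cluster setup described in this review (autoscaling between three and nine nodes, with a forty-minute inactivity timeout) could be expressed as a cluster specification along these lines. This is a sketch under assumptions: the field names `autoscale`, `min_workers`, `max_workers`, and `autotermination_minutes` follow the Databricks Clusters API, but the helper `build_cluster_spec`, the cluster name, and the placeholder runtime and node type values are hypothetical, and treating the reviewer's "nodes" as workers is also an assumption.

```python
import json


def build_cluster_spec(name, min_workers=3, max_workers=9, idle_minutes=40):
    """Build a cluster spec dictionary with autoscaling and an inactivity timeout."""
    if not 0 < min_workers <= max_workers:
        raise ValueError("need 0 < min_workers <= max_workers")
    return {
        "cluster_name": name,
        # Placeholders: fill in a runtime version and a node type valid in your workspace.
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type-with-16GB-RAM>",
        # Scale between the minimum and maximum worker counts based on load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Terminate the cluster after this many minutes of inactivity.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("shared-analytics")
print(json.dumps(spec, indent=2))
```

A dedicated cluster for a user who needs different libraries would simply be another call with a different name, which keeps conflicting dependencies away from the shared cluster.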
Databricks is utilized for advanced analytics, big data processing, machine learning models, ETL operations, data engineering, streaming analytics, and integrating multiple data sources.
Organizations leverage Databricks for predictive analysis, data pipelines, data science, and unifying data architectures. It is also used for consulting projects, financial reporting, and creating APIs. Industries like insurance, retail, manufacturing, and pharmaceuticals use Databricks for data...
We use the product as a data science platform that enables me to handle and analyze large datasets efficiently.
I use Databricks to manage the setting up of data lakes for SaaS.
We use the solution for data analytics of industrial data.
The product has helped in data fabrication.
It's mainly used for data science, data analytics, visualization, and industrial analytics.
We use the solution for business engineering.
We use it for data analysis and testing of high-volume web user behavioral data.
Our primary use case for the solution is to run batch jobs.
Databricks is very useful and can handle thousands of different use cases. The use cases are all over the place.
I am using Databricks in my company.
We're using it to provide a unified development experience for all our data experts, including all data engineers, data scientists, and IT engineers.
We use this solution for our Customer Data Platform (CDP) work. My company works in the MarTech space, and we usually implement custom CDPs.
I am using Databricks for creating business intelligence solutions.
Databricks is a full data analytics platform. It covers the end-to-end data analytics process.
I primarily use Databricks for data pipelines.
I use Databricks for customer marketing analytics.
We have a team that works on Databricks for our clients. We are customers of Databricks.
Databricks can be used for large-scale data pre-processing and data transformations.
We use this solution to process data, for example, data validation.
We primarily use the solution for retail and manufacturing companies. It allows us to build data lakes.
We use this solution to build skill and text classification models.
Currently, I am using this solution for a forecasting project.
Our primary use case is to decrease costs and prevent any security breaches affecting our data. I'm an IT manager and we are customers of Databricks.
We specialize in project consulting for our clients. Whenever we get the opportunity, we recommend Databricks to them.
Our primary use case of Databricks is for advanced analytics. I'm the chief research officer of the company and we're customers of Databricks.
We use the solution for multiple purposes. We do a lot of data crunching, development, and algorithm work on it.
We primarily use the solution to run cron jobs; we run our Spark jobs as cron jobs.
We use this solution for streaming analytics. We use machine learning functions that output to the API and work directly with the database.