Databricks vs Teradata comparison

Comparison Buyer's Guide

Categories and Ranking

IBM SPSS Statistics (Sponsored)
Average Rating: 8.0
Number of Reviews: 36
Ranking in other categories: Data Mining (3rd), Data Science Platforms (10th)

Databricks
Average Rating: 8.2
Number of Reviews: 82
Ranking in other categories: Data Science Platforms (1st), Streaming Analytics (2nd)

Teradata
Average Rating: 8.2
Number of Reviews: 74
Ranking in other categories: Customer Experience Management (3rd), Backup and Recovery (20th), Data Integration (17th), Relational Databases Tools (7th), Data Warehouse (3rd), BI (Business Intelligence) Tools (10th), Marketing Management (6th), Cloud Data Warehouse (6th)
 

Mindshare comparison

Categories charted: Data Science Platforms, Data Warehouse
 

Featured Reviews

AbakarAhmat - PeerSpot reviewer
Sep 21, 2023
Enhancing survey analysis that provides valued insightfulness
I use it to analyze questionnaire surveys related to a product, solution, or application, such as open data services, which I provide to consumers and end-users. These surveys contain evaluation assessments, and I use SPSS to analyze the responses. The most valuable feature is its robust…
Dunstan Matekenya - PeerSpot reviewer
Jul 10, 2024
Process large-scale data sets and integrates with Apache Spark with notebook environment
I primarily use Databricks to process large-scale data sets with Apache Spark. My main use case is processing large data sets, such as 600 GB or 800 GB. Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of…
SurjitChoudhury - PeerSpot reviewer
Feb 20, 2024
Offers seamless integration capabilities and performance optimization features, including extensive indexing and advanced tuning capabilities
We designed and built the warehouse. We used multiple loading utilities, such as MultiLoad, FastLoad, and Teradata TPump. Those are only the loading processes, though. Teradata is a powerful tool because, compared with older technologies, its architecture of nodes and virtual processors is a unique concept. Later, other technologies adopted the concept of nodes as well; Informatica did so from PowerCenter version 7.x. Previously, it had a client-server architecture, but it later changed to the node concept, so the database can be available 24/7, 365 days a year: if one node fails, other nodes can take over. Informatica adopted all those concepts when it changed its architecture. Even Oracle databases have since adapted their architecture along these lines. Teradata, however, started out with its own distinct architecture, which major companies later adopted.

It has grown since, but originally, whatever query we sent was mapped to a particular component. From there, it goes to the virtual processor and down to the disk, where the actual physical data is stored. In between, there is a map, which acts like a data dictionary. It holds information about each piece of data: where it is loaded, and on which virtual processor or node it resides. Teradata comes with a four-node architecture, or however many nodes we choose, and the initial cost is determined by that. So, what data does each node hold? It is a shared-nothing architecture. Whatever task is given to a virtual processor is processed there, and if there is a failure, another virtual processor takes over.

Moreover, this solution has had an impact on query time and data performance. In Teradata, there is a lot of joining, partitioning, and indexing of records. There are primary and secondary indexes, hash indexes, and other indexing mechanisms.

To improve query performance, we first analyze the query and tune it. If a join needs a secondary index, which plays a major role in filtering records, we might rebuild that particular table with the secondary index. This tuning involves partitioning and indexing. We use these tools and technologies to fine-tune performance.

When it comes to integration, tools like Informatica connect seamlessly with Teradata. We ensure the Teradata database is configured correctly in Informatica, including the proper hostname and properties for the load process. We didn't find any major complexity or issues with integration. But these technologies are quite old now. With newer big data technologies, we've worked with a four-layer architecture, pulling data from a Hadoop data lake into Teradata. We configure Teradata with the appropriate hostname and credentials, and use BTEQ queries to load data. Previously, we converted the data warehouse to a CLD model as per Teradata's standardized procedures, moving from an ETL to an EMT process. This allowed us to perform gap analysis on missing entities based on the model and retrieve them from the source system again. We found Teradata integration straightforward and compatible with other tools.
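
The reviewer's description of a map that records which virtual processor holds each piece of data can be illustrated with a small sketch. This is not Teradata code: the bucket count, the SHA-256 hash, the round-robin assignment, and every name below (`bucket_for`, `build_map`, `distribute`, `lookup`, `vproc-0`, and so on) are assumptions made up for the illustration. It only shows the general idea of hashing an index value to find the single processor that owns a row in a shared-nothing layout; failover is not modeled.

```python
import hashlib
from collections import defaultdict

NUM_BUCKETS = 1024  # hypothetical map size, chosen only for illustration


def bucket_for(index_value):
    """Hash an index value to a bucket number (stable across runs)."""
    digest = hashlib.sha256(str(index_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


def build_map(processors):
    """Assign every bucket to exactly one virtual processor, round-robin."""
    return {b: processors[b % len(processors)] for b in range(NUM_BUCKETS)}


def distribute(rows, key, bucket_map):
    """Store each row only on the processor that owns its bucket."""
    storage = defaultdict(list)
    for row in rows:
        storage[bucket_map[bucket_for(row[key])]].append(row)
    return storage


def lookup(storage, bucket_map, key, value):
    """Find rows by index value, scanning only the owning processor."""
    owner = bucket_map[bucket_for(value)]
    return owner, [row for row in storage[owner] if row[key] == value]


# Hypothetical four-processor system and sample rows.
processors = ["vproc-0", "vproc-1", "vproc-2", "vproc-3"]
bucket_map = build_map(processors)
rows = [{"customer_id": i, "amount": i * 10} for i in range(1000)]
storage = distribute(rows, "customer_id", bucket_map)

owner, matches = lookup(storage, bucket_map, "customer_id", 42)
print(owner, matches)
```

Because the map sends every index value to one owner, a lookup by that value touches a single processor's rows rather than the whole table, which is the effect the reviewer attributes to indexing.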

Quotes from Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
 

Pros

IBM SPSS Statistics
"Custom tables and macros: They allow us to create useful reports quickly for a broad audience."
"The solution has numerous valuable features. We particularly like custom tabs. It's very useful. We end up analyzing a lot of software data, so features related to custom tabs are really helpful."
"SPSS is quite robust and quicker in terms of providing you the output."
"The best part is that they have an algorithm handbook, so you can open it up and understand how it works, and if it is useful, this is very important."
"The most valuable feature of IBM SPSS Statistics is all the functionality it provides. Additionally, it is simple to do the five-way analysis that you can into multidimensional setup space. It's the multidimensional space facility that is most useful."
"One feature I found very valuable was the analysis of variance (ANOVA)."
"The most valuable features mainly include factor analysis, correlation analysis, and geographic analysis."
"It has the ability to easily change any variable in our research."

Databricks
"The solution is easy to use and has a quick start-up time due to being on the cloud."
"Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform."
"Databricks helps crunch petabytes of data in a very short period of time."
"The setup is quite easy."
"The processing capacity is tremendous in the database."
"The initial setup is pretty easy."
"The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type and it was implemented and invented by Databricks."
"Databricks is a scalable solution. It is the largest advantage of the solution."

Teradata
"We really enjoy the FastLoad, TPump, and MultiLoad features."
"It's very, very fast."
"The most valuable feature is the ease of running queries."
"It is a stable solution. Stability-wise, I rate the solution a nine out of ten."
"The most valuable features are the large volume of data and the structuring of the data to optimize it and get very optimal data warehouse solutions for customers."
"The most valuable feature is the ease of uploading data from multiple sources."
"Teradata's pretty fast."
"It's very mature from a technology perspective."
 

Cons

IBM SPSS Statistics
"I would like SPSS to improve its integration with other data-filing IBM tools. I also think its duration with data, utilization, and graphics could be better."
"The reports could be better."
"The technical support should be improved."
"The statistics should be more self-explanatory with detailed automated reports."
"In developing countries, it would be beneficial to provide certain features to users at no cost initially, while also customizing pricing options."
"I think the visualization and charting should be changed and made easier and more effective."
"SPSS slows down the computer or the laptop if the data is huge; then you need a faster computer."
"Improvements are needed in the user interface, particularly in terms of user-friendliness."

Databricks
"There is room for improvement in the documentation of processes and how it works."
"The integration features could be more interesting, more involved."
"Some of the error messages that we receive are too vague, saying things like 'unknown exception', and these should be improved to make it easier for developers to debug problems."
"In the next release, I would like to see more optimization features."
"Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks."
"The product needs samples and templates to help invite users to see results and understand what the product can do."
"There would also be benefits if more options were available for workers, or the clusters of the two points."
"Databricks is an analytics platform. It should offer more data science. It should have more features for data scientists to work with."

Teradata
"It needs a teaching web site with more training on third-party tools used for BI."
"I would like more security and speed."
"The cloud is the new challenge and the new opportunity."
"It could be a bit more user-friendly."
"I'm not sure about the unstructured data management capabilities. It could be improved."
"There are some ways that the handling of unstructured data could be improved."
"The initial setup is complex because there are a lot of factors that come into play, including the amount of software and applications that require access."
"Teradata is an old data warehouse, and they're not improving in terms of new, innovative features."
 

Pricing and Cost Advice

IBM SPSS Statistics
"The price of this solution is a little bit high, which was a problem for my company."
"We think that IBM SPSS is expensive for this function."
"It's quite expensive, but they do a special deal for universities."
"The price of IBM SPSS Statistics could improve."
"SPSS is an expensive piece of software because it's incredibly complex and has been refined over decades, but I would say it's fairly priced."
"If it requires a lot of data processing, maybe switching to IBM SPSS Clementine would be better for the buyer."
"More affordable training for new staff members."
"I rate the tool's pricing a five out of ten."

Databricks
"The solution is affordable."
"I'm not involved in the financing, but I can say that the solution seemed reasonably priced compared to the competitors. Similar products are usually in the same price range. With five being affordable and one being expensive, I would rate Databricks a four out of five."
"The price of Databricks is reasonable compared to other solutions."
"I would rate Databricks' pricing seven out of ten."
"The basic version of this solution is now open-source, so there are no license costs involved. However, there is a charge for any advanced functionality and this can be quite expensive."
"There are different versions."
"I do not exactly know the costs, but one of our clients pays between $100 and $200 USD monthly."
"The solution is a good value for batch processing and huge workloads."

Teradata
"In this day and age, we want to get things done quickly. So, we go to the AWS Marketplace."
"The cost of running Teradata is quite high, but you get a good return on investment."
"Teradata is expensive, so it's typically marketed to big customers. However, there have been some changes, and Teradata is now offering more flexible pricing models and equipment leasing. They've added pay-as-you-go and cloud models, so it's changing, but Teradata is generally known as an expensive high-end product."
"Price is quite high, so if it is really possible to use other solutions (e.g. you do not have strict requirements for performance and huge data volumes), it might be better to look at alternatives from the RDBMS world."
"I am using the free version of Teradata."
"I rate the product price a nine on a scale of one to ten, where one is cheap and ten is expensive."
"Teradata is currently making improvements in this area."
"It comes at a notably high cost for what it offers."
 

Comparison Review

it_user232068 - PeerSpot reviewer
Aug 5, 2015
Netezza vs. Teradata
Originally published at https://www.linkedin.com/pulse/should-i-choose-net
Two leading Massively Parallel Processing (MPP) architectures for Data Warehousing (DW) are IBM PureData System for Analytics (formerly Netezza) and Teradata. I thought talking about the similarities and differences…
 

Top Industries

By visitors reading reviews

IBM SPSS Statistics
Financial Services Firm: 16%
University: 10%
Computer Software Company: 9%
Educational Organization: 8%

Databricks
Financial Services Firm: 16%
Computer Software Company: 12%
Manufacturing Company: 9%
Healthcare Company: 6%

Teradata
Financial Services Firm: 25%
Computer Software Company: 11%
Manufacturing Company: 8%
Healthcare Company: 7%
 

Company Size

By reviewers: Large Enterprise, Midsize Enterprise, Small Business
 

Questions from the Community

What do you like most about IBM SPSS Statistics?
The software offers consistency across multiple research projects helping us with predictive analytics capabilities.
What is your experience regarding pricing and costs for IBM SPSS Statistics?
While the pricing of the product may be higher, the accompanying service and features justify the investment. However...
What needs improvement with IBM SPSS Statistics?
In some cases, the product takes time to load a large dataset. They could improve this particular area.
Which do you prefer - Databricks or Azure Machine Learning Studio?
Databricks gives you the option of working with several different languages, such as SQL, R, Scala, Apache Spark, or ...
How would you compare Databricks vs Amazon SageMaker?
We researched AWS SageMaker, but in the end, we chose Databricks. Databricks is a Unified Analytics Platform designe...
Which would you choose - Databricks or Azure Stream Analytics?
Databricks is an easy-to-set-up and versatile tool for data management, analysis, and business analytics. For analyti...
Comparing Teradata and Oracle Database, which product do you think is better and why?
I have spoken to my colleagues about this comparison and in our collective opinion, the reason why some people may d...
Which companies use Teradata and who is it most suitable for?
Before my organization implemented this solution, we researched which big brands were using Teradata, so we knew if ...
Is Teradata a difficult solution to work with?
Teradata is not a difficult product to work with, especially since they offer you technical support at all levels if ...
 

Also Known As

IBM SPSS Statistics: SPSS Statistics
Databricks: Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash
Teradata: IntelliFlex, Aster Data Map Reduce, QueryGrid, Customer Interaction Manager, Digital Marketing Center, Data Mover, Data Stream Architecture
 

Sample Customers

IBM SPSS Statistics: LDB Group, RightShip, Tennessee Highway Patrol, Capgemini Consulting, TEAC Corporation, Ironside, nViso SA, Razorsight, Si.mobil, University Hospitals of Leicester, CROOZ Inc., GFS Fundraising Solutions, Nedbank Ltd., IDS-TILDA
Databricks: Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware
Teradata: Netflix
Find out what your peers are saying about Databricks, Microsoft, Knime and others in Data Science Platforms. Updated: October 2024.
813,418 professionals have used our research since 2012.