IT Central Station’s crowdsourced user review platform helps technology decision makers around the world to better connect with peers and other independent experts who provide advice without vendor bias.
You can read user reviews for the top cloud data warehouse tools of Q1 2017 here, to help you decide which solution is best for you.
In the review excerpts below, our users describe the most valuable features of their cloud data warehouse solutions and share where they see room for improvement.
#1 HPE Vertica
HPE Vertica is ranked as the number one cloud data warehouse solution of Q1 2017 by our users -- but what do they really think about the solution?
A Powerful Database Specialized for Data Warehouses
“Vertica is an excellent data warehouse platform”, writes Victor Di Leo, Vertica Support Engineer at a marketing services firm with 1,001-5,000 employees.
“Its column-oriented architecture makes it a powerful database specialized for data warehouses. Data should be designed around a star schema.
Data is accessed via SQL, which most developers are already familiar with.”
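To illustrate the star schema the reviewer mentions, here is a minimal sketch: one central fact table joined to small dimension tables and queried with plain SQL. It uses SQLite from the Python standard library as a stand-in rather than Vertica itself, and every table and column name is hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not Vertica.
conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table referencing two dimension tables.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_date VALUES (10, 2016), (11, 2017);
INSERT INTO fact_sales VALUES (1, 10, 20.0), (1, 11, 30.0), (2, 11, 50.0);
""")

# Typical warehouse query: join the fact table to its dimensions, then aggregate.
rows = conn.execute("""
SELECT p.category, d.year, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_date AS d ON d.date_id = f.date_id
GROUP BY p.category, d.year
ORDER BY p.category, d.year
""").fetchall()

for category, year, total in rows:
    print(category, year, total)
```

A column-oriented engine tends to suit this query shape because an aggregate such as the sum above only needs to read the few columns it references.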
Lacks Native Stored Procedures or a Native Scripting Language
Di Leo also adds that “Vertica does not have native stored procedures or a native scripting language. Instead, external functions (which can be called from within Vertica) using Java, C++, Linux shell scripting, etc., are supported.
This is an unpleasant surprise for many developers, but I feel this has not been a big hindrance in my experience. Complex business logic probably does not belong in a high-performance data warehouse platform. Rather, this should be taken care of during ETL.”
#2 Amazon Redshift
IT Central Station users rank Amazon Redshift as the number two cloud data warehouse solution of Q1 2017.
Stores Over 500TB of Data
Aju Mathew, Director at a tech company with 1,001-5,000 employees, writes about Amazon Redshift’s petabyte-scale data warehouse, sharing the experience of a customer:
“One of our existing customers stores more than 500 terabytes of data in an AWS Redshift database and the warehouse performance was good.
We want to highlight that even if the warehouse size increases to petabytes, Redshift would still work fine and there wouldn’t be any performance issues and would cost less also.”
Snapshot Restoring with Large Datasets
“Of course, every product has pluses and minuses”, writes Pamanesh NC, a Big Data Solution Architect - Spatial Data Specialist at Sciera, Inc.
“From that perspective, Amazon Redshift has some issues with snapshot restoring when we handle huge datasets”, he reports.
“When our snapshot size is really huge, like 20 TB+, we are forced to wait a long time to get it restored. This is reasonable, as they need to transfer the entire dataset to the cluster.
My thought on this issue is that Amazon has their own data centers and they are connecting each region of storage through Direct Connect. The input and output network data transfer might not be a complex thing.
For example, if they used 10 Gbps network transfer, they can transfer 1TB in less than two minutes, but that’s not happening now. To restore 1TB of data, it takes more than 30-40 minutes.”
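As a rough check on the figures in this quote, the transfer-time arithmetic can be sketched as follows. The sketch assumes decimal units (1 TB = 10^12 bytes) and a fully saturated link with no protocol or storage overhead; the function name is hypothetical.

```python
def transfer_minutes(size_tb: float, link_gbps: float) -> float:
    """Ideal time in minutes to move size_tb terabytes over a link_gbps link.

    Assumes 1 TB = 10**12 bytes and 1 Gbps = 10**9 bits per second,
    with no protocol overhead (a best-case estimate).
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 60


# Best case for 1 TB over a 10 Gbps link.
ideal = transfer_minutes(1, 10)

# Link speed that would be needed to move 1 TB in two minutes.
gbps_for_two_minutes = (1e12 * 8) / (2 * 60) / 1e9

print(f"1 TB at 10 Gbps: about {ideal:.1f} minutes")
print(f"Needed for 2 minutes: about {gbps_for_two_minutes:.1f} Gbps")
```

Under these assumptions, 1 TB over a 10 Gbps link takes roughly 13 minutes at best rather than under two, and a two-minute transfer would need a link of about 67 Gbps. The reviewer's broader point still holds, since the reported 30-40 minutes is well above the ideal figure.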
#3 IBM dashDB
IBM dashDB is ranked as the number three cloud data warehouse solution by our users during Q1 2017.
Built on DB2 Technology
“I like that the dashDB solution is built on DB2 technology. This means that you can use all the features of a DB2 database, but outsource all the hardware and software maintenance”, shares a consultant at a tech services company with 51-200 employees.
“Cloud solutions/services are the big thing for the future. IBM has provided users with the ability to store data in the cloud with the option to use the database as a MPP server.”
Not Auto-Scalable
Shailender Gupta, Client Engagement Manager at a tech services company with 51-200 employees, focuses on auto-scaling as an area that IBM dashDB could improve on:
“One of the biggest advantages of cloud computing is auto-scaling. AWS Redshift allows you to add more storage and CPUs/instances without any significant downtime.
With dashDB, scalability and uptime need more improvement. dashDB is not auto-scalable as of now. For any addition of space or computing power, we have to raise a request and there is a downtime to upgrade the instance.
The pay-as-you-go model is missing. You must buy a minimum instance of CPU and storage to begin with.”
#4 SAP Business Warehouse
IT Central Station users rank SAP Business Warehouse as the number four cloud data warehouse solution of Q1 2017.
Cross-Organizational Communication
“It has helped our financial transactions and the HR-related functions to communicate across our organization”, says an SAP BW/BOBJ Solution Architect/Developer/Administrator at a non-tech company with 1,001-5,000 employees.
He explains further:
“All our processes such as funds management, grant, payroll transactions are processed automatically through this system and BW helps to collect the data for analytic needs of the department users and board members.”
User-Friendliness
“The big minuses are it’s not being user friendly”, writes Engin Isik, SAP BI Systems Supervisor at a maritime company with 1,001-5,000 employees.
“The user interface and its reporting tool called BEx have room for improvement. SAP Global always promotes Business Objects for visualizing.”
Read more of the latest cloud data warehouse reviews on IT Central Station.
Hey, my review is on www.itcentralstation.com
My suggested solution is to keep our Redshift cluster online all the time, ideally on some kind of reserved instances. But that will still be more expensive for small companies that use Redshift only for short periods, in a plug-and-play, pay-for-usage way. AWS has to come up with a better solution for that. They did recently announce Redshift Spectrum as well, which should solve part of the problem for analytical processes.
@Orlee Gillis. Of course, I accept your point: every product has pluses and minuses. But what I recently found in Amazon Redshift is that they have a problem with IP pooling. They start clusters on a FIFS (first in, first served) basis. So when the number of customers increases and many of them start their clusters in the same period, Amazon is not able to serve IPs for all of them. The customer then has to wait to get an IP before the cluster can start, and sometimes that takes more than 10 hours. That is very painful when we are in the delivery phase.