Sr DBA/ DBA Tech Lead at a non-profit with 1,001-5,000 employees
A scalable unified analytics platform with good performance
Pros and Cons
- "The feature I like best is performance. We use Red Tool and Red Job for the data warehouse and reporting. It's perfect. Performance is good, and it can return ad hoc queries very quickly. Of course, it's a cluster, so it's easy to scale."
- "It's hard to make it slow for a small data volume. For large volumes, it's hard to make it work. It's also hard to make it faster, and to make it scale."
What is our primary use case?
Our use case is a typical data warehouse. We use it only for reporting and data storage. Our users are the staff who do the reporting and data analysis.
What is most valuable?
The feature I like best is performance. We use Red Tool and Red Job for the data warehouse and reporting. It's perfect. Performance is good, and it can return ad hoc queries very quickly. Of course, it's a cluster, so it's easy to scale.
What needs improvement?
It's hard to make it slow for a small data volume. For large volumes, it's hard to make it work. It's also hard to make it faster, and to make it scale.
For how long have I used the solution?
I have been using Vertica for about five years.
Buyer's Guide
Vertica
December 2024
Learn what your peers think about Vertica. Get advice and tips from experienced pros sharing their opinions. Updated: December 2024.
831,265 professionals have used our research since 2012.
What do I think about the stability of the solution?
It's a stable solution.
What do I think about the scalability of the solution?
Vertica is a scalable solution.
How are customer service and support?
Technical support is good, and they react quickly.
How was the initial setup?
The initial setup is okay. You will need some knowledge and some training. I'd say learning takes a couple of months. We use one person to maintain the database side. With the DevOps team, everyone has a different role. But for our database, it's just one person.
What's my experience with pricing, setup cost, and licensing?
The price is reasonable. We use a pay per license model. Firstly, you need to buy a license. After that, you mainly pay the annual support fee of around 20% or 25%. I think their prices are quite reasonable.
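As a rough sketch of that arithmetic: the snippet below adds a one-time license to a yearly support fee. It assumes the support fee is a percentage of the license price, and the license price and time horizon are invented illustration values; `total_cost` is a hypothetical helper, not anything from Vertica.

```python
def total_cost(license_price: float, support_rate: float, years: int) -> float:
    """One-time license plus an annual support fee.

    Assumption: the support fee is charged as a fraction of the license price.
    """
    return license_price + license_price * support_rate * years


# Hypothetical license price of 100,000 held for 5 years (illustration only).
low = total_cost(100_000, 0.20, 5)   # support at 20% per year
high = total_cost(100_000, 0.25, 5)  # support at 25% per year
print(f"5-year cost range: {low:,.0f} to {high:,.0f}")
```

Under these assumptions, support paid over five years roughly equals or exceeds the original license price, which is worth including when comparing against subscription pricing.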
What other advice do I have?
We tried to use it for data lake and machine learning work, but where it is really great is the key functionality of the data warehouse. Personally, I feel they are over-marketing the machine learning feature and things like semi-structured data support. But for the data warehouse, it's truly a good solution. I want to recommend it highly.
I would tell potential users that it's hard to make it slow for small data volumes. For large volumes, it's hard to make it work, make it faster, and make it scale. Depending on your workload and your use case, you need to first purchase the Red Tool. After that, you need to follow the best practices to have an efficient design.
On a scale from one to ten, I would give Vertica a nine.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sr. SW Engineer - Databases with 201-500 employees
Easy to implement, by tuning the model (projection design) you get great performance
Pros and Cons
- "Vertica enabled us to close large deals. Customers with large data sets had to be migrated from PostgreSQL to Vertica due to performance."
- "Performance of management of metadata layer (database catalog) needs improvement. We still have to have smaller customers on PostgreSQL; Vertica cannot manage thousands of schemata."
- "Suboptimal projection design causes queries to not scale linearly."
- "Metadata for database files scales okay, but metadata related to tables/columns/sequences must be stored on all nodes."
How has it helped my organization?
It enabled delivery of a new Agile Data Warehousing Service.
It enabled us to close large deals. Customers with large data sets had to be migrated from PostgreSQL to Vertica due to performance.
What is most valuable?
- Clustered database
- Horizontal scaling
- Disaster recovery
- Columnar Storage
- Compression (you read only columns you need)
- Immutable storage
- Fast ingesting
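To illustrate the "you read only columns you need" point, here is a toy sketch in plain Python. It is not how Vertica stores data internally; the table, column names, and the `values_touched` counter are invented for illustration.

```python
# Toy table with 3 rows and 4 columns (all values are made up).
rows = [
    {"id": 1, "country": "US", "amount": 10.0, "note": "first order"},
    {"id": 2, "country": "DE", "amount": 20.0, "note": "second order"},
    {"id": 3, "country": "US", "amount": 30.0, "note": "third order"},
]

# Column layout: one list per column instead of one record per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_amount_row_store(rows):
    """Sum one column in a row layout; every field of every row is read."""
    total, values_touched = 0.0, 0
    for row in rows:
        values_touched += len(row)  # the whole row comes off disk
        total += row["amount"]
    return total, values_touched


def sum_amount_column_store(columns):
    """Sum one column in a column layout; only that column is read."""
    amounts = columns["amount"]
    return sum(amounts), len(amounts)


print(sum_amount_row_store(rows))        # (60.0, 12)
print(sum_amount_column_store(columns))  # (60.0, 3)
```

The gap widens with table width: a query that needs 3 of 200 columns reads a small fraction of the data, and values of the same type stored together tend to compress well.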
What needs improvement?
Performance of management of metadata layer (database catalog) needs improvement. We still have to have smaller customers on PostgreSQL; Vertica cannot manage thousands of schemata.
Query performance: Improve either Database Designer (automation of projection design) or performance of queries using suboptimal projection design.
Scaling of execution independently of storage: The upcoming Eon Mode (now in beta on Amazon) will hopefully solve this.
For how long have I used the solution?
One to three years.
What do I think about the stability of the solution?
We encountered stability issues three times during the last three years.
What do I think about the scalability of the solution?
Suboptimal projection design causes queries to not scale linearly.
The metadata layer does not scale linearly.
Metadata for database files scales okay, but metadata related to tables/columns/sequences must be stored on all nodes.
How are customer service and technical support?
I have experience with legacy vendors of enterprise RDBMS solutions, and I rate Vertica support to be much better.
Which solution did I use previously and why did I switch?
In my current company I was not responsible for the switch. As far as I know, they switched from PostgreSQL, especially because of performance of analytical queries processing large data.
How was the initial setup?
Just getting Vertica running is straightforward. However, with an increasing number of customers, we had to develop our own tooling. For example:
- Automated deployment
- Monitoring, alerting
- Backup/restore.
What's my experience with pricing, setup cost, and licensing?
Start with a license per 1 TB. Starting from hundreds of TB, unlimited licensing should be considered.
Move historical data to HDFS/S3, which are significantly cheaper or even free.
Vertica is delivering more and more features to support load/unload for external storage.
Which other solutions did I evaluate?
2012 - Detailed evaluation including benchmarks of: Greenplum, Vectorwise.
2017 - Evaluation of features and initial communication with vendors, if needed, for: Greenplum, EXASOL, Amazon Redshift, Spark, SAP HANA, IBM dashDB, Snowflake, Azure SQL.
What other advice do I have?
It is easy to implement this solution for one customer. By tuning the model (projection design) you get incredible performance. You won’t face issues with metadata (catalog) layer up to tens of thousands of tables.
It can be a challenge to operate clusters for many customers with varied data pipelines. Consider using Database Designer.
Don't hesitate to push Vertica (through support/product management) to improve it.
Consider implementing your own tools to automate performance tuning tasks.
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner.
Database Admin at a tech services company with 1,001-5,000 employees
Replication is the main feature for my use.
Valuable Features
Replication
Improvements to My Organization
Replication and Node recovery in 8.0.
Room for Improvement
vbr.py needs to be improved to support a different number of nodes between source and target.
Use of Solution
5 years
Deployment Issues
No
Stability Issues
Yes
Scalability Issues
No
Customer Service and Technical Support
Customer Service: 8
Technical Support: 8/10
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Senior Solution Architect at Grupo Cimcorp
Powerful tool, excellent revision options, with easy deployment
Pros and Cons
- "I enjoy the cybersecurity and backup features."
- "The biggest problem is the cost of cloud deployment."
What is our primary use case?
The primary use case is machine learning, and I am currently working on IoT projects.
How has it helped my organization?
With Vertica, I am able to make changes using other Vertica features and I do not have to start the project over. The Vertica tool is very powerful but you cannot purchase the product based on individual features.
What is most valuable?
I am highly trained on Vertica and I am resistant to using other products because I do not have experience with those products. Some of the most valuable features are cybersecurity and backup.
What needs improvement?
The biggest problem is the cost of cloud deployment.
For how long have I used the solution?
I have worked with Vertica for the past five years.
How was the initial setup?
The initial setup is straightforward, and deployment is easy.
What other advice do I have?
I would rate Vertica an eight out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Other
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Staff Dev Lead - Analytics Data Storage at a tech services company with 1,001-5,000 employees
Our typical run time for a query is now measured in seconds not hours.
Pros and Cons
- "The extensibility and efficiency provided by their C++ SDK."
- "Whatever is outside the core is not always as great as the engine, especially in its first version."
What is most valuable?
Two of them:
- The core feature, meaning their highly efficient columnar file format and execution engine along with a great coverage of ANSI SQL, provides our analysts with a highly expressive and performing platform.
- The extensibility and efficiency provided by their C++ SDK.
How has it helped my organization?
Before Vertica, we used a combination of sharded RDBMSs and Hive: the typical runtime for a query was in the hours. It's now in the seconds, with way more data than then (we're talking hundreds of terabytes).
What needs improvement?
Whatever is outside the core is not always as great as the engine, especially in its first version. That's true, for example, of the Kafka or Hadoop integration. But they're getting better release after release.
For how long have I used the solution?
Four years.
What do I think about the stability of the solution?
Vertica's code, being designed to use the hardware at its maximum, is very sensitive to low level changes such as kernel bumps or GLibC upgrades. It's also important to do tests on the storage layer (RAID controller + disks).
What do I think about the scalability of the solution?
The default layout (all nodes running spread) introduces latencies in query planning when you reach about 60 nodes, in our experience. Switching to a large cluster (one control node per rack) would be advised, well before reaching the 128-node hard limit.
How are customer service and technical support?
It's really great. One of the best I had to deal with. They also assisted us during the development phase of the custom components we've designed.
Which solution did I use previously and why did I switch?
Not really in the same area (MPP databases). However, we ran benchmarks back then against a bunch of competitors and Vertica was one of the fastest, while being relatively cheap and able to accommodate our hardware.
How was the initial setup?
The setup per se was pretty straightforward. However, it took us some time to design the most efficient loading pattern from Hadoop.
What's my experience with pricing, setup cost, and licensing?
Nothing to advise really; try it out first, it's free up to three nodes and 1TB, and then get in contact with their sales guys.
Which other solutions did I evaluate?
We did evaluate mostly SAP HANA and SQL Server PDW back in 2013, along with a bunch of OSS solutions.
What other advice do I have?
If you plan to use Vertica for different workloads (in terms of IO patterns, query frequency, dataset structure), plan to split your clusters: the mother-of-all-clusters pattern becomes quite difficult to manage at some point. Today we have around 20 clusters for different usages.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Database Administrator (DBA) at a computer software company with 501-1,000 employees
I liked the auto-distribution to all nodes for fault tolerance and query performance.
What is most valuable?
The auto-distribution to all nodes for fault tolerance and query performance was pretty amazing.
How has it helped my organization?
Our data warehouse at the time was a multi-terabyte PostgreSQL cluster. It worked really well, but we wanted to increase the size to many TBs, and due to our query and loading patterns we found greater performance from Vertica's multi-node warehouse.
What needs improvement?
In the versions I worked with, if a majority of the nodes were being loaded under heavy, sustained rates the nodes would see some dramatic decreases in performance due to the data shuffling that needed to occur between all the nodes. To work around that we ended up doing most of the loading in one or two nodes and that helped significantly.
The synchronization problems occurred when loading about 10 billion events, at a rate of about 100k tuples/second/node across 5 nodes. One of the suggestions from Vertica engineering was to increase the number of nodes to offset how much data was being synced per node.
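For a sense of scale, the figures above can be turned into a load duration. This is a back-of-the-envelope sketch that assumes one event is one tuple and that the per-node rate stays constant as nodes are added, which the experience described above suggests is optimistic; `load_hours` is a hypothetical helper.

```python
def load_hours(events: int, tuples_per_sec_per_node: int, nodes: int) -> float:
    """Hours needed to load `events` tuples at a constant per-node rate."""
    cluster_rate = tuples_per_sec_per_node * nodes  # tuples per second, whole cluster
    return events / cluster_rate / 3600


# Figures from the review: 10 billion events, 100k tuples/s/node, 5 nodes.
print(round(load_hours(10_000_000_000, 100_000, 5), 2))   # 5.56 hours

# Doubling the node count halves the time only if the per-node rate holds.
print(round(load_hours(10_000_000_000, 100_000, 10), 2))  # 2.78 hours
```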
For how long have I used the solution?
Extensive use of Vertica 5 as a production datawarehouse, and a POC for a client.
What was my experience with deployment of the solution?
In earlier versions of Vertica, it could sometimes be a pain to install on multiple nodes. In the most recent versions, most of that pain has been fixed. Stability in earlier versions was compromised at times when the majority of the nodes were under heavy write loads.
How are customer service and technical support?
The service and support from Vertica was excellent. Every tech and sales rep I dealt with was very responsive, pleasant, and helped me solve any engineering issues we ran into in very short order.
Which solution did I use previously and why did I switch?
I have used Greenplum and Postgres extensively. The latter is an excellent general-purpose database and is entirely suitable for most data needs; however, Vertica works really well in cases where you are storing and querying a lot of data that can be compressed and stored in columnar format, and you need your data auto-balanced across many nodes.
How was the initial setup?
The installation procedure was reasonably straightforward, but earlier versions of Vertica were a bit more tricky due to libraries and dependencies. The docs were unclear in a few places during the installation, particularly for operating systems that were not fully compatible with the required libraries. I expect those issues have been resolved in the newest version (8 at this time).
What about the implementation team?
Implementation was done in-house, with excellent support from the Vertica engineers.
What other advice do I have?
My advice is to clearly define your expectations, and benchmark performance in real-world-like environments. If you expect to be executing 100 queries per second and loading 10 million tuples per minute, then test that, and test several times that so you collect measurements about where the system is liable to break down.
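A minimal sketch of such a test sweep follows. It measures latency at the target concurrency and at several multiples of it. `run_query` is a stand-in that only sleeps; in a real test it would be replaced with a call through your own database driver. The function names and the chosen multiples are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def run_query() -> None:
    """Stand-in for a real query; replace with a call through your driver."""
    time.sleep(0.001)


def timed_call(fn) -> float:
    """Return the wall-clock seconds one call to `fn` takes."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def measure(fn, concurrency: int, calls: int) -> dict:
    """Run `calls` invocations of `fn` on `concurrency` threads and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(fn), range(calls)))
    return {
        "concurrency": concurrency,
        "calls": calls,
        "median_s": median(latencies),
        "max_s": max(latencies),
    }


def sweep(fn, target_concurrency: int, multiples=(1, 2, 4), calls_per_worker: int = 5):
    """Measure at the expected load and at several multiples of it."""
    return [
        measure(fn, target_concurrency * m, target_concurrency * m * calls_per_worker)
        for m in multiples
    ]


if __name__ == "__main__":
    for result in sweep(run_query, target_concurrency=4):
        print(result)
```

Plotting median and maximum latency against concurrency shows where response times stop being flat, which is the breakdown point the advice above is about. The same sweep shape applies to load rates.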
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Scientist at a media company with 501-1,000 employees
The fact that it is a columnar database is valuable. Columnar storage has its own benefit with a large amount of data.
What is most valuable?
The fact that it is a columnar database is valuable. Columnar storage has its own benefit with a large amount of data. It's superior to most traditional relational DB when dealing with a large amount of data. We believe that Vertica is one of the best players in this realm.
How has it helped my organization?
Large-volume queries are executed in a relatively short amount of time, so that we could develop reports that consume data in Vertica.
What needs improvement?
Speed: It's already doing what it is supposed to do in terms of speed but still, as a user, I hope it gets even faster.
Specific to our company, we store the data both in AWS S3 and in Vertica. For some batch jobs, we decided to write a Spark job rather than use Vertica operations because of speed and/or scalability concerns. Maybe this is just due to the difference in computational efficiency between SQL operations and a programmatic approach. Even with some optimization (adding projections for merge joins and pipelined GROUP BY), it still takes longer than a Spark job in some cases.
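For readers unfamiliar with why a projection sorted on the join key helps, here is a generic merge join in plain Python. It is a textbook sketch, not Vertica's implementation; the point is that when both inputs are already sorted on the key, the join is a single forward pass with no hash table to build.

```python
def merge_join(left, right):
    """Inner-join two lists of (key, value) tuples that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows on the right that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                for k in range(j, j_end):
                    out.append((lk, left[i][1], right[k][1]))
                i += 1
            j = j_end
    return out


# Invented sample data, sorted by key on both sides.
left = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
right = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
print(merge_join(left, right))
# [(2, 'b', 'x'), (2, 'b', 'y'), (2, 'c', 'x'), (2, 'c', 'y'), (4, 'd', 'w')]
```

If either input is not sorted on the join key, it has to be sorted or hashed first, which is the cost that a matching projection avoids.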
For how long have I used the solution?
I have personally used it for about 2.5 years.
What do I think about the stability of the solution?
I have not recently encountered any stability issues; we have good health checks/monitoring around Vertica now.
What do I think about the scalability of the solution?
I have not encountered any scalability issues; I think it's scalable.
How are customer service and technical support?
N/A; don't have much experience on this.
Which solution did I use previously and why did I switch?
We do have some pipelines that access raw data directly and process it as a batch Spark job. Why? I guess it's because the type of operations we do can be done more easily in code than in SQL.
What other advice do I have?
I would recommend Vertica to people/teams that have large denormalized fact tables that need to be processed efficiently. I worked on optimizing query performance using projections, merge joins, and pipelined GROUP BY. It paid off in the end.
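The difference between a pipelined and a hash-based GROUP BY can be sketched in a few lines of generic Python. This illustrates the general technique rather than Vertica's engine: when rows arrive sorted on the grouping key, each group can be emitted as soon as its key changes, so only one group is held in memory at a time. The function names and sample rows are invented.

```python
from itertools import groupby
from operator import itemgetter


def pipelined_group_sum(sorted_rows):
    """Yield (key, total) for rows sorted by key, holding one group at a time."""
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


def hash_group_sum(rows):
    """Group rows in any order; needs a table entry for every distinct key."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return totals


# Invented sample rows, sorted by key as a matching projection would store them.
rows = [("a", 1), ("a", 2), ("b", 5), ("c", 3), ("c", 4)]
print(list(pipelined_group_sum(rows)))  # [('a', 3), ('b', 5), ('c', 7)]
print(hash_group_sum(rows))             # {'a': 3, 'b': 5, 'c': 7}
```

With many distinct keys, the hash table in the second version can grow large, while the pipelined version keeps constant memory as long as the sort order is already there.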
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Architect at a tech services company with 51-200 employees
Vertica allows for thousands of users to run an analysis at the same time. Great aggressive compression.
At the tech company where I work, we were looking for new ways to allow end users (a couple of thousand external users) to crunch through their detailed data in real time, as well as enabling internal users and data analysts to gain the information they needed to run and optimize their business processes.
Unfortunately our current system had become slower and slower over time due to the tremendous increase in data to be managed so a new approach had to be taken to accomplish this goal. Our existing data warehouse/data management infrastructure just could not handle big data.
We evaluated a variety of different solutions such as Amazon Redshift, Infobright and Microsoft. Vertica won out above all these other solutions. Our dataset is several hundred million rows and our avg. response time goal was less than 5 secs. We are building our environment for the future so another requirement was to be able to scale horizontally.
Redshift came close in response time but failed in concurrency, meaning multiple users running an analysis at the same time. Infobright came close in response time and concurrency but didn’t provide sufficient scalability. Vertica checked all boxes at a very competitive price-point.
We found that the extreme speed, performance and flexibility are superior to all the other solutions out there. The massive scalability on industry-standard hardware, standard SQL interface, and database designer and administration tools are excellent features of Vertica. I also really value the simplicity, concurrency for hundreds or thousands of users, and aggressive compression.
This new environment allowed us to implement applications such as clickstream and predictive analysis, which have added tremendous value for us. Currently I am managing about 500 GB – 1 TB of data, and I have found that Vertica integrates very well with a variety of Business Intelligence (BI), visualization, and ETL tools in our environment. I use Hadoop, Tableau and Birst, and using all these solutions with Vertica has been quite smooth overall.
Our query performance has increased by 500 – 1,000% through improvements in response time, and I am now able to compress our data by more than 50%. The simultaneous loading and querying and the aggressive compression have helped us become more efficient and productive. Furthermore, the high availability without hardware redundancy and the optimizer and execution engine have saved us both time and money.
Disclosure: PeerSpot has made contact with the reviewer to validate that the person is a real user. The information in the posting is based upon a vendor-supplied case study, but the reviewer has confirmed the content's accuracy.
It seems you were mainly focused on how good Vertica is and did not run a benchmark; otherwise, it would be nice if you could publish loading and query performance for all of the above databases. 500 GB - 1 TB is not a lot of data.