We're just now getting into Vertica, but it allows us to store and access big data very quickly. It comes down to being able to quickly identify root causes and trends, so you can try to predict problems before they really become problems.
Senior Product Manager (Data Infrastructure) and Security Researcher at a tech company
Great Platform
If you work with databases, you should know who Dr. Michael Stonebraker is: currently an adjunct professor in the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology, he is considered one of the world's experts in this field. Why do I begin this way? Because Dr. Stonebraker co-founded Vertica Systems, which speaks to the innovation behind this amazing product.
But what is Vertica?
Vertica Analytic Database is a high-performance MPP (Massively Parallel Processing) columnar engine optimized to deliver query results in the shortest possible time. I say "optimized" because this is a keyword for the Vertica team: every piece of code in Vertica has a lot of research and innovation behind it, which I will discuss later. I heard about this database when I was writing a research paper for my organization about MPP systems, and I found that Vertica was one of the strong players in the Big Data analytics game (the other strong players being the Greenplum Database and Teradata's Aster Data Platform). HP then saw the great opportunity that this product represented for the Big Data business and acquired the company in 2011.
OK, let's talk now about some of Vertica's features:
- Column-based storage: Vertica uses a patented architecture called FlexStore™, built on three principles: grouping multiple columns in a single file, automatically selecting the disk storage format based on data load patterns, and differentiating storage media by their performance characteristics so that data can be placed intelligently based on usage patterns.
- Advanced data compression: Compression follows the same principle as the Vertica team's choice to group columns in a single file. Vertica organizes values of similar data types contiguously in memory and on disk, which lets it select the best compression algorithm for each data type. This dramatically improves query execution and parallel load times.
- Built-in analytics functions: Vertica comes with a complete package of useful analytics functions, divided by topics such as Natural Language Processing, Data Mining, and Logistic Regression. These are called User-Defined Extensions. You can read more about them in this whitepaper.
- Automatic high availability: Vertica lets you scale your data almost without limits, with remarkable features such as automatic failover and redundancy, fast recovery, and fast query performance, executing queries 50x-1000x faster by eliminating costly disk I/O.
- Native integration with Hadoop, BI, and ETL tools: Seamless integration with a robust and ever-growing ecosystem of analytics solutions.
You can read about all of these features in depth here. The latest version of the platform is Vertica 6; you can find some of the new features and improvements in this version here, or you can watch this video, in which Luis Maldonado, Director of Product Management at Vertica, gives a quick overview of this version.
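The column-based storage and compression features described above can be illustrated with a minimal Python sketch. This is not Vertica's actual implementation: the toy table, the values, and the helper names `rle_encode` and `rle_decode` are assumptions made for illustration. It shows one reason why keeping a column's values together helps compression: a sorted, low-cardinality column collapses into a handful of (value, run length) pairs.

```python
from itertools import groupby

def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]

# Toy row-oriented data: (region, amount). Hypothetical example values.
rows = [("east", 10), ("west", 7), ("east", 3), ("west", 5), ("east", 8)]

# Column-oriented layout: the values of one column are stored together,
# here sorted so that equal values sit next to each other.
region_column = sorted(region for region, _ in rows)
encoded = rle_encode(region_column)

print(encoded)  # [('east', 3), ('west', 2)]
```

Run-length encoding is only one of several encodings a columnar engine might pick; the point of the sketch is that the choice can be made per column, because each column holds values of a single type.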
OK, it's a great platform, but who is using it today?
A lot of companies trust Vertica today: Twitter, Zynga (the number one company in the social gaming industry), Groupon, JPMorgan Chase, Mozilla, AT&T, Verizon, Diio, Capital IQ, Guess Inc., and many more. Read their testimonials here.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
With root cause analysis, we can quickly identify where trends are.
Valuable Features:
Improvements to My Organization:
The ability to access in-store big data and to create keywords for faster resolution, so that someone can look up an issue and see, "hey, we had this problem before." It shows you all the steps and everything, along with the different products. Vertica is pretty much the database behind it; it really handles the performance aspect of it.
Room for Improvement:
Really the only thing is that if you get a server big enough to handle Vertica, it does just fine. If you're working in a small business, it will tend to take up most of the budget, because you need so many servers and so much storage to be able to handle all that data.
Stability Issues:
It's very stable.
Initial Setup:
We had no issues deploying it.
Other Solutions Considered:
I did not really look at any competition. As I said, we're an HP shop, and a lot of their applications are going to a Vertica database for storage and processing of data. We were doing a lot with Oracle, but in our environment that work was actually moving towards Vertica.
Other Advice:
Make sure you understand how much data you're going to be bringing into your big data platform, so you can size the storage and the redundant storage appropriately.
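As a rough illustration of that sizing advice, here is a back-of-the-envelope sketch in Python. Every number and parameter name in it (the compression ratio, the number of redundant copies, the free-space headroom, the disk per node) is an assumption chosen for the example, not a Vertica recommendation; substitute figures measured on your own data.

```python
import math

def required_raw_storage_tb(data_tb, compression_ratio, copies, headroom=0.4):
    """Estimate the raw disk needed across a cluster, in TB.

    data_tb:           uncompressed data volume to be loaded
    compression_ratio: assumed ratio, e.g. 4.0 means 4:1 (hypothetical)
    copies:            total copies kept for redundancy (1 = none)
    headroom:          fraction of disk kept free for growth and temp space
    """
    if compression_ratio <= 0 or copies < 1 or not 0 <= headroom < 1:
        raise ValueError("invalid sizing parameters")
    stored_tb = data_tb / compression_ratio * copies
    return stored_tb / (1 - headroom)

def nodes_needed(raw_tb, disk_per_node_tb):
    """Number of nodes required for a given usable disk size per node."""
    return math.ceil(raw_tb / disk_per_node_tb)

# Example: 100 TB of source data, 4:1 compression, 2 copies, 50% headroom.
estimate = required_raw_storage_tb(100, compression_ratio=4.0, copies=2, headroom=0.5)
print(estimate)                    # 100.0
print(nodes_needed(estimate, 12))  # 9
```

The useful part is less the formula than the inputs it forces you to write down: data volume, expected compression, redundancy, and growth.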
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Buyer's Guide
Vertica
November 2024
Learn what your peers think about Vertica. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
824,067 professionals have used our research since 2012.
Sr. SW Engineer - Databases with 201-500 employees
Easy to implement; by tuning the model (projection design) you get great performance
Pros and Cons
- "Vertica enabled us to close large deals. Customers with large data sets had to be migrated from PostgreSQL to Vertica due to performance."
- "Performance of management of metadata layer (database catalog) needs improvement. We still have to have smaller customers on PostgreSQL; Vertica cannot manage thousands of schemata."
- "Suboptimal projection design causes queries to not scale linearly."
- "Metadata for database files scales okay, but metadata related to tables/columns/sequences must be stored on all nodes."
How has it helped my organization?
It enabled delivery of a new Agile Data Warehousing Service.
It enabled us to close large deals. Customers with large data sets had to be migrated from PostgreSQL to Vertica due to performance.
What is most valuable?
- Clustered database
- Horizontal scaling
- Disaster recovery
- Columnar storage (you read only the columns you need)
- Compression
- Immutable storage
- Fast ingesting
What needs improvement?
Performance of management of metadata layer (database catalog) needs improvement. We still have to have smaller customers on PostgreSQL; Vertica cannot manage thousands of schemata.
Query performance: Improve either Database Designer (automation of projection design) or performance of queries using suboptimal projection design.
Scaling of execution independently of storage: The upcoming Eon Mode (now in beta on Amazon) will hopefully solve this.
For how long have I used the solution?
One to three years.
What do I think about the stability of the solution?
We encountered stability issues three times during the last three years.
What do I think about the scalability of the solution?
Suboptimal projection design causes queries to not scale linearly.
The metadata layer does not scale linearly.
Metadata for database files scales okay, but metadata related to tables/columns/sequences must be stored on all nodes.
How are customer service and technical support?
I have experience with legacy vendors of enterprise RDBMS solutions, and I rate Vertica support to be much better.
Which solution did I use previously and why did I switch?
In my current company I was not responsible for the switch. As far as I know, they switched from PostgreSQL, mainly because of the performance of analytical queries processing large data sets.
How was the initial setup?
Just getting Vertica running is straightforward. However, with an increasing number of customers, we had to develop our own tooling. For example:
- Automated deployment
- Monitoring, alerting
- Backup/restore.
What's my experience with pricing, setup cost, and licensing?
Start with a per-TB license. From hundreds of TB upward, unlimited licensing is worth considering.
Move historical data to HDFS/S3, which are significantly cheaper or even free.
Vertica is delivering more and more features to support load/unload for external storage.
Which other solutions did I evaluate?
2012 - Detailed evaluation including benchmarks of: Greenplum, Vectorwise.
2017 - Evaluation of features and initial communication with vendors, if needed, for: Greenplum, EXASOL, Amazon Redshift, Spark, SAP HANA, IBM dashDB, Snowflake, Azure SQL.
What other advice do I have?
It is easy to implement this solution for one customer. By tuning the model (projection design) you get incredible performance. You won't face issues with the metadata (catalog) layer up to tens of thousands of tables.
It can be a challenge to operate clusters for many customers with varied data pipelines. Consider using Database Designer.
Don't hesitate to push Vertica (through support/product management) to improve it.
Consider implementing your own tools to automate performance tuning tasks.
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner.
Consultant at a tech services company with 10,001+ employees
All join operations were enhanced by creating identically segmented projections
Pros and Cons
- "I like the projection feature, which increases query performance."
- "Limitations in GROUP BY projections are where I would like to see an improvement."
What is most valuable?
- I found the columnar storage, which increases performance of sequential record access, to be the most valuable feature.
- I also like the projection feature, which increases query performance.
How has it helped my organization?
- The workload on our ETL tools was reduced.
- All join operations were enhanced by creating identically segmented projections.
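Why identical segmentation helps joins can be sketched in a few lines of Python. This is a conceptual model only, not how Vertica implements segmentation: the three-node cluster, the toy tables, and the helpers `segment` and `local_join` are all hypothetical. When both tables are distributed by the same function of the join key, matching rows always land on the same node, so each node can join its own data without moving rows.

```python
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size

def segment(rows, key_index, num_nodes=NUM_NODES):
    """Assign each row to a node from its join key (here: key modulo nodes)."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[row[key_index] % num_nodes].append(row)
    return nodes

def local_join(left_rows, right_rows):
    """Hash join of two row lists on their first field."""
    by_key = defaultdict(list)
    for row in right_rows:
        by_key[row[0]].append(row)
    return [l + r[1:] for l in left_rows for r in by_key.get(l[0], [])]

# Toy tables keyed by customer_id (illustrative data).
orders = [(1, "book"), (2, "pen"), (4, "lamp"), (1, "desk")]
customers = [(1, "Ana"), (2, "Bo"), (3, "Cy"), (4, "Di")]

# Both tables are segmented the same way on the join key.
orders_by_node = segment(orders, 0)
customers_by_node = segment(customers, 0)

# Each node joins only its own segments; no rows cross the network.
distributed = []
for node in range(NUM_NODES):
    distributed += local_join(orders_by_node[node], customers_by_node[node])

# The node-local results add up to the same answer as one global join.
matches = sorted(distributed) == sorted(local_join(orders, customers))
print(matches)  # True
```

If the two tables were segmented on different keys, the per-node joins would miss matches unless rows were first reshuffled between nodes, which is the cost that identical segmentation avoids.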
What needs improvement?
Limitations in GROUP BY projections are where I would like to see an improvement.
What was my experience with deployment of the solution?
We have not had any issues with deployment.
What do I think about the stability of the solution?
We have not had any issues with stability.
What do I think about the scalability of the solution?
We have been able to scale it for our needs.
What other advice do I have?
It is a good database that can be used for ad hoc queries as well as analytical queries.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Software and Data Architect at a computer software company with 1,001-5,000 employees
The concurrency got better in this version and we are able to run more queries and load concurrently.
Valuable Features
The compute and processing engine returns query results quickly and lets us make better use of our analysis resources.
The concurrency got better in this version and we are able to run more queries and load concurrently.
Improvements to My Organization
We built an internal dashboard using MicroStrategy to increase visibility for our management and our employees. We also built a tool to expose the data to selected partners and users to create better engagement with our platform.
Room for Improvement
- Loading times for “real time” sources - for example, loading from Spark creates a load on the DB at high scale
- Connectors to other sources such as Kafka or AWS Kinesis
- Better monitoring tools
- Better integration with cloud providers - we were missing some documentation regarding running Vertica on AWS
Use of Solution
We've been using Vertica for a year.
Stability Issues
When one hard drive in the cluster failed, the entire cluster got slower. We feel that it should be able to handle such issues.
Scalability Issues
No.
Customer Service and Technical Support
The support was slow and didn’t provide a solution in most cases. The community proved to be the better source for knowledge and problem solving.
Initial Setup
Pretty straightforward, the installation was simple and we added more nodes easily as we grew.
Pricing, Setup Cost and Licensing
Vertica is pretty expensive; take the server and network costs into account before committing.
Other Solutions Considered
We evaluated both AWS Redshift and Google BigQuery.
Redshift didn’t fulfill our expectations regarding query latency at high scale (over 60 TB). Regarding BigQuery, we found the pricing structure pretty complex (payment per query and data processed) and harder to control.
Other Advice
Don't plan high-scale production usage directly on Vertica; use caching or other buffers between the users and the DB. Familiarize yourself with the DB architecture before planning your model (specifically, make sure you understand ROS/WOS and projections). Try to avoid LAP (live aggregate projections) until your schema has stabilized.
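One simple form of "caching between the users and the DB" is a small time-to-live cache in front of the query function. The sketch below is generic Python, not tied to any Vertica client library; `TTLCache`, `run_query`, and the 30-second lifetime are hypothetical names and values chosen for illustration.

```python
import time

class TTLCache:
    """Tiny read-through cache: repeated queries are served without a DB call."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch    # function that actually runs the query
        self._ttl = ttl_seconds
        self._clock = clock    # injectable so expiry can be tested
        self._entries = {}     # query text -> (expires_at, result)

    def get(self, query):
        now = self._clock()
        entry = self._entries.get(query)
        if entry is not None and entry[0] > now:
            return entry[1]    # still fresh: skip the database
        result = self._fetch(query)
        self._entries[query] = (now + self._ttl, result)
        return result

# Stand-in for a real database call (hypothetical); records each execution.
calls = []

def run_query(sql):
    calls.append(sql)
    return f"result of {sql}"

cache = TTLCache(run_query, ttl_seconds=30.0)
cache.get("SELECT count(*) FROM events")
cache.get("SELECT count(*) FROM events")  # served from the cache
print(len(calls))  # 1
```

A short lifetime keeps dashboards reasonably current while absorbing bursts of identical queries; how stale a result may be is a decision for each workload.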
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Senior Data Architect at a media company with 1,001-5,000 employees
Having the ability to invoke analytic functions without having to write self-join SQL statements is beneficial.
Valuable Features:
Analytic functions.
Improvements to My Organization:
We are trying to data mine customer event data. Having the ability to invoke analytic functions without having to write self-join SQL statements ... just brilliant.
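The difference between a self join and an analytic function can be shown with a small, hedged example. It uses Python's built-in `sqlite3` module as a stand-in engine (assuming a bundled SQLite of version 3.25 or later, which added window functions), not Vertica itself; the `events` table and its values are invented for illustration. Both queries compute the gap between each customer event and that customer's previous event.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (customer_id INTEGER, event_time INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, 10), (1, 25), (1, 40), (2, 5), (2, 50)],
)

# Self-join version: pair each event with the latest earlier event.
self_join = conn.execute("""
    SELECT e.customer_id, e.event_time,
           e.event_time - MAX(p.event_time) AS gap
    FROM events e
    LEFT JOIN events p
      ON p.customer_id = e.customer_id AND p.event_time < e.event_time
    GROUP BY e.customer_id, e.event_time
    ORDER BY e.customer_id, e.event_time
""").fetchall()

# Analytic (window) function version: LAG looks back one row per customer.
windowed = conn.execute("""
    SELECT customer_id, event_time,
           event_time - LAG(event_time) OVER (
               PARTITION BY customer_id ORDER BY event_time
           ) AS gap
    FROM events
    ORDER BY customer_id, event_time
""").fetchall()

print(windowed)
# [(1, 10, None), (1, 25, 15), (1, 40, 15), (2, 5, None), (2, 50, 45)]
```

Both queries return the same rows here, but the windowed form states the intent directly and avoids joining the table to itself.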
Room for Improvement:
The ability to use analytic functions in WHERE clauses, and to use aliases in the WHERE and ORDER BY clauses, would make queries a lot easier to write and read.
Use of Solution:
2 years.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sr. Developer, Big Data at a comms service provider with 51-200 employees
The most valuable feature for me is the columnar data store.
Valuable Features:
Columnar data store
Room for Improvement:
Add geospatial indexes (sounds like they have done it in version 8.0)
Deployment Issues:
No
Stability Issues:
No
Scalability Issues:
No
Customer Service:
Above average
Initial Setup:
Setup was very simple
Disclosure: My company has a business relationship with this vendor other than being a customer: We are partners with HPE
Senior Solution Architect at Grupo Cimcorp
Powerful tool, excellent revision options, with easy deployment
Pros and Cons
- "I enjoy the cybersecurity and backup features."
- "The biggest problem is the cost of cloud deployment."
What is our primary use case?
The primary use case is machine learning, and I am currently working on IoT projects.
How has it helped my organization?
With Vertica, I am able to make changes using other Vertica features and I do not have to start the project over. The Vertica tool is very powerful but you cannot purchase the product based on individual features.
What is most valuable?
I am highly trained on Vertica and I am resistant to using other products because I do not have experience with those products. Some of the most valuable features are cybersecurity and backup.
What needs improvement?
The biggest problem is the cost of cloud deployment.
For how long have I used the solution?
I have worked with Vertica for the past five years.
How was the initial setup?
The initial setup is straightforward, and deployment is easy.
What other advice do I have?
I would rate Vertica an eight out of ten.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Other
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner