Valuable Features
- Speed
- Parallelization
- SQL language
- High Availability
Improvements to My Organization
I have seen queries that take over 24 hours on MS SQL Server finish in less than 10 minutes on Vertica. I have seen queries that take several minutes, up to an hour, on MS SQL Server finish in less than 10 seconds, sometimes in less than one second, on Vertica. That allows analysts to spend their time analyzing results instead of waiting for them. Certain types of analysis weren’t even possible before, simply because they took too long.
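To put those runtimes in perspective, here is a quick back-of-the-envelope calculation. It only uses the rough figures quoted above ("over 24 hours", "less than 10 minutes", and so on), so the results are bounds, not benchmark measurements. The `speedup` helper is just an illustrative name.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times faster the 'after' runtime is than the 'before' one."""
    if after_seconds <= 0:
        raise ValueError("after_seconds must be positive")
    return before_seconds / after_seconds


# Figures taken from the review text, treated as bounds rather than measurements.
batch = speedup(24 * 60 * 60, 10 * 60)  # over 24 hours -> under 10 minutes
interactive = speedup(60 * 60, 10)      # up to 1 hour -> under 10 seconds

print(f"long-running queries: at least {batch:.0f}x faster")
print(f"one-hour queries: at least {interactive:.0f}x faster")
```

Since the "before" numbers are lower bounds and the "after" numbers are upper bounds, the real speedups I saw were at least this large.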
Room for Improvement
While the documentation is very extensive and relatively complete, it’s poorly organized and there are way too few examples. It’s come a long way since the first version I saw, but it still has a long way to go. Plus, there is very little information on the internet. I can find a solution to nearly any MS SQL Server problem using Google. Not so for Vertica.
Use of Solution
I've been using it for five years. I started with version 4, which was prior to the HP acquisition.
Deployment Issues
It’s a breeze to set up if you’re using hardware and an OS that meet the minimum requirements. If you stray from the recommendations, you can find yourself in trouble.
Stability Issues
If your queries and projections are optimized properly, it’s rare that you’ll run into stability issues. Stability issues are usually caused by improperly configured hardware/OS, or poorly written queries/projections.
Scalability Issues
Scalability is great if you size it correctly to start with. Resizing a cluster isn’t for the faint of heart. All the data needs to be redistributed across the cluster when the cluster size changes, and that can take a very long time, depending on how much data you’re storing.
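To give a feel for why resizing is so expensive, here is a toy model of hash-based data placement. It is an assumption for illustration only: rows are assigned to nodes with a simple hash-modulo rule, which is not necessarily how Vertica segments data internally. The function names `node_for` and `fraction_moved` are made up for this sketch.

```python
import hashlib


def node_for(key: int, node_count: int) -> int:
    """Toy placement rule: stable hash of the key, modulo the number of nodes."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of rows whose assigned node changes when the cluster is resized."""
    keys = list(keys)
    moved = sum(1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes))
    return moved / len(keys)


# Growing a hypothetical cluster from 3 to 4 nodes.
frac = fraction_moved(range(100_000), 3, 4)
print(f"rows that change node: {frac:.0%}")
```

In this simplified model, adding a single node to a three-node cluster relocates roughly three quarters of the rows. Whatever the real placement scheme is, the time to move data grows with the amount you are storing, which is why sizing the cluster correctly up front matters.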
Customer Service and Technical Support
The technical support for Vertica specifically is great. They still have lots of the original (pre-HP acquisition) support people working there who know the product inside and out.
Initial Setup
It's pretty straightforward to get the cluster up and running - assuming you follow the vendor recommendations closely. Getting your data in, setting up projections, optimizing queries, etc. is not as straightforward. If you’ve never used it before, save yourself hours of frustration and hire a Vertica consultant.
Implementation Team
The first time I used Vertica, we tried doing it ourselves in the beginning. We learned a lot from our failures, but still weren’t getting the results we’d hoped for. After getting professional services help, we were pointed in the right direction, and that made a world of difference. I highly recommend bringing in someone who knows what they’re doing to get you started on the right foot.
Pricing, Setup Cost and Licensing
It’s expensive, but it’s good once you get it working properly. Like any complicated software product, you’re paying for years of research and development, support, etc. Everyone’s use case is different, and sometimes it’s difficult to put a price on speed. You pay for storage, not for the number of processors or nodes. They have a community edition that allows up to three nodes and up to 1 TB of storage. You can try it out for free that way, and once you see how well it works, you can purchase a commercial license as your storage footprint grows.
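As a rough sanity check before you start a trial, something like the following can tell you whether a planned setup stays inside the community edition limits. The limits are the ones I quote above (three nodes, 1 TB) and may have changed since, so check the current license terms. The constant and function names are hypothetical.

```python
# Limits as quoted in this review; assumptions, not current license terms.
COMMUNITY_MAX_NODES = 3
COMMUNITY_MAX_TB = 1.0


def fits_community_edition(nodes: int, storage_tb: float) -> bool:
    """Return True if a planned cluster stays within the quoted free-tier limits."""
    return 0 < nodes <= COMMUNITY_MAX_NODES and 0 <= storage_tb <= COMMUNITY_MAX_TB


print(fits_community_edition(3, 0.8))  # small pilot cluster
print(fits_community_edition(5, 0.8))  # too many nodes
print(fits_community_edition(3, 2.5))  # too much storage
```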
Other Solutions Considered
At a previous company, we looked at Greenplum as an alternative to Vertica. For our specific use case, Vertica won the majority of our benchmark tests. If we had had a design that required lots of updates and deletes, we might have compromised and gone with Greenplum.
Other Advice
How useful it is depends upon your use case. It’s not a be-all and end-all solution, but it’s great for data that doesn’t change. If you have massive fact and dimension tables, and you need to do analytics on them, this is the Cadillac. If you’re trying to replace your OLTP system, there are better-suited solutions out there.
These days, there are lots of alternative solutions in the big data space, open source and commercial, for every imaginable use case. Just like any project, there is a right tool for the job, but you don’t always know what tools are available. You end up using something because it worked before on a different job, or because it’s the cheapest solution. Your best bet is always to pin down your requirements carefully, then find the best match.
Disclosure: I am a real user, and this review is based on my own experience and opinions.