Greenplum is a distributed database that we used for data warehousing.
Senior Data Engineer at a financial services firm with 10,001+ employees
Powerful external data integration and parallel load capabilities, with good technical support
Pros and Cons
- "The parallel load features mean that Greenplum is capable of high-volume data loading in parallel to all of the cluster segments, which is really valuable."
- "The initial setup is somewhat complex and the out-of-the-box configuration requires optimization."
What is our primary use case?
What is most valuable?
The parallel load features mean that Greenplum is capable of high-volume data loading in parallel to all of the cluster segments, which is really valuable.
The service management capabilities are good.
The external data integration with Parquet, Avro, CSV, and unstructured JSON works well.
It has an advanced query optimizer.
What needs improvement?
The initial setup is somewhat complex and the out-of-the-box configuration requires optimization.
- OS settings need to be tuned according to the installation guide.
- gpinitsystem only sets up group or spread mirroring; block mirroring has to be configured manually (see the Best Practices Guide).
- Database maintenance scripts are not supplied, although some have been added in the cloud version; they need to be implemented based on the Admin Guide.
- It comes with two query optimizers. PQO is the default, but some queries perform better with the legacy planner, which then has to be set explicitly.
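The optimizer choice in the last point can be made per session rather than cluster-wide. The following is a minimal sketch, assuming the `optimizer` server configuration parameter is what toggles between the default optimizer (`on`) and the legacy planner (`off`). The helper name `with_planner` is hypothetical, and it only builds SQL text; it does not connect to a database.

```python
def with_planner(query: str, use_legacy: bool) -> str:
    """Wrap a query so that it runs under the chosen optimizer, then restore the default."""
    setting = "off" if use_legacy else "on"
    body = query.strip().rstrip(";")
    # SET changes the parameter for the current session; RESET restores the configured default.
    return f"SET optimizer = {setting};\n{body};\nRESET optimizer;"


# Hypothetical query that happens to run better on the legacy planner.
print(with_planner("SELECT count(*) FROM sales", use_legacy=True))
```

Comparing the `EXPLAIN` output of the same query under both settings is a reasonable way to decide which one a given query should use.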
For how long have I used the solution?
We have been working with Greenplum for about five years.
Buyer's Guide
VMware Tanzu Data Solutions
October 2024
Learn what your peers think about VMware Tanzu Data Solutions. Get advice and tips from experienced pros sharing their opinions. Updated: October 2024.
816,636 professionals have used our research since 2012.
What do I think about the stability of the solution?
Greenplum is pretty stable.
What do I think about the scalability of the solution?
This product is absolutely scalable. We have more than 400 users in our database.
How are customer service and support?
The technical support is exquisite.
This is a company that really listens to its customers. I am very happy with our relationship.
Which solution did I use previously and why did I switch?
Before I joined this company, I used different data warehousing solutions.
Making the transition to Greenplum requires a completely different mindset because it is massively parallel. It's more like a Big Data mindset, where you need to consider that you are distributing data between cluster nodes. It is not always straightforward to make the switch.
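To make the point about distributing data concrete, here is a small simulation of how the choice of distribution key affects the spread of rows across segments. It is only a sketch: the MD5-based hash is a stand-in and not Greenplum's actual hash function, and the segment count, key names, and row counts are hypothetical.

```python
import hashlib
from collections import Counter


def segment_for(key: str, num_segments: int) -> int:
    """Map a distribution-key value to a segment by hashing it (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_segments


def rows_per_segment(keys, num_segments):
    """Count how many rows land on each segment for the given key values."""
    counts = Counter(segment_for(k, num_segments) for k in keys)
    return [counts.get(s, 0) for s in range(num_segments)]


NUM_SEGMENTS = 8  # hypothetical cluster size

# High-cardinality key (a unique order id): rows spread over all segments.
by_order_id = rows_per_segment((f"order-{i}" for i in range(10_000)), NUM_SEGMENTS)

# Low-cardinality key (a country code with 3 values): at most 3 segments hold any rows.
by_country = rows_per_segment((("US", "DE", "HU")[i % 3] for i in range(10_000)), NUM_SEGMENTS)

print("by order id:", by_order_id)
print("by country: ", by_country)
```

With the low-cardinality key, most segments sit idle while a few do all of the work, which is the kind of skew that a single-node database mindset does not prepare you for.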
How was the initial setup?
The initial setup is kind of complex. You need an expert to set up a Greenplum cluster.
It may not be possible to simplify the initial setup. There is an out-of-the-box configuration and you can use it; I've actually seen companies using it for years. It worked, but not optimally, so they were not happy with the results.
You can set up Greenplum but you really need to read the manual and the installation guide. I've seen people skipping it and then complaining.
What about the implementation team?
A few people are enough to maintain this product. If you want to have around the clock support then you will need a couple of people in different time zones, but generally, maintenance is straightforward.
What other advice do I have?
We are currently in the process of upgrading from version 5.26 to 6.11 and I can already see a lot of improvements. I can't wait to try them. According to the roadmap, there are a lot of new improvements coming in the V7 version, which is due out next year.
My advice for anybody who is implementing Greenplum is that they really need an expert to assist them. They might hire consultants or grow experts in-house, although that takes time and it is not always straightforward. You can use Greenplum out of the box but to really leverage all of the capabilities, you definitely need to tune your system and also design your database objects.
When people think about a database they usually think about Oracle, Microsoft SQL, or maybe MySQL. Greenplum is a distributed database that needs a completely different mindset. I think that when people start to use it, they don't really understand this. For example, nobody would switch from Oracle to Hadoop without expecting that kind of change, but when they switch to Greenplum from Oracle, or just move data from Oracle to Greenplum, they don't take the change as seriously as they would for Hadoop.
Overall, I am very happy with this product.
I would rate this solution a nine out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Solutions Architect at a tech services company with 51-200 employees
Great queues and publishing capabilities with good reliability
Pros and Cons
- "The solution can scale."
- "The availability could be better."
What is our primary use case?
We use the solution for event-driven programming. We have multiple queues and channels to provide scenarios for publishing into containers. The microservices have to communicate with each other, and consumers consume the services.
How has it helped my organization?
We were using the solution to propagate tenant settings to our services. For example, if you have five microservices using the tenant settings, then after the settings are updated, we publish the updates to the other microservices. It helps distribute the updated data, because we can publish the settings into the update queue.
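The settings update flow described above can be sketched without a broker. This is an in-process simulation built on Python's standard library, not RabbitMQ client code; the `FanoutExchange` class, the service names, and the settings payload are all hypothetical.

```python
import queue


class FanoutExchange:
    """Toy stand-in for a fanout exchange: every bound queue gets a copy of each message."""

    def __init__(self):
        self._queues = {}

    def bind(self, service_name: str) -> queue.Queue:
        """Create a dedicated queue for one consuming service and return it."""
        q = queue.Queue()
        self._queues[service_name] = q
        return q

    def publish(self, message: dict) -> None:
        """Deliver a copy of the message to every bound queue."""
        for q in self._queues.values():
            q.put(dict(message))  # copy so consumers cannot affect each other


exchange = FanoutExchange()
services = ["billing", "reporting", "notifications", "search", "auth"]
inboxes = {name: exchange.bind(name) for name in services}

# One service changes the tenant settings and publishes the update once.
exchange.publish({"tenant": "acme", "timezone": "UTC"})

# Each of the five services picks up its own copy of the update.
local_settings = {name: inbox.get_nowait() for name, inbox in inboxes.items()}
print(local_settings["billing"])
```

The publisher does not need to know how many services consume the update, which is what makes adding a sixth consumer cheap.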
What is most valuable?
The queues and the publishing are quite useful. We're able to create hierarchies and control channels and flows to control what is going from which queue.
The solution can scale.
It is stable and reliable.
What needs improvement?
The availability could be better. When something crashes, a queue gets deleted, and my data is lost. They need to improve this so that we don't lose data during issues like crashes.
We'd like to understand how many queues are running on RabbitMQ. I'm not sure how to get these details and how to verify the information.
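One possible way to get these details is the management plugin's HTTP API, where `GET /api/queues` returns a JSON array with one object per queue (`rabbitmqctl list_queues` gives similar information on the command line). The sketch below assumes the management plugin is enabled and reachable; the base URL, credentials, and queue names are placeholders, and only the offline summary is actually run here.

```python
import base64
import json
import urllib.request


def summarize_queues(payload: str) -> dict:
    """Summarize the JSON array returned by GET /api/queues."""
    queues = json.loads(payload)
    return {
        "count": len(queues),
        "names": sorted(q["name"] for q in queues),
        "messages": sum(q.get("messages", 0) for q in queues),
    }


def fetch_queues(base_url: str, user: str, password: str) -> str:
    """Fetch the raw queue listing using HTTP basic authentication."""
    request = urllib.request.Request(f"{base_url}/api/queues")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


# Offline example with a hand-written payload (hypothetical queue names).
sample = (
    '[{"name": "tenant.settings", "vhost": "/", "messages": 3},'
    ' {"name": "audit", "vhost": "/", "messages": 0}]'
)
print(summarize_queues(sample))
```

Against a live broker, the call would look like `summarize_queues(fetch_queues("http://localhost:15672", user, password))`, assuming the management listener is on its default port.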
We need other protocols.
For how long have I used the solution?
I've been using the solution for three years or so.
What do I think about the stability of the solution?
The solution is stable and reliable. There are no bugs or glitches.
What do I think about the scalability of the solution?
The solution is scalable. However, we have issues with availability.
How are customer service and support?
Sometimes, it is hard to understand what is going on when you reach out to technical support.
What about the implementation team?
Our DevOps team deployed the solution.
What's my experience with pricing, setup cost, and licensing?
I'm not sure what the exact pricing is. I don't handle the licensing aspect.
What other advice do I have?
I am using the latest version of the solution. I'm not sure of the version number.
I've used this on multiple projects, and it has proven to be quite useful.
I'd rate the solution nine out of ten. It is a very good tool.
Which deployment model are you using for this solution?
Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Consultant at a government with 10,001+ employees
Outstanding performance and excellent value for money
Pros and Cons
- "Tanzu Greenplum's most valuable features include the integration of modern data science approaches across an MPP platform."
- "Tanzu Greenplum's compression for GPText could be made more efficient."
What is most valuable?
Tanzu Greenplum's most valuable features include: the integration of modern data science approaches across an MPP platform, including the ability to massively denormalize data and spread it across your MPP segments; the ability to index data and make it searchable, which significantly reduces the need for ETL; and its fast performance. Tanzu Greenplum is also very active in providing additional functionalities and software when needed, like GPText indexing of JSON events.
What needs improvement?
Tanzu Greenplum's compression for GPText could be made more efficient.
For how long have I used the solution?
I've been using Tanzu Greenplum for eight years.
What do I think about the stability of the solution?
Tanzu Greenplum's overall performance is outstanding.
What do I think about the scalability of the solution?
Tanzu Greenplum is highly scalable.
How are customer service and support?
VMware's technical support is probably the best I've seen.
How was the initial setup?
The initial setup was very easy.
What's my experience with pricing, setup cost, and licensing?
Tanzu Greenplum's pricing is really competitive and gives excellent value for money. I'd say that the benefits of orchestrated deployment and the features of GPText make the licensed version worth it.
What other advice do I have?
When considering implementing Tanzu Greenplum, I recommend viewing it as a massive lakehouse, not just a data warehouse. I would rate Tanzu Greenplum ten out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Architect Projects at T-Systems International GmbH
Offers very good performance, particularly useful for public institutions with a minimal budget
Pros and Cons
- "A very good, open-source platform."
- "Extra filters would be helpful."
What is our primary use case?
We are using this product as a database for our platform. We are customers of VMware and I'm the project architect.
What is most valuable?
In general, I think this is a very good platform, particularly as it's open source. It fits very well with use cases for public institutions or universities, where there is not always a big budget.
What needs improvement?
I'd like to see more support for structured data and features related to queries on NoSQL keys. Extra filters would be helpful.
For how long have I used the solution?
I've been using this solution for two years.
What do I think about the stability of the solution?
We've been working on this project and using this solution for two years and it's performed very well.
What do I think about the scalability of the solution?
The solution is scalable. We have around 20 users working directly on this platform and around 1,000 end users.
Which solution did I use previously and why did I switch?
We also use Cloudera and Oracle so we have a few alternatives. We currently use Greenplum because one of our customers needed an open-source solution that would scale well and after some investigation, we went with Greenplum.
How was the initial setup?
The initial setup was straightforward and we carried out the implementation ourselves.
What's my experience with pricing, setup cost, and licensing?
We're using the open-source version so there are no licensing fees.
What other advice do I have?
I recommend reading as much documentation as possible before starting to use the product. It helps to know how your data model should be implemented in order to get the best out of the platform and to figure out how it can improve performance.
This is one of the best solutions I've used and I rate it eight out of 10.
Which deployment model are you using for this solution?
Private Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Software Engineer with 1,001-5,000 employees
Allows for a fully asynchronous solution. Using Pivotal Cloud Foundry, we can scale the number of consumers or receivers.
What is most valuable?
Allowing for a fully asynchronous solution is crucial for this particular feature. The seamless nature of creating and connecting to a queue makes it really easy to code and understand. Pivotal Cloud Foundry allows us to easily scale the number of consumers (or receivers) as well. So far, no hiccups have been found with the PCF implementation.
How has it helped my organization?
RabbitMQ allows for asynchronous solutions where previously everything was synchronous.
What needs improvement?
The product works pretty well, but one small thing could be an improvement to the monitoring site. It could be a little bit more modern, instead of postback refreshing, etc.
For how long have I used the solution?
We have been using Rabbit for a while and I started integrating it into the mobile project a few months ago.
What do I think about the stability of the solution?
Every so often, I need to clear out the queue during development. This could be a symptom of something else, but unacked requests tend to get trapped in the queue at times.
What do I think about the scalability of the solution?
PCF allows us to scale the consumers.
How are customer service and technical support?
I haven't used any technical support yet.
Which solution did I use previously and why did I switch?
To my knowledge, this is the only queuing system my company has used.
How was the initial setup?
Thanks to Pivotal Cloud Foundry, initial setup was straightforward. We simply created a new RabbitMQ service, obtained credentials for the queue and started developing.
Which other solutions did I evaluate?
I personally have not explored other queuing solutions, but I have used Akka HTTP, which is a fully asynchronous web server of sorts. It's not a queuing system, but I mention it because of the asynchronous behavior. RabbitMQ was perfect for our current solution, however.
What other advice do I have?
The RabbitMQ documentation is pretty good. I'd only suggest making sure to read through it for the implementation language of your choice first.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Data Engineer at Broadridge Financial Solutions
A good warehouser, compressor, and an in-house ETL.
What is most valuable?
I've found the data warehouse, data compression, and ETL to be the most valuable features for us.
How has it helped my organization?
Loading batch data has really improved the efficiency of our organization.
What needs improvement?
I'd like to see better scaling, better performance from in-memory databases, and a higher compression rate. We have been facing some performance issues when doing batch loading with the optimizer, although the scaling works fine. They are working on optimization techniques, which is why I list this as room for improvement.
For how long have I used the solution?
I've used it for over two years. I have been working very closely with the EMC folks.
What was my experience with deployment of the solution?
Yes, at times, but it depends on your modeling and data retrieval.
What do I think about the stability of the solution?
It's been stable for us.
What do I think about the scalability of the solution?
Its scalability needs to be improved.
How are customer service and technical support?
I would rate technical support as good, although there is not much technical expertise at the start of a service request (SR).
Which solution did I use previously and why did I switch?
We tried other MPPs.
How was the initial setup?
It was complex, but there was a change in the setup.
What about the implementation team?
We got support from the vendor at the start.
What other advice do I have?
If you want to implement this product, you would need to scale your product well before trying to implement.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Development Lead - Java/Hybris with 10,001+ employees
Some of the valuable features are queues, topics, and native cloud app support.
Pros and Cons
- "Simple and straightforward admin portals: Made it easy for users and worked out excellently for our requirements"
- "The solution needs improvement on performance."
How has it helped my organization?
My company runs on high availability. It is known for high accuracy in its items that are being shipped.
To do this, drivers/vendors who are shipping these items have to send their location details frequently to the server to update their current location. It all depends on accuracy.
Based on this, the end user can plan to receive shipping items on his end. We wanted a JMS tool that can create 'queues' on the fly and pass messages from one system to another.
What is most valuable?
- Queues and topics
- Native cloud app support
- Light-weight
- Easy maintenance
- Simple and straightforward admin portals: Made it easy for users and worked out excellently for our requirements
What needs improvement?
The solution needs improvement on performance.
What do I think about the stability of the solution?
It was stable enough to process our requests.
What do I think about the scalability of the solution?
We were able to operate multiple nodes and we implemented a load balancer to meet our high traffic requirements.
How are customer service and technical support?
We have never needed technical support. It was all there in the API documents provided by RabbitMQ, and there are numerous blogs available on the internet.
Which solution did I use previously and why did I switch?
We analyzed our requirement thoroughly and were sure that RabbitMQ was the solution for us. We didn’t look at anything else.
How was the initial setup?
It was bundled with PCF, so it was never a problem for us.
What's my experience with pricing, setup cost, and licensing?
Again, it was part of the PCF bundle, so that was never a worry for us.
What other advice do I have?
This is a great product. It is lightweight, supports cloud native applications, is easy to implement, is easily manageable, and has excellent support. I would say, just go for it!
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Head of Data & Infrastructure at a tech services company with 51-200 employees
I value the routing control and priority messaging capabilities. I would like to see better scaling and scalability capabilities
Pros and Cons
- "Very sophisticated routing control and priority messaging capabilities"
- "The fact that a single queue can't be distributed across multiple instances/nodes is a major disadvantage."
How has it helped my organization?
We're using this as our central messaging bus. It drives our micro-service architecture.
What is most valuable?
- Great management UI: The best in its class of messaging products
- Very sophisticated routing control and priority messaging capabilities
What needs improvement?
- The product should have much better scaling and scalability capabilities. Currently, they're really falling behind some of the competitors such as Kafka and NSQ.
- The installation of the HA version and clustering mechanism should be made much easier.
- The fact that a single queue can't be distributed across multiple instances/nodes is a major disadvantage.
What do I think about the stability of the solution?
We had multiple issues with stability. The product tends to be highly unstable when under heavy loads.
What do I think about the scalability of the solution?
We had multiple issues with scalability. The product's scalability is rather problematic. It tends to be very complex to maintain with various sharding and high availability options.
How are customer service and technical support?
I have never used technical support.
Which solution did I use previously and why did I switch?
We tested some earlier version of Apache Kafka, but it wasn't stable enough at the time. At the moment, we're considering switching back to Apache Kafka.
How was the initial setup?
The non-sharded/clustered setup is very easy and straightforward. The clustered solution setup is much more complicated.
What's my experience with pricing, setup cost, and licensing?
We have only used the open source version.
Which other solutions did I evaluate?
We evaluated Apache Kafka, NSQ, and ActiveMQ.
What other advice do I have?
Check the scaling issues. If scale is not an issue and you're just looking for a stable messaging queue, I would highly suggest it.
If scale is an issue, I would suggest using Apache Kafka.
Disclosure: I am a real user, and this review is based on my own experience and opinions.