What is our primary use case?
We started using Pentaho for two purposes:
- As an ETL tool to bring data in.
- As an analytics tool.
As our solution progressed, we dropped the ETL piece of Pentaho. We didn't end up using it. What remains in our product today is the analytics tool.
We run a lot of simulations on our data with Pentaho reports. We use Pentaho's reporting capabilities to tell us how contracts need to be negotiated for optimal results.
How has it helped my organization?
This was an OEM solution for our product. It has improved our product by giving our users the ability to run ad hoc reports, which is very important to them. Our product does predictive analysis on trends coming in for contracts and helps users decide which way to go based on that analysis. Pentaho is not doing the predictions; it reports on the predictions that our product makes. This is a big part of our product.
What is most valuable?
There is an end-to-end flow, where a user can say, "I am looking at this field and want to slice and dice my data based on these parameters." That flexibility is provided by Pentaho. Getting it with minimal manual coding is important to us.
What needs improvement?
The performance could be improved. If they could have analytics perform well on large volumes, that would be a big deal for our products.
For how long have I used the solution?
I have been using it for eight years.
What do I think about the stability of the solution?
We are on-prem. Since the product was installed and running, I haven't had issues with it going down or not being responsive.
We have one technical lead who is responsible for making sure that we keep upgrading the solution so we are never on an unsupported version. In general, it is low maintenance.
What do I think about the scalability of the solution?
The only complaint that I have with Pentaho has been with scaling. As our data grew, we tested it with millions of records. When we started to implement it, we had clients that went from 80 million to 100 million records. I think scale did present a problem for those clients. I know that Pentaho talks about being able to manage big data, which is much more data than what we have. I don't know whether it was our architecture or the product's limitations, but we did have issues with scaling.
Our product doesn't deal with big data at large. There are probably 17 million records. With those 17 million records, it performs well once the data has been internally cached within Pentaho. However, if you are loading the dataset or querying it for the first time, it does take a while. Once it has been cached in Pentaho, the subsequent queries are reasonably fast.
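To illustrate the cold-versus-cached behavior described above, here is a minimal sketch in Python. It is not Pentaho's caching mechanism. The function `run_report`, its parameters, and the counter `backend_calls` are hypothetical names made up for this example. It only shows the general pattern: the first request pays the full cost, and an identical later request is answered from memory.

```python
from functools import lru_cache

# Hypothetical counter: records how often the expensive path actually runs.
backend_calls = 0


@lru_cache(maxsize=128)
def run_report(region: str, year: int) -> int:
    """Pretend to aggregate contract records for one region and year."""
    global backend_calls
    backend_calls += 1  # only incremented on a cache miss
    # Placeholder arithmetic standing in for a scan over millions of rows.
    return sum(range(1000)) + year + len(region)


first = run_report("EMEA", 2021)   # cold: runs the expensive path
second = run_report("EMEA", 2021)  # warm: answered from the cache
```

In this sketch the second call never reaches the expensive path, which mirrors why subsequent queries feel reasonably fast once the data is cached.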
How are customer service and support?
We haven't had a lot of functional issues. We had performance issues, especially early on, as we were trying to spin up this product. The response time from the support group has been a three on a scale of one to five.
We had trouble with the performance and had their engineers come in. We shared our troubles and problems, then those engineers had brainstorming sessions. Their ability to solve problems was really good and I would rate that as four out of five.
A lot of the problems were with the performance and scale of data that we had. It could have been that we didn't have a lot of upfront clean architecture. With the brainstorming sessions, we tried giving two sets of reports to users:
- One was more summary level, which was quick, and that is what 80% of our clients use.
- For 20% of our clients, we provided detailed reports that do take a while. However, those reports no longer impact performance for the other 80% of clients.
This was a good solution or compromise that we reached from both a business and technology perspective.
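The two-tier approach above can be sketched roughly as follows. This is an assumption-laden illustration using Python's built-in `sqlite3`, not our actual schema or anything Pentaho-specific. The table names `contract_lines` and `contract_summary`, the sample rows, and both functions are hypothetical.

```python
import sqlite3

# In-memory database with hypothetical line-level contract data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contract_lines (client TEXT, contract_id INTEGER, amount REAL)"
)
rows = [
    ("acme", 1, 100.0),
    ("acme", 1, 50.0),
    ("acme", 2, 25.0),
    ("globex", 3, 10.0),
]
conn.executemany("INSERT INTO contract_lines VALUES (?, ?, ?)", rows)

# Summary tier: one pre-aggregated row per client, built ahead of time.
conn.execute(
    "CREATE TABLE contract_summary AS "
    "SELECT client, COUNT(DISTINCT contract_id) AS contracts, "
    "SUM(amount) AS total "
    "FROM contract_lines GROUP BY client"
)


def summary_report(client):
    """Fast path: reads a single pre-aggregated row."""
    return conn.execute(
        "SELECT contracts, total FROM contract_summary WHERE client = ?",
        (client,),
    ).fetchone()


def detail_report(client):
    """Slower path: reads every line-level row for one client."""
    return conn.execute(
        "SELECT contract_id, amount FROM contract_lines "
        "WHERE client = ? ORDER BY contract_id, amount",
        (client,),
    ).fetchall()
```

The point of the split is that most users only ever touch the small summary table, while the expensive line-level scan is reserved for the minority who request detail.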
Now, I feel like the product is doing well. It is almost like their team helped us with rearchitecting and building product expectations.
Which solution did I use previously and why did I switch?
Previously, we used QlikView, which is almost obsolete now. We had a lot of trouble with it. Anytime processing was done, it would take a long time for the processed results to be loaded into QlikView's memory. This meant that once an operation finished, it would take a couple of hours before users could see results or reports. We didn't want that lag.
Pentaho offered an option not to have that lag. It did not have its own in-memory database, where everything had to be loaded. That was one of the big reasons why we wanted to switch away from QlikView, and Pentaho fit that need.
How was the initial setup?
I would say the deployment/implementation process was straightforward enough for both data ingestion and analytics.
When we started with the data ingestion, we went with something called Spoon. Then we realized, while it was a Pentaho product, Spoon was open source. We had integrated with the open source version of it, but later found that it didn't work for commercialization.
Integrating Pentaho and getting it working took a couple of months because we needed to figure out authentication with Pentaho. That time covers learning the product, the actual implementation within our environment, and figuring out how to do what we wanted to do.
Because this is a licensed product, the deployment for the client was a small part of the product's deployment. So, on an individual client basis, the deployment is easy and a small piece.
It gives us the flexibility to deploy it in any environment, which is important to us.
If we went to the cloud version of Pentaho, that would be a big maintenance relief. We wouldn't have to worry about getting the latest version, installing it, and sending it out to our clients.
What about the implementation team?
For the deployment, we had people come in from Pentaho for a week or two. They were there with us through the process.
Which other solutions did I evaluate?
We looked at Tableau, Pentaho, and an IBM solution. In the absence of Pentaho, we would have gone with either Tableau or building our own custom solution. When we were figuring out what third-party tool to use, we did an analysis that compared a number of other tools. Ultimately, we went with Pentaho because it had a wide variety of features and functionalities within its reports. Though I wasn't involved, there was a cost analysis done and Pentaho compared favorably in terms of cost.
For the product that we use Pentaho for, I think we're happy with that decision. There are a few other products in our product suite. Those products ended up using Tableau. I know that there have been discussions about considering Tableau over Pentaho in the future.
What other advice do I have?
Engage Pentaho's architects early on, so you know what data architecture works best with the product. We built our database and structures, then had performance issues. However, it was too late when we brought in the Pentaho architects, because our data structure was out in the field with multiple clients. Therefore, I think engaging them early on in the data architecture process would be wise.
I am not very familiar with Hitachi's roadmap and what is coming up for them. I know that they are good with sending out newsletters and keeping their customers in the know, but unfortunately, I am unaware of their roadmap.
I feel like this product is doing well. There haven't been complaints and things are moving along. I would rate it as seven out of 10.
Which deployment model are you using for this solution?
On-premises
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.