It allows for very quick development due to the intuitive interface. Compared to other ETL tools like PowerCenter, SSIS, and SAS DI Studio, it excels in rapid development cycles.
Sr BI Administrator at a healthcare company with 1,001-5,000 employees
It gave ‘out-of-the-box’ widgets for reading XML and JSON interfaces which would otherwise have to be built from scratch.
What is most valuable?
How has it helped my organization?
It gave ‘out-of-the-box’ widgets for reading XML and JSON interfaces which would otherwise have to be built from scratch.
What needs improvement?
PDI excels at the development part. Administration and monitoring are pretty weak and basic. But, I must say, I have been spoiled by the great capabilities that PowerCenter offers ‘out-of-the-box’. The Pentaho development team seems to rely very heavily on Linux/Unix for the admin part. Debugging could be enhanced with better feedback.
For how long have I used the solution?
We used PDI 4.3 in a pilot against SSIS during 2013 for a couple of months. In 2014 I used version 4.4 on a daily basis within a production environment for exactly one year. We also looked into the commercial front-end solution and found it to be too much of a collection of loosely connected applications.
Buyer's Guide
Pentaho Data Integration and Analytics
November 2024
Learn what your peers think about Pentaho Data Integration and Analytics. Get advice and tips from experienced pros sharing their opinions. Updated: November 2024.
816,406 professionals have used our research since 2012.
What was my experience with deployment of the solution?
There have been no deployment issues.
What do I think about the stability of the solution?
Stability is a bit of an issue. The GUI quite often ‘freezes’ and there is no alternative to killing the session. Very frequent saving is in order.
What do I think about the scalability of the solution?
There have been no issues with scalability.
How are customer service and support?
The community site is pretty brilliant. Every technical component is handled on its own Wiki page. You can even look into the scrum backlog of the dev. team. Absolutely amazing.
Which solution did I use previously and why did I switch?
Heavy ETL solutions were simply too expensive and the SSIS alternative is simply too hideous to consider. It took at least three times as much time to develop the same ETL process with SSIS as compared to Pentaho. (And that meant having to deal with the abject Microsoft ‘debugging’.)
How was the initial setup?
Incredibly easy. Just unpack, make sure you have the right drivers installed, and beware of other Java applications running.
What about the implementation team?
We simply did everything ourselves, with a little aid from the community.
What other advice do I have?
Make sure Pentaho solutions are still available as they were prior to the commercial take-over. Administration is not the best developed component. The ETL is brilliant. Make sure that the admin part is covered.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Consultant at a comms service provider with 11-50 employees
Simple to install and simple to use and helps us mine, clean, and arrange terabytes of data
Pros and Cons
- "It's very simple compared to other products out there."
- "One thing that I don't like, just a little, is the backward compatibility."
What is most valuable?
It's very simple compared to other products out there.
How has it helped my organization?
We use Pentaho for data integration, but also PI to implement data mining. That has improved the intelligence behind the data, so we are able to give our customer the ability to understand their data. Our customer produces terabytes of data, so arranging and cleaning the data through data integration helped our customer understand the data and improve their business.
What needs improvement?
One thing that I don't like, just a little, is the backward compatibility. I used Pentaho from version 4, and version 6 does not work with the whole ETL design. So backward compatibility is a problem.
For how long have I used the solution?
I have worked with this product for seven years.
What do I think about the stability of the solution?
It's a stable product. In fact, it contains some mocks, where you can write your own Java software and do an ETL specific to your needs.
How is customer service and technical support?
The support is very fast, but there are also a lot of forums to address problems, so you can find the solution to your issue easily. There is also the possibility to buy support, and when we bought support they resolved our problem in 24 hours.
How was the initial setup?
It was very, very simple. I copied the integration folder, started the tool to design the ETL, and it worked. Time was required to design the ETL, just to understand how each block works. Once you understand how each block works, you need to spend no more time learning to use the product.
Which other solutions did I evaluate?
Before using Pentaho, I analyzed other products to understand which is the best ETL product. I tested Talend and Oracle Data Integrator. With Oracle Data Integrator, it is a little more difficult to understand how it works.
So, I preferred Pentaho Data Integration because you just have to drag and drop the block, draw a line to connect the block, write the query, and connect to the DB. There's nothing else you need to do. For Oracle Data Integrator, and also for Talend, you spend more time installing the product. By contrast, with Pentaho, you just have to copy the folder, launch the product, and then you just need the Java machine and it works.
What other advice do I have?
When you start to use this product, if you have just a little experience and know about ETL, you will have to spend little time to learn it. The product is very, very simple to understand. You can build functionality by yourself.
Anyone thinking about an ETL product, if they want high productivity on data cleaning and data movement, Pentaho Data Integration, in my opinion, is the best tool.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Project Manager - Business Intelligence at www.datademy.es
It has improved our data integration capabilities
Pros and Cons
- "It has improved our data integration capabilities."
- "Provides a good open source option."
- "There is not a data quality or MDM solution in the Pentaho DI suite."
- "I could not connect to our Hadoop environment in an easy and flexible way, and it was important to scale our data warehouse."
- "I work with the Community Edition, therefore I do not have support. There was an issue that I could not resolve with community support."
How has it helped my organization?
We developed ETL processes to load a data warehouse. It has improved our data integration capabilities.
What is most valuable?
- Easy to use
- Development of the product
- A lot of predefined steps
- Good open source option
What needs improvement?
There is not a data quality or MDM solution in the Pentaho DI suite.
For how long have I used the solution?
Three to five years.
What do I think about the stability of the solution?
No issues.
What do I think about the scalability of the solution?
I could not connect to our Hadoop environment in an easy and flexible way, and it was important to scale our data warehouse.
How are customer service and technical support?
I work with the Community Edition, therefore I do not have support. There was an issue that I could not resolve with community support.
Which solution did I use previously and why did I switch?
I switched from our previous solution for cost reasons.
How was the initial setup?
It was not complex.
What's my experience with pricing, setup cost, and licensing?
There is a good open source option (Community Edition).
Which other solutions did I evaluate?
No.
What other advice do I have?
There is a lack of support if you work with the Community Edition.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Datawarehouse Administrator at a tech services company with 501-1,000 employees
We have been able to expose data services through the use of CDA relying on the same database as the reporting tools.
What is most valuable?
Its ability to blend data, and dashboarding with C*TOOLS for creating responsive single-page apps.
How has it helped my organization?
We have been able to expose data services through the use of CDA relying on the same database as the reporting tools, thus avoiding inconsistencies among the data shown by reports and data acquired by external systems.
What needs improvement?
The User Console, aka workspace, and the development of dashboards. They work, but they require some programmer skills. This means continuous application management on behalf of the IT dept.
For how long have I used the solution?
I've used it for six years.
What was my experience with deployment of the solution?
There were issues, but they were solved with help from tech support.
What do I think about the stability of the solution?
There were issues, but they were solved with help from tech support.
What do I think about the scalability of the solution?
There were issues, but they were solved with help from tech support.
How are customer service and technical support?
It depends, as it usually takes a long time, and some answers are just a way to buy time, so the commitment seems poor. However, when you finally get to an engineer, you are likely to have your problem solved in a few days.
Which solution did I use previously and why did I switch?
We used MicroStrategy, Cognos, and Business Objects. The pricing was the key driver, but so was the open source licensing, which made us think we would be able to develop our own improvements. This didn't happen, primarily because of the few resources we actually put into development.
How was the initial setup?
It's complex because of the lack of documentation and the absence of an installer for Linux.
What about the implementation team?
We did it in-house, and we had to hire some developers with Java skills for some months.
What other advice do I have?
Have a vision, and do not let yourself be guided by the technology.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Project Lead at a tech services company with 10,001+ employees
The best benefit of the product is that it is easy to use and to understand.
Valuable Features:
The best benefit of the product is that it is easy to use and to understand.
Improvements to My Organization:
We have a huge amount of data that needs to be cleaned and made more valuable for our organization. This Data Integration helps us to achieve that goal.
Room for Improvement:
I have used multiple versions of this product. The initial version we were on was v3.2 and we had multiple issues, but we currently don't see any issues that are blockers. In general, it would be good if we could get better performance from this product.
Deployment Issues:
We haven't had any issues with deployment.
Stability Issues:
We haven't had any issues with stability except for those described in the Room for Improvement section.
Scalability Issues:
We haven't had any issues with scalability.
Other Advice:
There are other products out there, but I feel that this is the best one.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
DWH Specialist at a healthcare company with 1,001-5,000 employees
It is extremely flexible; it allows you to use variables/parameters for just about everything.
Valuable Features:
It is extremely flexible; it allows you to use variables/parameters for just about everything.
Improvements to My Organization:
It enables us to automate our reporting and ETL to a very high extent.
Room for Improvement:
The product itself is great; the biggest downside, in my opinion, is that it is hard to find (hire) people with expertise. Our experience with Pentaho software is that few people have the required expertise. Hiring additional resources for projects can be tough.
Our solution is that we tend to train our own people. It’s definitely not hard to learn: basically anyone with SQL knowledge and experience in another tool can learn to use Pentaho Data Integration very easily, but you might end up training them yourselves.
Deployment Issues:
We had no issues with the deployment.
Stability Issues:
There were no issues with the stability.
Scalability Issues:
We had no issues scaling it for our needs.
Other Advice:
Train your own people!
Disclosure: I am a real user, and this review is based on my own experience and opinions.
BI developer - (Jaspersoft/Pentaho/Pentaho C-Tools/Kettle/Talend/Data warehouse) at a tech services company with 501-1,000 employees
You can get ETL, reporting, analysis, and analytics in a single shop.
Valuable Features:
- Best in performance in both hosted and local environments
- Best open source warehouse solution using the Kimball method
- Best Big Data discovery components and BI
- Simple and easy to understand and work with
- Complete, cost-effective solutions
- Best support in forums
- Best visualizations in the market - Protovis & D3
- Best custom interactivity features
- Best product for embedded BI
- Best for integrated mobile-responsive technology, e.g. Bootstrap
- Best documentation - Open APIs
Improvements to My Organization:
- It's reduced our costs
- With self-service we can save time
- Open plug-ins contributors
Room for Improvement:
- Searching repository for reports or dashboards
- Repository UI
- Loading of percentage reports and dashboards
Other Advice:
It has a fancy look, the best visualization libraries, and is open source. You can get ETL, reporting, analysis, and analytics in a single shop. Small and mid-sized companies, as well as enterprises such as CA, have been implementing Pentaho.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Research Assistant at a university with 1,001-5,000 employees
The user-defined class operator is currently very valuable to me.
Valuable Features:
I would say that the user-defined class operator is currently very valuable to me. Other than that, native connectivity to Hadoop (MapR), analytical databases, and enterprise systems is really important to me these days.
Improvements to My Organization:
I am a researcher in the field of data integration, and I am using this tool as a sandbox. I would say that its being open source, together with the wide availability of forums and support, has made my work really easy. Also, the reporting and analysis functionality provided gives me more freedom to test my test cases and results.
Room for Improvement:
I would like to have more languages/scripts supported in user-defined classes. Right now the options are very limited. I know, if I want to do core programming I can always import my classes/jars into it, but it would be really nice to have more functionality in terms of programming language and support in UD classes/operator. Besides that, different parallel algorithms/skeletons would be great. For example, it could suggest which parallel algorithm I should use on a particular operator or a set of operators. It would be really cool to have such a functionality.
Other Advice:
If you are looking to integrate unstructured or semi-structured datasets with some parallelization, choose this tool. The parallelization supported by Pentaho Data Integration is a functionality that is really nice to have. You can choose which activities you want to parallelize and that's it. You do not have to write parallel code or anything like that, as it does this job for you, which is awesome for a not-so-good programmer such as myself.
Disclosure: I am a real user, and this review is based on my own experience and opinions.