What is our primary use case?
My work primarily revolves around data migration and data integration for different products. I have used this product at different companies, but for most of our use cases, we use it to integrate all the data that needs to flow into our product. We also handle outbound data from our product when we need to send it to various integration points. We use this product extensively to build ETLs for those use cases.
We are developing ETLs for the inbound data into the product as well as outbound to various integration points. Also, we have a number of core ETLs written on this platform to enhance our product.
We have two different modes that we offer: one is on-premises and the other is on the cloud. On the cloud, we have an EC2 instance on AWS on which we have installed it, and we use that as the ETL server. We also have another server for the application, where the product is installed.
We use version 8.3 in the production environment, but in the dev environment, we use version 9 and later.
How has it helped my organization?
We have been able to reduce the effort required to build sophisticated ETLs. Also, we are now in the migration phase from an on-prem product to a cloud-native application.
We use Lumada’s ability to develop and deploy data pipeline templates once and reuse them. This is very important. When the entire pipeline is automated, we do not have any issues with respect to deployment of code, or with code working in one environment but not in another. We have saved a lot of time and effort from that perspective because it is easy to build ETL pipelines.
What is most valuable?
The metadata injection feature is the most valuable because we have used it extensively to build frameworks, where we use it to dynamically generate code based on different configurations. If you want to make a change, you do not need to touch the actual code at all. You just need to make some configuration changes, and the framework will dynamically generate the code as per your configuration.
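To make the idea concrete, here is a minimal sketch, in plain Python rather than Pentaho's actual metadata injection steps, of what a configuration-driven framework looks like: the transformation logic is generic, and each new feed only needs a configuration entry. All names here (FEED_CONFIGS, run_feed, the feed names) are hypothetical and purely illustrative.

```python
# Hypothetical sketch of a configuration-driven transformation, illustrating
# the metadata-injection idea: the mapping logic is generic, and onboarding a
# new feed is a config change, not a code change.
import csv
from typing import Dict, List

# Per-feed configuration: source column -> target column (assumed layout).
FEED_CONFIGS: Dict[str, Dict[str, str]] = {
    "crm_feed": {"cust_id": "customer_id", "cust_nm": "customer_name"},
    "erp_feed": {"id": "customer_id", "name": "customer_name"},
}

def run_feed(feed_name: str, source_path: str) -> List[Dict[str, str]]:
    """Apply the configured column mapping; no feed-specific code needed."""
    mapping = FEED_CONFIGS[feed_name]
    with open(source_path, newline="") as f:
        return [
            {target: row[source] for source, target in mapping.items()}
            for row in csv.DictReader(f)
        ]

# Adding a new source later is only a configuration change:
# FEED_CONFIGS["billing_feed"] = {"acct": "customer_id", "acct_nm": "customer_name"}
```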
We have a UI where we can create our ETL pipelines as needed, which is a key advantage for us. This is very important because it reduces the time to develop for a given project. When you need to build the whole thing using code, you need to do multiple rounds of testing. Therefore, it helps us to save some effort on the QA side.
Hitachi Vantara's roadmap has a pretty good list of features that they have been releasing with every new version. For instance, in version 9, they have included metadata injection for some of the steps. The most important elements of this roadmap to our organization’s strategy are the data-driven approach that this product is taking and the fact that we have a very low-code platform. Combining these two is what gives us the flexibility to utilize this software to enhance our product.
What needs improvement?
It could be better integrated with programming languages, like Python and R. Right now, if I want to run Python code in one of my ETLs, it is a bit difficult to do. It would be great if there were some modules where we could code directly in Python. We don't really have a way to run Python code natively.
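One common workaround, shown here only as a hedged sketch rather than a Pentaho-specific feature, is to keep the Python logic in a standalone script that an ETL shell or script step can invoke with file paths as arguments. The file names and cleanup logic below are placeholders.

```python
#!/usr/bin/env python3
# Hypothetical standalone script that an ETL shell step could invoke, e.g.:
#   python clean_orders.py input.csv output.csv
# The transformation shown (trimming and upper-casing fields) is illustrative only.
import csv
import sys

def main(src: str, dst: str) -> None:
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            # The kind of cleanup that is awkward to express in native ETL steps.
            row = {k: (v or "").strip().upper() for k, v in row.items()}
            writer.writerow(row)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```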
For how long have I used the solution?
I have been working with this tool for five to six years.
What do I think about the stability of the solution?
They have made it a lot more stable. Earlier, before the product was with Hitachi, stability used to be an issue. Now, we don't see those kinds of issues or bugs within the platform because it has become far more stable. We also see a lot of new big data features, such as connecting to the cloud.
What do I think about the scalability of the solution?
Lumada is flexible to deploy in any environment, whether on-premises or the cloud, which is very important. When we are processing data in batches on certain days, e.g., at the end of the week or month, we might have more data and need more processing power or RAM. However, most times, there might be very minimal usage of that CPU power. In that way, the solution has helped us to dynamically scale up, then scale down when we see that we have more data that we need to process.
The scalability is another key advantage of this product versus some of the others in the market since we can tweak and modify a number of parameters. We are really impressed with the scalability.
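The review does not describe how the scale-up is scripted, but as one hedged illustration of automating the batch-window scaling described above for an EC2-hosted ETL server, a small boto3 script could resize the instance before a heavy run and shrink it afterwards. The instance ID, region, and instance types below are placeholders.

```python
# Hedged illustration only: resize an EC2-hosted ETL server for a heavy batch
# window, then shrink it again. Instance ID and types are placeholders; error
# handling and retries are kept minimal for brevity.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
INSTANCE_ID = "i-0123456789abcdef0"  # placeholder

def resize(instance_type: str) -> None:
    """Stop the instance, change its type, and start it again."""
    ec2.stop_instances(InstanceIds=[INSTANCE_ID])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[INSTANCE_ID])
    ec2.modify_instance_attribute(
        InstanceId=INSTANCE_ID, InstanceType={"Value": instance_type}
    )
    ec2.start_instances(InstanceIds=[INSTANCE_ID])

# Before the month-end batch run:
# resize("m5.4xlarge")
# After the batch window:
# resize("m5.xlarge")
```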
We have close to 80 people who are using this product actively. Their roles go all the way from junior developers to support engineers. We also have people who have very little coding knowledge and are more into the management side of things utilizing this tool.
How are customer service and support?
I haven't been part of any technical support discussions with Hitachi.
Which solution did I use previously and why did I switch?
We are very satisfied with our decision to purchase Hitachi's product. Previously, we were using another ETL service that had a number of limitations. It was not a modern ETL service at all. For anything extra, we had to rely on third-party software. With Hitachi Lumada, we don't have to do that. In that way, we are really satisfied with the orchestration and cloud-native steps that it offers. We are really happy on those fronts.
We were using something called Actian Services, which had fewer features and ended up costing more than the Enterprise edition of Pentaho.
We could not do a number of things on Actian. For instance, we were unable to call other APIs or connect to an S3 bucket. It was not a very modern solution. Whereas, with Pentaho, we could do all these things, as well as access a great marketplace where we could find various modules and third-party plugins. Those features were simply not there in the other tool.
How was the initial setup?
The initial setup was pretty straightforward.
What about the implementation team?
We did not have any issues configuring it, even on my local machine. For the Enterprise edition, we have a separate infrastructure team doing that. However, at least for the Community edition, the deployment is pretty straightforward.
What was our ROI?
We have seen at least 30% savings in terms of effort. That has helped us to price our service and products more aggressively in the market, helping us to win more clients.
It has reduced our ETL development time. Per project, it has reduced by around 30% to 35%.
We can price more aggressively. We were actually able to win projects because we had great reusability of ETLs. Code that was used for one client can be reused with very minimal changes. We didn't have any upfront cost for kick-starting projects using the Community edition. It is only the Enterprise edition that has a cost.
What's my experience with pricing, setup cost, and licensing?
For most development tasks, the Enterprise edition should be sufficient. It depends on the type of support that you require for your production environment.
Which other solutions did I evaluate?
We did evaluate SSIS since our database is based on Microsoft SQL Server. SSIS comes with any purchase of a SQL Server license. However, even with SSIS, there were some limitations. For example, if you want to build a package and reuse it, SSIS doesn't provide the same kinds of abilities that Pentaho does. The amount of reusability is reduced when we try to build the same thing using SSIS, whereas in Pentaho, we could literally reuse the same code by using some of its features.
SSIS comes with SQL Server and is easier to maintain, given that there are far more people who have knowledge of SSIS. However, if I want to do PGP encryption or make an API connection, it is difficult. Creating a reusable package is not that easy, which is the con for SSIS.
What other advice do I have?
The query performance depends on the database. It is more likely to be good if you have a good database server with all the indexes and bells and whistles of a database. However, from a data integration tool perspective, I am not seeing any issues with respect to query performance.
We do not build visualization features that much with Hitachi. For reporting purposes, we have been using one of the tools from the product, and then we prepare the data accordingly.
We use this for all the projects that we are currently running. Going forward, we will be sticking only to using this ETL tool.
We haven't had any roadblocks using Lumada Data Integration.
On a scale of one to ten, I would rate my willingness to recommend Hitachi Vantara to a friend or colleague as a nine.
If you need to build ETLs quickly in a low-code environment and don't want to spend a lot of time on the development side of things, this product is a good fit. It can be a little difficult to find resources with this skill set, but it is always worth the effort to train them in this product, because it ends up saving a lot of time and resources on the development side of projects.
Overall, I would rate the product as a nine out of 10.
Which deployment model are you using for this solution?
Hybrid Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Yes, the integration tool should be made available as Professional, Community/Standard, and Enterprise editions, with pricing set accordingly on an industry-by-industry or case-by-case basis. There should also be transparency in the pricing and in the availability of the Community edition, as was the case earlier when Pentaho management released it to the market.