BI Architect at a healthcare company with 10,001+ employees
Real User
Top 20
2024-12-04T18:03:32Z
Dec 4, 2024
They can provide better support for non-IBM tools when it comes to the target. Specifically, with Snowflake, there is no push-down optimization, which is a drawback when using DataStage.
Currently, the solution does not support cloud migration. We cannot connect to cloud tools using IBM InfoSphere DataStage. This is an area where improvement is needed.
Industrial IoT Architect at a consultancy with 10,001+ employees
Real User
Top 10
2024-02-21T19:38:00Z
Feb 21, 2024
Improvements for DataStage could include better integration with modern data sources like cloud solutions and documents, along with enhancing its capability to handle non-structured data.
Manager - Business Technology Solutions at a consultancy with 1,001-5,000 employees
Real User
Top 5
2024-01-23T13:00:11Z
Jan 23, 2024
The product must improve logging. It must also improve the navigation guide. When there is an issue, the product must give us insight into why a particular task failed. The troubleshooting guide is very bad. There is no detailed documentation, and the troubleshooting must be done manually. It is a time-consuming task. We must spend much time finding the root cause of a particular task or execution failure. It is very difficult to find an expert in DataStage.
There are some features that are missing. If I compare DataStage to Talend, Talend allows you to write custom code in Java or use these tools in your own applications if you are building a job application. DataStage does not allow you to write custom code for any component. Moreover, Talend allows you to extract Java code and call it in your APIs or applications; DataStage does not have this feature. In future releases, DataStage could benefit from the ability to save metadata into a database, so that if the database crashes or you lose the data in it, you could recover it. Files, by contrast, are harder to manage.
I don't know if it's just a problem with me, but the issue I see is that when we connect to the server from the client, especially when you're going to run a job or something, the whole connection is really slow. It takes a lot of time to actually trigger your job and then go into the logs and other stuff. So all of this is really time-consuming.
There can be data quality issues sometimes. It might not be the application. It may be a human error or an issue with the users or developers as well. The initial setup can be complex.
A lot about the solution could be improved. I'd like to be able to do more with the data and metadata, including copying and pasting, et cetera. It has become easier with the cloud, however. I'd like to have the ability to customize code.
We'd like better integration with source control and error and diagnostic information. The error messaging needs to be improved. The solution is a bit complicated.
Consultant - Data Engineering at South Asian Technologies
Reseller
2022-08-30T10:49:18Z
Aug 30, 2022
Their web interface is good but the on-prem sites are outdated. The solution could also be improved if they could integrate the data pipeline scheduling part of their interface.
What needs improvement in IBM InfoSphere DataStage is its pricing. The pricing for the solution is higher than its competitors, so a lot of the clients my company has worked with prefer other tools over IBM InfoSphere DataStage because of the high price tag. Another area for improvement in the solution stems from a lot of new types of databases, for example, databases in the cloud and big data have become available, and IBM InfoSphere DataStage is working on various connectors for different data sources, but that still isn't up-to-date, meaning that some connectors are missing for modern data sources. The latest version of IBM InfoSphere DataStage also has a complex architecture, so my team faced frequent outages and that should be improved as well.
As a product, it needs to be more stable. It's a legacy product, so even though it's high-performing, it's not very stable compared to other products like Informatica or Talend. The UI also looks dated. In the future, I would like to see more integration with cloud technologies. Technical support could be improved.
At present, we think that the supporting functionality for developers could be better. We would like to see, for example, functionality for automated mapping of columns in SORT stages to the output, or a good GUI (we know that IBM is working on this, and we would love to see it in a stable version).
We would also like to see even better integration of DB2 (e.g. support of all data types; timestamp, for example, is lacking).
Support of merge, ingest and so on could be better.
From a practice point of view, solutions such as IBM InfoSphere DataStage and Oracle Data Integrator are losing ground, whereas open-source solutions are becoming increasingly powerful. For example, we are currently working hard on several examples, and in a few years, open-source solutions will take the lead in the market. It will be used by large enterprises. Clients are looking for open-source solutions more and more. It would be useful to provide support for Python, R, and Java.
Head of IT Integration & Finance Transformation at a financial services firm with 5,001-10,000 employees
Real User
2021-07-20T11:36:40Z
Jul 20, 2021
The solution is currently lacking virtualization ability. If they were to include it, it could be a good evolution on this framework. I'd like to see an improvement in support and a more customer friendly and knowledgeable support staff.
Systems Integration Associate Director at NTT DATA
Real User
2020-11-25T13:05:17Z
Nov 25, 2020
The interface needs improvement. The interface in Informatica is easier than in DataStage. The licensing can be improved. Many companies are moving away from DataStage because it is expensive. The biggest open question is how the jobs can be integrated into DevOps when they are stored as binary files. We would like to see DataStage integrated with DevOps so that a pipeline can be created for auto-deployment. Right now we are all doing it manually.
Enterprise and Information Architect at a tech consulting company with self employed
Real User
2019-12-11T05:40:00Z
Dec 11, 2019
I think that performance monitoring could be improved. I know that my colleagues don't do good monitoring. I'm not sure if it's because of the product or because they don't normally do it, but performance monitoring is an issue. I also believe integration with the cloud is not so clear. It's typically a heavy system that people install on-premise. You can install it in the cloud, but it's not so straightforward. You don't find a lot of information unless you go to the IBM cloud. I think IBM is behind in cloud strategy. We would like to put it in the cloud, but there isn't much information about that. There are three things that could improve: the cloud, monitoring and cloud integration. It's a solid product but not a modern one, and of course it depends on what you're looking for.
Senior Data Warehouse Developer at a computer software company with 5,001-10,000 employees
MSP
2019-12-09T10:58:00Z
Dec 9, 2019
The mod options should be simplified. Some options on DataStage aren't working properly. The solution needs to lower its price. The template mapping could be easier. The solution should allow for compression of data.
Technical Partner Manager at a tech services company with 1,001-5,000 employees
Real User
2019-12-05T06:53:00Z
Dec 5, 2019
The interface needs improvement. It is really too technical. That is the main problem. In the next release, they could offer more connectors with the new database, especially cloud databases.
The price would be the first thing I would want to change. Reduced cost would allow more customers to choose the product. It's quite expensive in relation to the cost of other similar solutions. I think it would also be helpful if the product was more adaptable to other platforms and vendors. I would also like to see an improvement in support.
The previous project was based on Microsoft SQL. It moved huge amounts of data from different data sources and DataStage to a middle stage, then moved it to Netezza. This created a bottleneck in the solution. We are trying to streamline it and create ETL processes. These will take data directly from the data sources and move it to Netezza without using a middle database. The volume of data is quite large. We are talking about records in the tens to hundreds of millions. We would be happy to see in the next versions the ability to return several parameters from jobs. Now, jobs can return just one parameter. If they could return several parameters, that would be great. We would be happy if IBM could give us more tolerance for bad networks or VPN channels, as this happens from time to time. It would be great if we could use more than one SQL operator in the Source DB connector stage. Currently, in the target DB connection stage, we can use several SQL operators, but in the Source DB connector stage we can use only one. It would be better if we could use several. Data Vault is becoming more popular. It would be great if it appeared in the newest versions. I would like them to have more database procedures.
Technical Lead at a tech services company with 5,001-10,000 employees
Real User
2019-07-31T05:52:00Z
Jul 31, 2019
I really like this tool, but the administration should be on the same client application because a lot of administration features are not on the client-side, and they usually need to have administrative access. It's quite complicated to force IT teams to have separate administrative access from the developers. The platform also needs more stability. It caches a lot. It crashes on the application servers that the host allows on the platform. The solution needs better online tools for data, or for sourcing data on the internet. They have InfoSphere exchange but it's not as useful for DataStage.
The features that could be better start with the user interface. It has been getting better in the last releases and in the past few years, and I guess that they will continue to make progress on this front. But even with the improvements that they have made, it could be even better now, and really should be. I think it's a little bit difficult to use because of the interface. Being user-friendly is important for any product and they need to make this adjustment. In addition to improvements in the base user interface, I would say it would be good to incorporate more interface options for cloud-based systems.
The documentation and in-application help for this solution need to be improved, especially for new features. By comparison, in Talend, there is help available for all of the features. One of my clients has a problem using this solution with MongoDB. In the next release of this solution, I would like to see the ability to copy and paste schemas. It would be very good because as it is now, you have to save the schema to a repository and then re-load it. It can be done in Talend, but in DataStage, it is not as good.
IBM InfoSphere DataStage is a high-quality data integration tool that aims to design, develop, and run jobs that move and transform data for organizations of different sizes. The product works by integrating data across multiple systems through a high-performance parallel framework. It supports extended metadata management, enterprise connectivity, and integration of all types of data.
The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical...
The solution can be a bit more user-friendly, similar to Informatica. I would like the solution to have some basic streaming functionality added.
The deployment could be more straightforward.
There could be more customization options for the product.
It would be great if they can include some basic version of data quality checking features.
Many updates to the interface are now available including visualization. There is even a trial version on the IBM cloud.
From the description of the DataStage service: "Preview and visualize data with options to customize and export views"
The solution should be more flexible.
Also, I have a point regarding data visibility over the job flow, as I can't see what has happened at different stages.
The initial setup could be more straightforward.
The response time from support is slow and needs to be improved.
The product is pretty complex to set up. I think it is quite expensive. So, the set up could be simplified and the price could be brought in line.
The solution should be more user-friendly.