IBM InfoSphere DataStage integrates well with both data lakes and traditional enterprise data warehouse systems. It supports ETL processes such as data standardization, cleansing, conforming, and transformation. We use DataStage to embed SQL code from source systems, which enables efficient data transformation and calculation. For performance and for handling big data, we rely on push-down optimization and on parallel processing for all steps and data movements. Parallelizing sessions in this way, combined with DataStage's robust mechanisms, gives us highly efficient and scalable data processing. If you maintain or have developed DataStage jobs, for example, you can use the scenario view to push data from the source staging area or source mechanism. The tool also supports real-time data if you need it, although we haven't changed that aspect here. We primarily use the ETL tool to push data into a specified layer in real time. However, we handle data loads periodically, such as monthly or weekly. For instance, if a manager or end user requires weekly or monthly reports, we can run all the necessary steps using the semantic layer. We can adjust our model based on the customer's or end user's reporting needs, whether monthly or yearly. Overall, I rate the solution an eight out of ten.
Arquitecto Industrial IoT at a consultancy with 10,001+ employees
Real User
Top 20
2024-02-21T19:38:00Z
Feb 21, 2024
We have used IBM InfoSphere DataStage effectively for managing big data within our products, particularly in scenarios involving large volumes of records for ETL processes. However, we have seen that for near real-time or on-time integration tasks, DataStage may not be optimal because it is resource-intensive. DataStage supported our data growth, particularly for ETL tasks involving large volumes of data, and allowed us to manage increased data loads effectively, primarily through optimizing our usage of the tool rather than through inherent scalability features. We also faced challenges with real-time processing: DataStage could not trigger processes based on events such as emails, so we had to schedule tasks at intervals, which limited its suitability for real-time scenarios. DataStage integrates with our existing IT infrastructure by connecting to our manufacturing processes and to systems like ERP and SAP. It consolidates data from various sources, enabling us to view unified information across our systems. I would recommend DataStage for data integration, especially for SQL data and ETL tasks. Overall, I would rate DataStage a seven out of ten. While it is a robust solution for data integration and ETL tasks, there is room for improvement in adopting more modern architectures to meet evolving needs.
Manager - Business Technology Solutions at a consultancy with 1,001-5,000 employees
Real User
Top 5
2024-01-23T13:00:11Z
Jan 23, 2024
I deal with companies from the healthcare industry. The solutions are largely cloud-based. In data-rich industries like telecom or BFSI, such tools are extensively used. Healthcare also has a lot of data. I would encourage people to use the solution. It is quite an easy tool. Every stage has a help guide, and the documentation is extensive. We can understand the purpose of a stage, how the connection has to be set up, how to set up a username and password, and whom we should contact. New users should start using the tool and explore it. They might have to invest ten days or two weeks to understand the workflows and options. It is easy to learn. My company is a partner with IBM. Overall, I rate the product a nine out of ten.
I would highly recommend this solution because of the shared-nothing architecture it uses, the capabilities it offers, and the fact that every feature has its own use. For example, it has a Designer client for creating jobs, a Director client for monitoring and scheduling jobs, and an Administrator client for administration purposes. This is something well managed by IBM. Overall, I would rate the solution a seven out of ten. There are certain areas for improvement.
I recommend that other people who want to use it go for DataStage on the cloud. The on-prem version of the solution looks and feels old. It is also time-consuming. Overall, I rate the solution a six out of ten.
I'd rate the solution seven out of ten. I haven't used it too much. I need more time with the solution. Whether another user should try it or not depends on the environment. If you already have a lot of Oracle applications, it might make sense. It can do everything any other ETL can do.
I used to be a partner with IBM. I have to reset the partnership. I would recommend the solution for on-premises setups. I would rate the solution eight out of ten.
I give the solution a seven out of ten. We have a separate platform or support team. Any query used to be routed to this team, which dealt internally with the DataStage people. I'm not a technical expert because I haven't been a developer for 12 years. This is what I understand from the feedback I've received. Informatica PowerExchange Data Integration is much better from a scalability perspective compared to IBM InfoSphere DataStage. Scalability, user-friendliness, and inclusion of different business rules are all important, but I think Informatica PowerExchange Data Integration goes one step further on that.
Consultant - Data Engineering at South Asian Technologies
Reseller
2022-08-30T10:49:18Z
Aug 30, 2022
I would advise others to identify the communication between the servers and the client tools correctly. If you work from a client environment and connect to the server, the configuration must be done correctly; otherwise, you may encounter some issues. I would rate this solution an eight out of ten.
The last version of IBM InfoSphere DataStage which I've worked with was version 11.7. I work for an IT service company that works with multiple clients on multiple projects, so close to two hundred people use IBM InfoSphere DataStage for various clients. Per project, on average, three people take care of IBM InfoSphere DataStage deployment, maintenance, and support-related activities. My advice to people looking into implementing IBM InfoSphere DataStage is that it's a very good product. A lot of similar products have come up nowadays, but this product has a pretty good reputation as it's been in the market for quite a while. I do think other products such as Talend, Informatica PowerCenter, and Informatica Data Quality are better than IBM InfoSphere DataStage. My rating for IBM InfoSphere DataStage is eight out of ten. My company has a partnership with IBM.
We have had different projects with three or four clients. The average term per project has been between nine months and one year. If you are working with an open-source solution or another solution, you can implement some features by yourself. For example, in the case of Amazon, which has AWS Lambda, you can easily write your code in Python or Java, and it will orchestrate it. You can easily create features yourself, which gives you more ability to make your solution run quicker and eliminates the dependence on the vendor. I would rate IBM InfoSphere DataStage an eight out of ten.
Manager at a consultancy with 1,001-5,000 employees
Real User
2021-04-06T12:32:33Z
Apr 6, 2021
Informatica provides a cloud-based deployment but we only work with the on-premises version. This is a product that I can recommend. I would rate this solution a six out of ten.
Systems Integration Associate Director at NTT DATA
Real User
2020-11-25T13:05:17Z
Nov 25, 2020
I am not a developer; I have a team within our company for that. There is a cloud migration strategy going on, so they are thinking of moving to the cloud. They want a tool that is not heavy and is suitable for their budget. The recommendation for using this tool would depend on the requirements. I don't have anything bad to say about this product. I would rate this solution an eight out of ten.
Managing Director at a tech services company with 11-50 employees
Real User
2020-09-16T08:18:36Z
Sep 16, 2020
My advice for anyone considering IBM InfoSphere DataStage is to use a decent consulting house to help you once you get around to committing to the product. Do not assume that you will be able to go it alone unless you have an extremely talented staff. On a scale from one to ten (where one is the worst and ten is the best), I would rate this product a seven out of ten.
Senior Data Warehouse Developer at a computer software company with 5,001-10,000 employees
MSP
2019-12-09T10:58:00Z
Dec 9, 2019
We use the on-premises deployment model. If you are comparing the solution to Informatica, this solution is much simpler. In Informatica, for example, there might be two to three ways to find a log, but with DataStage, they make it much easier. However, compared to other vendors, IBM's licensing costs are more expensive. I'd rate the solution eight out of ten.
Technical Partner Manager at a tech services company with 1,001-5,000 employees
Real User
2019-12-05T06:53:00Z
Dec 5, 2019
Try to do a lot of training before beginning the project. I would rate this solution as seven out of ten because I think there are some problems with the design of the interface. They should also develop new connectors. The solution is quite complex for people and you need a lot of training to use it.
Technical Lead at a tech services company with 5,001-10,000 employees
Real User
2019-07-31T05:52:00Z
Jul 31, 2019
The last version I interacted with was 11.3 because the later versions were cloud-based and usually our customers didn't want to use the solution on the cloud. My advice to anyone trying to implement the solution is this: you have to have accurate sizing. Clients always do the sizing wrong, and they need more experience to get it right. Setting up the environments takes sizing into account, but poor sizing usually causes a lot of problems once the solution starts to operate. Then you have to re-implement, which requires an increase in resources that will change your budget. I would rate this solution nine out of ten.
I would rate this particular product a nine out of ten. It is very powerful and very fast, but the problems with the interface make it less than perfect. As for other advice for people considering this solution, the first and most important step is to examine your needs and decide on the processes you want to build. From that, you can immediately have a better idea of the type of solution that might be best for you. Then it is a good idea to get the advice of a consultant, like us.
IT Administrator at a transportation company with 10,001+ employees
Real User
Top 20
2019-07-29T10:11:00Z
Jul 29, 2019
The advice I would give to others is to make sure they define a framework for development and for management. This could be very useful for the future of the product in the company. I would rate the entire solution eight out of ten. I really like DataStage. The product fits our requirements perfectly. We are changing the product now, however, to a cloud-based approach for DataStage.
IBM InfoSphere DataStage is a high-quality data integration tool that aims to design, develop, and run jobs that move and transform data for organizations of different sizes. The product works by integrating data across multiple systems through a high-performance parallel framework. It supports extended metadata management, enterprise connectivity, and integration of all types of data.
The solution is the data integration component of IBM InfoSphere Information Server, providing a graphical...
I rate IBM InfoSphere DataStage an eight out of ten.
I would rate the product a nine out of ten. You need to get a balance between batch ETL processing and streaming.
I'd recommend the product to others. I'd rate it a nine out of ten. We've been pleased with its capabilities overall.
I would rate this solution 8 out of 10.
I rate this solution an eight out of 10.
This product has a lot of good features. I would rate this solution an eight out of ten.
I think DataStage is a product that one should look at as a good candidate in this segment. I would rate this product a seven out of 10.
I would rate InfoSphere an eight out of 10.
It is the best solution in the IBM environment. It uses IBM data models, such as data quality tools.
This is a good product, but there is room for improvement. I would rate this solution an eight out of ten.