Data asset management engineer at a tech services company with 1-10 employees
Real User
Top 10
2025-01-15T16:32:00Z
Jan 15, 2025
What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated. There are so many components to data management, and more often than not, people understand one thing really well. They may understand DataStage and how to move data around, but they do not see the impact of moving data incorrectly. They also do not see the impact of everyone understanding a piece of data in the same way. I would love Cloud Pak to come with a demo database that illustrates the different components of data management in a logical way, so I can see the whole picture instead of just the area I'm specializing in. It would be great if Cloud Pak, from a data modeling point of view, allowed us to import our PDMs, for example. It would be ideal to import and create business terms in Cloud Pak. The PEA would be great to create the technical data. The association between the business and the technical metadata could then be automated by pulling it through from your ACE models. The data modeling component is available in Cloud Pak. Additionally, even though Cloud Pak for Data has the NextGen DataStage built into it, there is Cloud Pak for Integration as well. Currently, I do not think we have a full enough understanding of how CP4D and CP4I can enhance each other.
Previously, we used to extract the information in the DSX and the XML formats. IBM Cloud Pak for Data exports information mostly on the ISX, which is an encrypted format. The only challenge with the tool is the metadata queries we try to understand. We have to go with the lineage and other packages that come with IBM. Previously, we created our own reports depending on the existing command line export of the mappings. The solution's catalog searching or map search needs to be improved.
The product must improve its performance. We see typical cloud-related issues in the solution. IBM can still focus more on keeping the performance up and keeping it 100% available all the time.
There are several specific connectors that we need to use, such as the one for SAP or XML. These connectors are not fully integrated into Cloud Pak for Data. However, they are very useful in database and information services for many of the projects I've worked on. Right now, the product is still maturing in terms of connectors. That, I believe, is an area where Cloud Pak can improve. Obviously, they are constantly working on refining the product. We are currently on version 4.5, and I have good relationships with some people at IBM. They are actively striving to iterate and release new versions of the product. They are also focusing on improving Information Server and Rapid Stage 5. In future releases, it would be beneficial to have more advanced data curation features. Specifically, I'm referring to data analysis and the quality dimensions associated with it. In my experience, this aspect is not as mature in Cloud Pak for Data as it is in Information Server or database information analyzer, because IBM has worked more extensively on developing and enhancing those products. However, based on my research and discussions with my peers at IBM, I believe these features will be included in Cloud Pak in the near future.
Lead Architect at a financial services firm with 10,001+ employees
Real User
Top 20
2023-04-24T12:27:00Z
Apr 24, 2023
The tool depends on the control plane, an OpenShift container platform utilized as an orchestration layer. However, for our organization, it is not a standard Kubernetes orchestration layer that we are currently using. So, we have communicated this issue to IBM and asked if it is feasible to adapt the solution to work on a Kubernetes platform that we support.
One challenge I'm facing with IBM Cloud Pak for Data is that native features have been decommissioned, such as XML input and output. Too many changes have been made, and my company has around one hundred thousand mappings, so my team has been putting more effort into alternative ways to do things. Another area for improvement in IBM Cloud Pak for Data is that it's more complicated to shift from on-premise to the cloud. Other vendors provide secure agents that easily connect with your existing setup. With IBM Cloud Pak for Data, however, you have to perform connection migration steps, upgrade to the latest version, etc., which makes it more complicated, especially as my company has XML-based mappings. Because the XML input and output capabilities of IBM Cloud Pak for Data have been discontinued, I'd like IBM to bring them back.
IBM Data & IA Technology Consultant at a tech services company with 10,001+ employees
Real User
2022-05-15T17:07:36Z
May 15, 2022
One thing that bugs me is how much infrastructure Cloud Pak requires for the initial deployment. It doesn't allow you to start small. The smallest permitted deployment is too big. It's a huge problem that prevents us from implementing the solution in many scenarios. It's expected to grow, but we can't start with a large deployment. Usually, the customers go with another option, so we miss the opportunity to implement this platform.
Software Consultancy at Tata Consultancy Services
Real User
2022-03-30T20:10:54Z
Mar 30, 2022
There is a solution that is part of IBM Cloud Pak for Data called Watson OpenScale. It is used to monitor the deployed models for the quality and fairness of the results. This is one area that needs a lot of improvement. The automated machine learning models can be created based on combining those two models on the IBM Cloud. This area also needs improvement. AIOps learning is used to manage the different versions of models of the data using these same models that have been made. This needs to improve. One large hurdle we face is the AI lifecycle; this is a feature I would like to see. IBM needs to simplify the installation process and the administration process, which should be more streamlined. There should be more focus on decoupling the individual solutions to allow customer flexibility. The customer might not have all of it, but now they can disable it. The lifecycle data provides an entire range starting from collection of data, organizing, analysis and improving it. This happens in real life, where customers do not like to put all their eggs in the same basket. They need a diversified platform, so they may select IBM Cloud Pak for Data for the sorting processes, then for the machine learning, they can do it using Watson Machine Learning and Watson Studio, whereas maybe for the design innovation part, they may go for some other solution. They should be able to achieve this diversification. All of the solutions should be built so that they are plug and play. The installation or the setup process should not be complex, and integration with other solutions should be available.
Director at a university with 1,001-5,000 employees
Real User
2020-05-27T16:23:39Z
May 27, 2020
The utilization of system resources is high. The technical support could be a little better. Having a "lite" version for a reduced price would be of interest to smaller companies.
IBM Cloud Pak® for Data is a fully-integrated data and AI platform that modernizes how businesses collect, organize and analyze data to infuse AI throughout their organizations. Cloud-native by design, the platform unifies market-leading services spanning the entire analytics lifecycle. From data management, DataOps, governance, business analytics and automated AI, IBM Cloud Pak for Data helps eliminate the need for costly, and often competing, point solutions while providing the information...
The solution's user experience is an area that has room for improvement.
Cloud Pak would be improved with integration with cloud service providers like Cloudera.
The solution could have more connectors. Sometimes the customers request additional things that are not implemented, like Data Catalog.