The platform can sometimes be unstable, particularly in terms of speed and reconnection issues. There's also potential for improvements in AI and AI governance, which I don't see clear use cases for yet.
Governance / Data Governance Initiative Advisory and Strategy Leadership at a consultancy with 1-10 employees
Real User
Top 20
2024-05-22T14:49:00Z
May 22, 2024
They could simplify the platform's setup and design process. It requires significant IT involvement and expertise, which can be a barrier for smaller organizations.
Technical consultant at an energy/utilities company with 10,001+ employees
Real User
Top 20
2024-04-03T04:23:00Z
Apr 3, 2024
The solution's search functionality can be improved. For instance, if I search with one piece of data, the tool provides a granular level of data and sometimes billions of data results, among which it's extremely difficult to find a relation. Instead of the aforementioned type of results, the solution can provide a hierarchy based on levels that reveal data upon click.
There could be more integration similar to Snowflake, for instance. They do support integration, but complete lineage is not possible. Certain implementations, including lineage maps, could be more user-friendly. At the moment, there needs to be more lineage between data sources. It needs improvements in terms of attribute connectivity during data transformation.
Collibra gives a lot of facilities in the cloud, but achieving those facilities on-prem becomes a big challenge. Collibra should loosen up those areas there and try to give stakes for that. The uses can vary. They can be limited to a small amount, or they can be a huge chunk. If Collibra thinks the utilization should be controlled and monitored by them on a cloud level, making for highly scalable data, it's good for them. But when the customer needs it, Collibra will be in the minute volume, in which case they will go for the on-premise service. That's what I feel Collibra lacks because some things cannot be integrated for on-premise customers.
There are certain limitations and difficulties regarding the migration of complex data quality rules, as the tool may struggle with lengthy calculations and longer loading times. Our clients reported that they have experienced runtime errors in those scenarios. Also, setting up the tool and seeing real results can take a significant amount of time and manual effort. Many clients struggle to find experienced consultants due to the tool's extensive features, and this shortage impacts the deployment process and overall value realization. The pricing structure is expensive and not affordable for all clients. I would suggest that Collibra consider offering a lighter version of the tool for small companies. Currently, the main focus of their strategy lies in targeting medium and large enterprises.
Consulting Principal & Founder at Digital Data Consultancy
Real User
Top 20
2023-08-14T09:29:14Z
Aug 14, 2023
Pricing policy of the product is an area with certain shortcomings that needs improvement. I would like to see better pricing in future releases of the product. Collibra Governance needs to develop some new technology, like Atlan, that has been incorporating AI in the automation area of its products.
The technical support could be better. Previously, it was a challenge to integrate tools like MuleSoft into the product. However, they've gotten better. There needs to be more security and compliance embedded in the product in the future.
Knowledge Manager at The Church of Jesus Christ of Latter-day Saints
Real User
2021-09-03T18:29:00Z
Sep 3, 2021
The UI is good if you happen to be an administrator and are familiar with the technical side of the administration. If you're a business user, the UI is not good. It is hard to learn. It is hard for those who are administering it to teach to end-users and it can take hours of training to do it. Because it is difficult and non-intuitive, business users resist using it. It is a battle to get them on board and to keep them engaged because of the UI. On the other hand, Collibra just hired a person specifically to revamp the UI. So, they're dealing with it, but it isn't there yet. They're working on the lineage harvesting for technical lineages. I don't know this for a fact, but my feeling is that this is new to them. So, they're still developing it and it feels awkward.
There are certain processes that involve a lot of manual work, specifically when searching for attribute assignments. It would be better if we could do this via APIs. In a future release, we would like to have rules. We would like to have regular expressions that scan data and monitor quality.
Many other tools comparable to Collibra Governance give you the option to actually see the data in the tool, which is not an option in Collibra. If you compare it with Alation, which has the compose data option, you can see the data right there. Perhaps this is a feature they could include in the future, although I'm not sure if it is in their roadmap or not, because it is just a governance tool for storing metadata, and not for connecting actual data. If they could include some type of data viewing functionality, however, I would appreciate it.
Senior Manager, Service Design Manager at a pharma/biotech company with 10,001+ employees
Real User
2022-09-14T13:47:22Z
Sep 14, 2022
There's still room for improvement in Collibra Governance. Right now, the solution focuses more on publishing and subscription where metadata is being consumed from all systems and processes within Collibra Governance. I'd like to see more features around external systems that let those systems start consuming or start referring things from Collibra Governance automatically, in particular, more API-based integrations, so the solution can truly work as an enterprise governance tool at the heart of the organization. There have been many cases where business stakeholders feel, "Okay. I'll be doing a manual activity because every time I have to maintain and make sure that the solution is working, for example, ensuring the reporting works, I have to go to Collibra Governance." I want this step taken out of the equation, and it has to be an automatic conversion or automatic notification that something has changed. Scalability also needs improvement in the solution. If there could be some sort of self-service where people can customize the solution a bit more, that would make Collibra Governance better. What I'd like to see in the next release of the solution is an enhanced search feature because it's what users want to use. You want to search for a term and get all details in one go. Right now, if you type a term in the Collibra Governance global search function, it returns a lot of results that you need to know how to filter and what to filter, so there should be automatic filters that give accurate information to somebody who does a search on the solution. I want Collibra Governance to provide a capability where you can search and get relevant information.
I would like to see a feature using the runtime dashboard. As of now, we can create the dashboard using the snapshot of the report. But I would recommend having a dashboard that runs the data on the back end.
Data Quality & Data Governance Developer at Accenture
Real User
2022-07-19T07:46:59Z
Jul 19, 2022
This solution does not have an out-of-the-box connector for the cloud platform. If a client does not want their data sent out of their own server, it will be very difficult to ensure this. You can install Collibra Jobserver and DGC on your machine, but you cannot install a newer solution. This is a restriction for us.
Data discovery would be the major area that requires improvement across the landscape and I'd also like to see data redundancies. Those are the two areas where improvements could be made. A lot of countries have laws that don't allow most of the data to flow into the cloud. That issue needs to be addressed when it comes to personal data protection laws. Collibra needs to take into consideration the local laws and the concerns of those countries.
What I use it for is fairly rudimentary, and I don't have any complaints about it. I haven't tried to stretch the boundaries at all, but it would be nice if there were capabilities built within the system to somehow help enforce the quality and consistency across related elements that are built in the catalog. It could have intelligent capabilities built in to help maintain the quality of the data and information, such as natural language processing, machine learning, and so on.
Education Mentor - Data Science at a tech services company with 11-50 employees
Real User
2022-02-06T07:27:56Z
Feb 6, 2022
When we are doing discovery for unstructured data sources, I would suggest a quality report of the data that would indicate what could not be processed. The main issue is data quality, which we can't control when dealing with real-world data. It's good to measure this and compare it between different data sources. The Collibra products are very consolidated in the market. I think it's harder to analyze different domains of the data. Through Data Lineage, I think Collibra could add a relationship graphic analysis. They could add a feature that is a new way to explore the data.
Sr. Systems Analyst, Master Data Governance at a manufacturing company with 10,001+ employees
Real User
2021-08-03T19:50:03Z
Aug 3, 2021
It's not necessarily tool-specific; with any sort of application, there's an investment as far as the way in which you need to use it. There is a lot of upfront work that has to be considered. That's just a common reality with any software implementation. There's a lot of pre-work. You don't just turn on the lights and assume it's going to work exactly as you envisioned. There is input and planning required. If anything, I would say that the licensing is one area that could be improved. We have basically three roles: an admin, an editor, and a view-only role. It is limiting. For example, we want view-only; however, if we want users to be able to approve workflows, they need editor rights. That makes sense, except it doesn't necessarily meet all the business cases we have. In some instances, you might just need proper approvals, and you are not necessarily asking anyone to edit things. Yet in order for them to approve, they must have edit rights. The last implementation was very much focused on IT and capturing more of the IT view of data, and even data definitions really focused on data standards, such as how we're going to name the technical fields or how we're going to name the entities. This new deployment is really much more focused on not just the IT side but on the business side and the operational side. It's based more so around analytics and operational governance. I'm hoping to use more of the modules and have a better, more favorable opinion of the solution's capabilities. While overall I have the sense it's good, the last company I was with didn't have the right business partners and it really just became another IT tool, which wasn't helpful to the company as a whole. The initial setup requires more of a trial and error approach and there isn't too much documentation available to help you figure things out. There needs to be more online support around the sharing of best practices. There are a lot of use cases and people like the tool.
That said, you hear a lot of pain points around large amounts of data being ingested and creating backlogs of data that need to be cataloged and there's really no way to prioritize it. Ultimately, it's a tool that should help to coordinate a lot of efforts and it would be nice to be able to look at something and understand how another experience could be similar or you can get a lesson learned before you actually make it your own lesson to learn. This is more of a data governance tool, not necessarily a centralized tool for data cleansing. However, with the data quality module, that's the next evolution that's possible. Looking at data quality issues and then ultimately not necessarily being able to correct them, there's a lost opportunity. Data changes all the time. We're measuring it all the time. It would be advantageous to build this into more of a data quality tool in which users could cleanse data that could go back to source systems. That said, that's encroaching on more of the MDM solution.
Technical Product Lead at an insurance company with 5,001-10,000 employees
Real User
2020-12-21T20:50:00Z
Dec 21, 2020
Collibra is very good at talking to modern database systems such as a normal RDBMS (e.g., DB2, SQL Server, or Oracle). Where it isn't great is with older technologies that you'll typically find in finance or insurance industries (e.g., VSAM or ISAM, or those types of older technologies). It just doesn't connect with them very easily. They do provide an ability to use a separate product called MuleSoft, which they used to license (as a bundle) up until last year until Salesforce bought MuleSoft, and that division is happening in 2021. With this 'bolt-on', you could go and get that data, but you had to write that code and maintain it yourself. It wasn't an out-of-box (OOB) feature, which is what we really liked from the Collibra offering. Our only way to access these older technologies was to create a MuleSoft flow, maintain, and deploy it. This leaves us with technical debt which will need to continually be maintained. In fact, we built all our custom MuleSoft flows using Mule 3.x and will soon be pushed to upgrade to Mule 4.x. This will not be a simple upgrade and will likely result in additional cost to bring in consulting resources more familiar with the technology. Since we do have a lot of older legacy systems, things that aren't greenfield, if you will, it adds a lot more overhead than what we were originally led to believe when we originally purchased the product. We're not that deep into the Collibra product yet because it's only been a couple of years. We do like their ability to automate the workflows, such that, for example, if somebody comes in to say, "I want to request access to this data," you can build your own workflows to automate the approval process. There are some that are out-of-box, but I think they could go a little bit further with some of their out-of-box workflows instead of having to create a workflow manually, get somebody to code it, and implement it. I think they could offer a bit more in that respect.
The second item that I think they could do better at is to have other products, or have things where they have a set of taxonomy per industry that says, "Here's what a policy is. Here's what a customer is," that kind of thing. They don't implement that out-of-box in Collibra, you have to do that yourself, whereas other products bring that to the table. Informatica, I believe, has their own insurance industry or industry specific taxonomy that would come with the product. It makes adding the new logical constructs to Collibra a more manual workup to take care of. The classification becomes more manual because you don't get that out-of-box to say, "Hey, I recognize that that's a policy, because I know that about that and the taxonomy." You have to manually make that connection.
Consultant at a tech services company with 10,001+ employees
Real User
2020-12-20T20:51:30Z
Dec 20, 2020
I'm fairly new to the product; however, what I generally hear from my clients is a requirement for ways to ingest more metadata. Currently, Collibra provides a catalog platform, which helps you integrate or get metadata from a few commonly known platforms, like Tableau, IBM Db2, and Informatica. If they could bring in more connectors to help us ingest metadata from other systems as well, that would be really helpful. That would reduce a lot of time and effort from our end. If they had backward compatibility as well, that would be much better. I've also worked on other technologies, primarily Java, which is very, very much backward compatible. Any new implementation which they bring in does not impact your existing work to a heavy extent. It would be helpful if Collibra was similar.
The solution needs to be controlled. It can sometimes get out of hand. The speed, especially now that we have moved to the Collibra Cloud, has not been the best. The management of the speed of the tool is not that great. It's also partially impacted by the fact that we need to use a VPN and we have got a lot of security measures. Sometimes it's not working well together with everything else. That is the main pain point that we are having. Occasionally we get little bugs; however, this is typical. We would like to have a data lineage feature. It's just like on a different module. That's already available, as well as some advanced connectors. From my perspective, I would like to see improvement in the dashboard creation, to make it easier to create a really nice dashboard, and to also be able to play with the user interface when it comes to those dashboards.
Sr Manager - Enterprise Data Office at a healthcare company with 10,001+ employees
Real User
2020-12-15T09:01:00Z
Dec 15, 2020
The connectors are not very sophisticated. They can do, for example, Informatica and Tableau, but the connectors themselves could be improved. I recently got a subscription for another 600K for Collibra for one more year, so the author licenses are not used much. And they keep changing the UI platform; that can also be improved. From an administration perspective, I like the white-glove onboarding part of Collibra. That was actually nice and I really liked that. For administration in general, I like that you can use Collibra however you want. It's more raw and easily adaptable. So you can cook it or you can steam it or you can make changes to it in a lot of different ways, but it would also be nice if there were an already available analytics tool like Tableau at hand. Still, it is easily adaptable and you'll have a completed end product which you can really leverage.
Solution Architect at a financial services firm with 10,001+ employees
Real User
2020-12-15T05:25:45Z
Dec 15, 2020
There are many new aspects of the solution, however, I haven't yet gone through the documentation to see if they really help solve for issues or not. Many features have recently changed their appearance and I need to re-learn how they work. Sometimes, if a client needs a specific customization, we cannot do it directly. The client needs to reach out to Collibra and request the customization. The technical support is very poor.
Data Governance Manager at an insurance company with 201-500 employees
Real User
2020-12-13T19:34:43Z
Dec 13, 2020
Collibra, as far as I know, does not have a connector for systems like Oracle or a mainframe. It's important to have a connector so that you have access to up-to-date information. Sometimes the data can be out-of-date as the updates are not automatic. Users could be looking at obsolete information. You need to be precise about the names of the fields and you have to develop them yourself. It's my understanding that they are working on a solution where you can import all the information that you need from a data validation tool, or from a CRM. It's something they really need to get better at. It would be better if there was a way to import all data and metadata automatically in one block.
I am a business person and a team leader. My duty is to ensure that the data governance processes are set up; that's how I started to use Collibra. There are certain limitations I have observed in Collibra. With regard to our data lake, Collibra doesn't give us direct connectivity to the Azure Data Lake. We have to establish data lineages. We have to browse those files manually and then connect them via Collibra; that's how data dictionaries get published. Overall, it's quite a manual type of process which needs a lot of human intervention. I've been hearing that tools like Talend are going to be available soon, which we hope to leverage in the near future. Talend is similar to other ETL or Informatica-type tools. It directly connects to the source system, captures all the transformation tools, and provides you with a spreadsheet that talks about data lineage, which can be fed into Collibra. If this functionality could be improved, it would be a great time-saving solution. It would require less effort and it would be a more automated kind of system, less dependent on human operation, which means that it would be less prone to errors as well. We create and issue the management of workflows with Collibra. In regard to workflows, I find that they can be made very simple. For example, a request goes directly to the person who is in charge of that particular asset and some simpler workflows can be assigned to it. Recently, I find that the default process of issue management in Collibra is really complex. It wasn't really helpful to us.
One problem is the data lineage, especially extracting the ETL transformation from different ETL tools and identifying how the data is getting changed from one layer to different layers and how the transformation is applied. It doesn't support all the ETL tools for extracting the transformation logic. It supports some of the tools, but there are still some tools that need to be supported. There is also a small pain point in terms of integration. There is a little bit of change in their strategy from Collibra's end. Earlier, they used to offer two solutions. One was out of the box, and one was a custom-built solution for which they used to provide a dual connector. Now the focus from the Collibra side is more on using the out-of-the-box connector. They are discouraging doing the custom integration. That leaves us with two problems. The first problem is that the out-of-the-box connector is not yet enabled for a lot of systems, and the second problem is that the out-of-the-box connector has certain limitations. If we want to tweak those as per our needs, it is not possible. However, the custom-built is still supported, and you can still build a custom integration by using the API, but it is not very encouraged by Collibra. Its dashboard also needs to be improved. There are options to use the HTML code to customize your dashboard, but it has a lot of limitations.
It should have more integrations with things like CyberArk because its main purpose is GDPR implementation. We have to have more scope for things that implement more privacy. CyberArk makes sure your credentials are vaulted and your things are secure when you're creating your integrations or connecting to an application. I do believe that they are working on this feature.
The workflows and the language they use need to be improved. Programming the needs for every user on the workflows is a key improvement that is required. In addition, they haven't updated their training solution in a while. We need to implement a lot of things ourselves and they want us to move to the cloud, but there are a lot of glitches in the system. There are three environments: stage, development, and production. Often things work well in the first two environments and then when you get to production, they don't work. It happens a lot and their response is slow.
Business Analyst at a financial services firm with 1,001-5,000 employees
Real User
2020-12-07T14:31:05Z
Dec 7, 2020
We have an issue with metadata history. If someone changes the metadata, we can't see who changed it. But they are trying to upgrade the system with this feedback and are still working on it. We are still waiting for a proper log to maintain the solution.
Manager - Finance at a financial services firm with 10,001+ employees
Real User
2020-12-06T16:07:14Z
Dec 6, 2020
The issue may be the way it's been implemented in my company but, for Collibra to be really useful, what's missing is an easy way to connect to different data sources and different types of data sources and actually ingest and profile some of that data. That's the trouble we've always had in getting wider adoption of the tool. Unless there's a mandate from the enterprise data office or the like, regular users are not going to use the tool for really robust business use cases without having some actual data in there. I know there is some out of the box capability for this, but I think it needs to be easier for Collibra to actually ingest and run some basic profiling on the data itself. That's currently missing from the tool.
Consultant II at Datasource Consulting, an EXL company
Consultant
2020-12-06T14:20:49Z
Dec 6, 2020
Connecting to a data source is not very easy. If there's a firewall, it is difficult to connect with the database. It's not easy when you are configuring on the database. Right now, the client is decommissioning the MuleSoft integration and they're moving to APIs. Collibra Connect and MuleSoft integration were there before; however, now there's a move to API. Within a year or two, they will all move to API. Whoever is using it now with MuleSoft and Collibra Connect needs to find another way for connecting with the API. I don't think they are providing additional software for MuleSoft integration. Primarily, they are telling us, okay, we will decommission this and move to API. The only thing that's lacking in terms of the change is when connecting to a database. Sometimes the connection causes issues if the data is breaking the firewall and ingesting the data.
The breadth of available connectors for metadata ingestion needs to grow quickly to support customers as they expand their data governance programs to include a diverse list of source systems from which they want to derive business value. The connectors are needed to bring metadata into Collibra and enable lineage, workflows, definitions, etc. That said, this is not just a Collibra problem - this is an everybody problem. The central challenge is the availability of APIs to ingest text structural metadata, which is a common problem across any data governance platform or even any integration platform, honestly. To be fair, I would say that Collibra's purpose and primary value is as a collaboration platform, which is the core value of business-centric data governance, and not as an integration platform. For this purpose, they are clearly the leading solution.
Collibra Governance is a software solution for data governance, which refers to the set of policies, standards, and processes that govern how an organization manages, uses, and protects its data. Collibra Governance provides a centralized platform for managing data governance, enabling organizations to ensure data accuracy, completeness, and security.
The software includes tools for managing data lineage, data dictionaries, and metadata, as well as for monitoring data quality and compliance...
When you have a huge amount of hierarchical data to manage, the solution's performance tends to decrease.
We are not able to ingest all the data in Collibra, and that's why we cannot do element-to-element level data tracking.
The solution's metadata management is pretty novice and could be improved.
This solution could be improved with the addition of process diagrams to help the many users of the platform understand all the fields.
The price of Collibra Governance could improve.
Data discovery would be the major area that requires improvement across the landscape and I'd also like to see data redundancies. Those are the two areas where improvements could be made. A lot of countries have laws that don't allow most of the data to flow into the cloud. That issue needs to be addressed when it comes to personal data protection laws. Collibra needs to take into consideration the local laws and the concerns of those countries.
What I use it for is fairly rudimentary, and I don't have any complaints about it. I haven't tried to stretch the boundaries at all, but it would be nice if there were capabilities built within the system to somehow help enforce the quality and consistency across related elements that are built in the catalog. It could have intelligent capabilities built in to help maintain the quality of the data and information, such as natural language processing, machine learning, and so on.
When we are doing discovery for unstructured data sources, I would suggest a quality report of the data that would indicate what could not be processed. The main issue is data quality, which we can't control when dealing with real-world data. It's good to measure this and compare it between different data sources. The Collibra products are very consolidated in the market. I think it's harder to analyze different domains of the data. Through Data Lineage, I think Collibra could add a relationship graphic analysis. They could add a feature that is a new way to explore the data.
It's not necessarily tool-specific; with any sort of application, there's an investment in the way in which you need to use it. There is a lot of upfront work that has to be considered. That's just a common reality with any software implementation. There's a lot of pre-work. You don't just turn on the lights and assume it's going to work exactly as you envisioned. There is input and planning required. If anything, I would say that the licensing is one area that could be improved. We have basically three roles: an admin, an editor, and a view-only role. It is limiting. For example, we want view-only; however, if we want users to be able to approve workflows, they need editor rights. That makes sense, except it doesn't necessarily meet all the business cases we have. In some instances, you might just need proper approvals, and you are not necessarily asking anyone to edit things. Yet in order for them to approve, they must have edit rights. The last implementation was very much focused on IT and capturing the IT view of data, and even data definitions really focused on data standards, such as how we're going to name the technical fields or how we're going to name the entities. This new deployment is much more focused on not just the IT side but on the business side and the operational side. It's based more so around analytics and operational governance. I'm hoping to use more of the modules and have a better, more favorable opinion of the solution's capabilities. While overall I have the sense it's good, the last company I was with didn't have the right business partners and it really just became another IT tool, which wasn't helpful to the company as a whole. The initial setup requires more of a trial-and-error approach and there isn't too much documentation available to help you figure things out. There needs to be more online support around the sharing of best practices. There are a lot of use cases and people like the tool. 
That said, you hear a lot of pain points around large amounts of data being ingested and creating backlogs of data that need to be cataloged and there's really no way to prioritize it. Ultimately, it's a tool that should help to coordinate a lot of efforts and it would be nice to be able to look at something and understand how another experience could be similar or you can get a lesson learned before you actually make it your own lesson to learn. This is more of a data governance tool, not necessarily a centralized tool for data cleansing. However, with the data quality module, that's the next evolution that's possible. Looking at data quality issues and then ultimately not necessarily being able to correct them, there's a lost opportunity. Data changes all the time. We're measuring it all the time. It would be advantageous to build this into more of a data quality tool in which users could cleanse data that could go back to source systems. That said, that's encroaching on more of the MDM solution.
Collibra is very good at talking to modern database systems such as a normal RDBMS (e.g., DB2, SQL Server, or Oracle). Where it isn't great is with older technologies that you'll typically find in finance or insurance industries (e.g., VSAM or ISAM, or those types of older technologies). It just doesn't connect with them very easily. They do provide an ability to use a separate product called MuleSoft, which they used to license (as a bundle) up until last year until Salesforce bought MuleSoft, and that division is happening in 2021. With this 'bolt-on', you could go and get that data, but you had to write that code and maintain it yourself. It wasn't an out-of-box (OOB) feature, which is what we really liked from the Collibra offering. Our only way to access these older technologies was to create a MuleSoft flow, maintain it, and deploy it. This leaves us with technical debt which will need to be continually maintained. In fact, we built all our custom MuleSoft flows using Mule 3.x and will soon be pushed to upgrade to Mule 4.x. This will not be a simple upgrade and will likely result in additional cost to bring in consulting resources more familiar with the technology. Since we do have a lot of older legacy systems, things that aren't greenfield, if you will, it adds a lot more overhead than what we were originally led to believe when we purchased the product. We're not that deep into the Collibra product yet because it's only been a couple of years. We do like their ability to automate the workflows, such that, for example, if somebody comes in to say, "I want to request access to this data," you can build your own workflows to automate the approval process. There are some that are out-of-box, but I think they could go a little bit further with some of their out-of-box workflows instead of having us create a workflow manually, get somebody to code it, and implement it. I think they could offer a bit more in that respect. 
The second item that I think they could do better at is to have other products, or have things where they have a set of taxonomy per industry that says, "Here's what a policy is. Here's what a customer is," that kind of thing. They don't implement that out-of-box in Collibra, you have to do that yourself, whereas other products bring that to the table. Informatica, I believe, has their own insurance industry or industry specific taxonomy that would come with the product. It makes adding the new logical constructs to Collibra a more manual workup to take care of. The classification becomes more manual because you don't get that out-of-box to say, "Hey, I recognize that that's a policy, because I know that about that and the taxonomy." You have to manually make that connection.
I'm fairly new to the product; however, what I generally hear from my clients is a requirement for ways to ingest more metadata. Currently, Collibra provides you a catalog platform, which helps you integrate or get metadata from a few commonly known platforms, like Tableau, IBM Db2, and Informatica. If they could bring in more connectors to help us ingest metadata from other systems as well, that would be really helpful. That would save us a lot of time and effort. If the product had backward compatibility as well, that would be much better. I've also worked on other technologies, primarily Java, which is very much backward compatible. Any new implementation which they bring in does not impact your existing work to a heavy extent. It would be helpful if Collibra was similar.
The solution needs to be controlled. It can go sometimes out of hand. The speed sometimes, especially now, since we have moved to the Collibra Cloud, has not been the best. The management of the speed of the tool is not that great. It's also partially impacted by the fact that we need to use a VPN and we have got a lot of security measures. Sometimes it's not working well together with everything else. That is the main pain point that we are having. Occasionally we get little bugs that occur, however, this is typical. We would like to have a data lineage feature. It's just like on a different module. That's already available, as well as some advanced connectors. From my perspective, I would like to see improvement in the dashboard creation, to make it easier to create a really nice dashboard, and to also be able to play with the user interface when it comes to those dashboards.
The connectors are not very sophisticated. They can do, for example, Informatica and Tableau, but the connectors themselves could be improved. I recently got a subscription for another 600K for Collibra for one more year, so the author licenses are not used much. And they keep changing the UI platform; that can also be improved. From an administration perspective, I like the white-glove onboarding part of Collibra. That was actually nice and I really liked that. For administration in general, I like that you can use Collibra however you want. It's more raw and easily adaptable. So you can cook it or you can steam it or you can make changes to it in a lot of different ways, but it would also be nice if there were an already available analytics tool like Tableau at hand. Still, it is easily adaptable and you'll have a completed end product which you can really leverage.
There are many new aspects of the solution, however, I haven't yet gone through the documentation to see if they really help solve for issues or not. Many features have recently changed their appearance and I need to re-learn how they work. Sometimes, if a client needs a specific customization, we cannot do it directly. The client needs to reach out to Collibra and request the customization. The technical support is very poor.
Collibra, as far as I know, does not have a connector for systems like Oracle or a mainframe. It's important to have a connector so that you have access to up-to-date information. Sometimes the data can be out-of-date as the updates are not automatic. Users could be looking at obsolete information. You need to be precise about the names of the fields and you have to develop them yourself. It's my understanding that they are working on a solution where you can import all the information that you need from a data validation tool, or from a CRM. It's something they really need to get better at. It would be better if there was a way to import all data and metadata automatically in one block.
I am a business person and a team leader. My duty is to ensure that the data governance processes are set up; that's how I started to use Collibra. There are certain limitations I have observed in Collibra. With regards to our data lake, Collibra doesn't give us direct connectivity to the Azure Data Lake. We have to establish data lineages. We have to browse those files manually and then connect them via Collibra; that's how data dictionaries get published. Overall, it's quite a manual type of process which needs a lot of human intervention. I've been hearing that tools like Talend are going to be available soon, which we hope to leverage in the near future. Talend is similar to other ETL or Informatica-type tools. It directly connects to the source system, captures all the transformation tools, and provides you with a spreadsheet that talks about data lineage, which can be fed into Collibra. If this functionality could be improved, it would be a great time-saving solution. It would require less effort and it would be a more automated kind of system, less dependent on human operation, which means that it would be less prone to errors as well. We create and issue the management of workflows with Collibra. In regards to workflows, I find that they can be made very simple. For example, a request goes directly to the person who is in charge of that particular asset and some simpler workflows can be assigned to it. Recently, I find that the default process of issue management in Collibra is really complex. It wasn't really helpful to us.
One problem is the data lineage, especially extracting the ETL transformation from different ETL tools and identifying how the data is getting changed from one layer to different layers and how the transformation is applied. It doesn't support all the ETL tools for extracting the transformation logic. It supports some of the tools, but there are still some tools that need to be supported. There is also a small pain point in terms of integration. There is a little bit of change in strategy from Collibra's end. Earlier, they used to offer two solutions. One was out of the box, and one was a custom-built solution for which they used to provide a dual connector. Now the focus from the Collibra side is more on using the out-of-the-box connector. They are discouraging custom integration. That leaves us with two problems. The first problem is that the out-of-the-box connector is not yet enabled for a lot of systems, and the second problem is that the out-of-the-box connector has certain limitations. If we want to tweak those as per our needs, it is not possible. However, the custom-built option is still supported, and you can still build a custom integration by using the API, but it is not very encouraged by Collibra. Its dashboard also needs to be improved. There are options to use HTML code to customize your dashboard, but it has a lot of limitations.
It should have more integrations with things like CyberArk because its main purpose is GDPR implementation. We have to have more scope for things that implement more privacy. CyberArk makes sure your credentials are vaulted and your things are secure when you're creating your integrations or connecting to an application. I do believe that they are working on this feature.
The workflows and the language they use need to be improved. Programming the needs for every user on the workflows is a key improvement that is required. In addition, they haven't updated their training solution in a while. We need to implement a lot of things ourselves and they want us to move to the cloud, but there are a lot of glitches in the system. There are three environments: stage, development, and production. Often things work well in the first two and then when you get to production, they don't work. It happens a lot and their response is slow.
We have an issue with metadata history. If someone changes the metadata, we can't see who changed it. But they are trying to upgrade the system with this feedback and are still working on it. We are still waiting for a proper log to maintain the solution.
The issue may be the way it's been implemented in my company but, for Collibra to be really useful, what's missing is an easy way to connect to different data sources and different types of data sources and actually ingest and profile some of that data. That's the trouble we've always had in getting wider adoption of the tool. Unless there's a mandate from the enterprise data office or the like, regular users are not going to use the tool for really robust business use cases without having some actual data in there. I know there is some out of the box capability for this, but I think it needs to be easier for Collibra to actually ingest and run some basic profiling on the data itself. That's currently missing from the tool.
Connecting to a data source is not very easy. If there's a firewall, it is difficult to connect to the database. It's not easy when you are configuring on the database. Right now, the client is decommissioning the MuleSoft integration and they're moving to APIs. Collibra Connect and MuleSoft integration were there before; however, now there's a move to API. Within a year or two, they will all move to API. Whoever is using it now with MuleSoft and Collibra Connect needs to find another way of connecting with the API. I don't think they are providing additional software for MuleSoft integration. Primarily, they are telling us, okay, we will decommission this and move to API. The only thing that's lacking in terms of the change is when connecting to a database. Sometimes the connection causes issues if the data is breaking the firewall and ingesting the data.
The breadth of available connectors for metadata ingestion needs to grow quickly to support customers as they expand their data governance programs to include a diverse list of source systems from which they want to derive business value. The connectors are needed to bring metadata into Collibra and enable lineage, workflows, definitions, etc. That said, this is not just a Collibra problem; this is an everybody problem. The central challenge is the availability of APIs to ingest text structural metadata, which is a common problem across any data governance platform or even any integration platform, honestly. To be fair, I would say that Collibra's purpose and primary value is as a collaboration platform, which is the core value of business-centric data governance, and not as an integration platform. For this purpose, they are clearly the leading solution.