What is our primary use case?
We use it as a data storage platform for several proprietary applications that we have designed and built and now support. We generally use it to be able to scale so that our customers can search a sizable amount of data. We have millions of records that include an extensive amount of text.
How has it helped my organization?
We implemented Azure Cosmos DB for our tokenization process. We had originally built a data dictionary to be able to tokenize different words within our MongoDB collection, but over a period of three years or so, we found that there were some limitations to doing that. The data dictionary needed to be updated, so we turned to the vector search feature because it essentially allows us to measure the similarity between words. Those sorts of comparisons could be done very easily. There is the ability to tokenize words, which we then use in the search functionality provided for the users of our applications. It has helped us improve the search functionality of our applications.
I landed on it as an architectural component of one of our first solutions. I went in expecting its benefits. It delivered the benefits of being able to quickly scale and being able to support semi-structured, unstructured, and structured data sets or data properties. All of these aspects are supported. We were able to realize its benefits early on.
We have used the vector database with Azure AI services. It works fine. They are embedded vectors. We are running some text through Azure AI. It then returns these embedded vectors, and we store those. We are able to use those vector values or vectors to determine the similarity between various words that are being searched in our applications.
The Azure Cosmos DB's ability to search through large amounts of data is excellent. It is fantastic. We have benefited from it. It is great.
What is most valuable?
There are a number of different APIs or data storage supported in Azure Cosmos DB. Specifically, we are using the MongoDB API, so we leverage it in that way. I like the flexibility that it offers. My team does not have to spend time building out database tables. We can get going fairly quickly with being able to read and write data into a MongoDB collection that is hosted inside Azure Cosmos DB.
I found it very easy to use. We have been using it for five years, so it is quite flexible for us. The ease of use is quite high for us.
What needs improvement?
It is easy to use, but optimization has been a mixed experience. It has been more of trying to figure out how to do so. We have not found much support there, so we have to come up with our own way of optimizing it in different ways. That is one area of improvement. We would like to have more tools that support the optimization of Azure Cosmos DB. There are not many tools out there. We have had to develop our own tools internally, such as a clean plan or query plan, and look at the index usage, throughput, and those sorts of things. The portal experience for optimizing and monitoring the service needs a few enhancements.
I am looking at it through the lens of MongoDB API. There are four or five other APIs that are supported in Azure Cosmos DB. The MongoDB API experience could be improved substantially by having a more user-friendly set of administrative tools so that I can go out there and query the documents that are part of the collection. Currently, I am working around that by using third-party tools like 3T. I also use MongoDB's Compass client tool. They can make this part of managing the database or the collection a lot easier by providing some built-in toolsets, similar to what is offered by Azure with Azure Data Studio. That is a big area for improvement.
We should also be able to better manage the cost. There have been some improvements there, but there is still room for improvement in terms of how costs are managed through the Azure portal relative to Azure Cosmos DB. To me, it is one of the more expensive services out there depending on how it is being leveraged.
One of the key limitations is that only so many vectors can be supported. It does not work very well with the large amount of text that has to be embedded in the vectors. That is one limitation we have run into with the feature set.
For how long have I used the solution?
What do I think about the stability of the solution?
The SLA is pretty good. We have been able to at least get past 99.9%. We are probably closer to 99.99%. So, overall, it has done well over the five years. Just like with most things, there were a few instances where it crashed or was not available. Those instances are memorable, but they are few and very far apart.
Its latency is good. The availability is also good.
What do I think about the scalability of the solution?
Its scalability is good. However, redundancy does not work very well. Redundancy is having a set of backups. It is also a part of high availability. We have used some of the redundancy features in Azure Cosmos DB, and it created problems for us consistently. We recently had to move away from having a redundant copy of the data and just having a single copy. Of course, we have adequate backups.
Through the lens of MongoDB API, the scalability can be better. However, the limitations are core to the actual platform. MongoDB is not designed to scale horizontally, so that is how Azure offers it. It scales vertically which means that I can go and request more compute and more memory RUs for the instance that I am using. If I was supporting multiple workloads that had different read/ write patterns, it would work, but it is not designed to do that well at its core, as I understand it. It is less of a function of Azure Cosmos DB and more of a function of MongoDB itself.
The dynamic scaling has helped decrease our organization’s overhead costs. We are able to scale up during business hours, or when there is demand, we scale automatically. There are some tools that we have built and some processes that do that. We can also scale down during non-business hours or when the demand drops for the database or the data store. It has helped to manage scaling costs.
How are customer service and support?
I contacted their support this year when we had some issues. When it comes to customer service, it always comes down to the person you get on the phone or who picks up your ticket.
Regarding Azure Cosmos DB, we felt frustrated when we needed that support from Microsoft. It has not been there because the things that we are dealing with are generally more complex than most customers would have to deal with. At times, the representatives or engineers we got or who picked up the tickets we submitted did not have the breadth of experience needed to support us or resolve the issues. So, we resolve the issues ourselves the best we can.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
We use some of the alternatives because it does not solve everything. There is no such thing as one perfect data store. We use Azure SQL instances. We use SQL and VM in Azure. We have started doing a lot more Postgres, which is the flavor of the time. Everybody seems to be moving to Postgres all of a sudden.
Synapse is another tool. It is another Azure service that we use. It just depends on what type of data we are using and what makes sense in terms of the implementation of the application. Those are some of the alternatives that we have used.
How was the initial setup?
Its deployment is easy. Setting up the service is easy. You make a decision around where you want to deploy and those sorts of things. There is a lot of pointing and clicking. That was very easy.
Taking it to production was a lot harder. It was a lift to get the data loaded into Azure Cosmos DB. At the time, there were about 750,000 resumes that we uploaded for a customer, so it took a lot of time. We had to build a custom app to load those documents on data records into Azure Cosmos DB. We had two people working on it for two weeks. We probably spent somewhere around 60 hours around the lift to get that all loaded up and going in Azure Cosmos DB.
It took us about 18 months to feel fully confident in working with this and reach a level where we can go and teach others. We feel that we have got a firm grasp of the service after about 18 months of production support.
Its maintenance has been taken care of by Microsoft. However, at the end of last year going into this year, there were a few disruptions with the service that hampered our customers or users. There have been times when the service went down, or the service was upgraded but the SDK or NuGet packages used to support or connect to Azure Cosmos DB were not in sync. Overall, Microsoft takes care of the maintenance.
What about the implementation team?
It was all done in-house. We had two people involved in it, myself and my lead developer. It was mainly about loading data into Azure Cosmos DB. That was a big lift.
What was our ROI?
Azure Cosmos DB helped decrease our organization’s total cost of ownership. It is hard to provide the numbers, but managing the data store is easier for us with Azure Cosmos DB with the MongoDB API because there is no need for a DBA. We do not have a DBA on the team who is just taking care of the indexes and making sure that the database is healthy. It pretty much just runs. If we had a DBA in the team specifically for MongoDB, we would be paying about 150,000 dollars a year. We have to somewhere in the neighborhood of 50,000 dollars for the service in Azure. In terms of the total cost of ownership, it saves us about 100,000 dollars.
What's my experience with pricing, setup cost, and licensing?
Pricing, at times, is not super clear because they use the request unit (RU) model. To manage not just Azure Cosmos DB but what you are receiving for the dollars paid is not easy. It is very abstract. They could do a better job of connecting Azure Cosmos DB with the value or some variation of that.
What other advice do I have?
Overall, I would rate Azure Cosmos DB an eight out of ten.
Disclosure: PeerSpot contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor. The reviewer's company has a business relationship with this vendor other than being a customer: Partner