I designed a product catalog data model in Cassandra based on the products' features and properties, loaded millions of records, and ran the required queries against it.
I am now getting much better performance than I did with relational databases.
Here are some features which have been really helpful in my organisation:
High availability (HA) is one of Cassandra's great features: because of its node-to-node ring architecture, data stays continuously available without downtime.
They could improve performance when fetching data from high-volume data sets.
I had to compare Cassandra with MongoDB. MongoDB is much better at data fetching than Cassandra.
This product is used as the storage facility for a very-high-throughput application, where a lot of NoSQL data is being captured and it needs to be processed.
The clustering needs to be better; it is getting there.
I have used it for four years.
Cassandra is stable thus far; the problems that I have encountered were with CQL and the JSON support in CQL.
The scalability is good, as long as you understand how to set up the nodes.
I did not have any interactions with technical support, because I was able to find answers to my questions by searching online.
The other solutions that I have used were SQL engines, but for this project, Cassandra was determined to be the better solution.
Setup was very straightforward.
Pricing and licensing depend on what you are doing. If you are using it for major production work, I recommend that you purchase the level of support that you need.
This was the only product that was evaluated.
Learn how many nodes you are going to need and set up the right level of replication.
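As a rough aid for that sizing exercise, here is a minimal Python sketch of the arithmetic behind replication choices. The function names are illustrative and not part of any Cassandra tool. It relies on two general rules: a QUORUM is a majority of the replicas (floor(RF / 2) + 1), and a read is guaranteed to overlap the latest write when the number of replicas read plus the number written exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority (QUORUM) for a given RF."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


def reads_see_latest_write(replication_factor: int,
                           read_replicas: int,
                           write_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    for rf in (1, 3, 5):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerated failures={tolerated_failures(rf)}")
```

With a replication factor of 3, for instance, a quorum is 2 replicas, so one replica can be down while quorum reads and writes continue to succeed.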
Our primary use case for the solution is testing.
The stability of the solution and the available documentation can be improved. The solution is limited to linear performance, which should be improved in the next release.
We have been using the solution for approximately one year and currently use version 4.11.
I rate the stability a six out of ten.
There is no customer service and support because it is an open-source tool.
The initial setup was difficult because there was no proper guide to assist with the installation process. Therefore, I rate the initial setup process as seven out of ten.
The application is open source, so we do not pay for it.
I rate the solution a six out of ten because I haven't found any consistency in its performance, which is not aligned with what we see on the back end. The solution is good, but its documentation can be improved.
I really appreciate the high availability, automated replication, linear scalability, and automated region fail-over.
We've used Apache Cassandra for solutions that we sell to our customers. It's used as our cloud based backend store as a temporary cache and for storing data that streams through our data pipe. It's an excellent high speed store.
Out-of-the-box monitoring, troubleshooting, and maintenance are involved. There are several utilities/interfaces available for use, but you have to educate yourself and learn the intricacies of managing a Cassandra cluster.
For example, we recently hired a consulting firm to make recommendations on how to approach maintenance and the health of the cluster and we're learning from that experience.
I have used this product for approximately two years.
We have had stability issues including out of memory issues and crashes with earlier versions of the product.
We’ve not really had scalability issues, but scalability is solved by advanced tuning or adding nodes.
We use the open source version, so support falls largely to our own group of developers and to public forums/user groups. For example, I'm a member of the Cassandra user mailing list.
This is a new cloud based enterprise product, so there weren't previous solutions.
The setup is not terribly complex, but a learning process was involved.
We use the open source version, so it's free. Cost estimates need to take into account in-house maintenance and support, as that can get involved.
We looked at Hadoop, Spark, Spark Streaming, and MongoDB.
If you plan to use the open source version, make sure you hire a Cassandra expert or train yourself in the internals of Cassandra.
Ability to achieve write speeds of 10k TPS: compared to our existing solution, this is 300% higher.
Row-level locking is not available; it would be very helpful in update use cases.
I have used it for the past two years.
Older Cassandra 2.x versions had problems when restarting nodes and rejoining them to the ring. From Cassandra 3.5.x onwards, this issue was resolved. We are free to stop and start any node without any issues.
no
no
Our earlier solution used Couchbase, which relies on leader selection. At times, when leader selection took too long, we would lose transactions. This was resolved by Cassandra's peer-to-peer architecture.
Initial setup is straightforward. If you want to set up clusters at scale, a centralized tool would be a great help.
We implemented it in-house.
Our primary use case is developing software for others, and it's really a solution for enterprise-size companies. We act as integrators and have numerous technical partners that implement it. We have a partnership with the company and implement the service on projects. I'm a managing director of the company.
The most valuable feature for us is the technical evaluation; it's the best technology. Cassandra is good for us.
The interface could definitely be improved. It's a technical database, and for me the features are not user friendly. I also think it's quite an expensive solution, and I hope that over time, with more implementations, this will improve.
I've been using this solution for two years.
This solution is stable.
This is a scalable solution.
From what I know, customer support is fine.
The initial setup is a little complex, and we use a specialist for each deployment. How long deployment takes depends on the nature of the implementation.
We don't use this solution like a common database. It's really for people using big data, BI, and other analytics software. You need to have the right use case to adopt this product.
I would rate this solution a nine out of 10.
What progress have you seen in the clustering so far? What progress would you like to see in the future?