We developed a Product Information Management (PIM) system tailored for e-commerce. This homegrown system has since evolved into a Software-as-a-Service offering available on the marketplace. The primary challenge it addresses is managing the vast amount of data associated with products.
In this context, major e-commerce players like Amazon handle billions of images and regularly update metadata and product attributes. The details visible on platforms such as Amazon, Alibaba, or Walmart, including product specifications like laptop configurations, processors, and RAM, add up to an enormous volume of information.
For this use case, a key requirement was a reliable system with high uptime that did not rely on the traditional master-slave or leader-follower architecture. We initially chose Cassandra to meet these criteria, but we had difficulty finding skilled developers and database administrators. Consequently, we decided to transition to MongoDB to address these skill constraints.
Using Cassandra for real-time data analytics has been pivotal for our e-commerce platform. Because the platform operates 24/7 and serves sellers and customers alike, real-time updates are essential.
For instance, when a customer leaves a comment or feedback on an image, they expect it to appear on the portal immediately. Similarly, sellers who alter product attributes or update images expect those changes to be visible instantly.
Handling large data volumes with Cassandra has been an excellent experience. We did face challenges with the data influx, but these stemmed from middle-layer issues rather than from Cassandra itself. In general, it scaled well with our workloads, thanks to its horizontal scaling capabilities. We could easily add new nodes to the system as needed, which let the platform cope with increasing loads.
The tool's most beneficial feature for scalability is its overall architecture. There is no single point of failure and no leader node in the cluster, which is what makes it scale so robustly. This was the key factor in our decision to opt for the Cassandra ecosystem.
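To illustrate why a leaderless design makes adding nodes cheap, here is a minimal sketch of a consistent-hash ring, the general technique behind this kind of architecture. It is a toy model and not Cassandra's actual implementation: the `Ring` class, the node names, the MD5-based hash, and the virtual-node count are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class Ring:
    """Toy consistent-hash ring: every node is a peer, none is a leader."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes      # tokens per node (illustrative value)
        self._tokens = []         # sorted token values on the ring
        self._owner = {}          # token -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # Stand-in hash for the sketch; first 8 bytes of MD5 as an integer.
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def add_node(self, node):
        # A new node claims several tokens spread around the ring.
        for i in range(self.vnodes):
            token = self._hash(f"{node}#{i}")
            self._owner[token] = node
            bisect.insort(self._tokens, token)

    def node_for(self, key):
        # A key belongs to the first token at or after its hash, wrapping around.
        idx = bisect.bisect_left(self._tokens, self._hash(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


keys = [f"product-{i}" for i in range(10_000)]
ring = Ring(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # scale out by one node
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved to the new node")
```

In this sketch, only the keys that the new node takes over change owner; everything else stays where it was, and any node can compute a key's owner without asking a coordinator.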
In terms of performance, it handled approximately 1.6 billion requests per day. We achieved this on AWS using EC2 instances, about four to five years ago.
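For a sense of scale, the daily figure can be converted to an average per-second rate. This is a back-of-the-envelope calculation that assumes the load was spread evenly across the day; real traffic is bursty, so peak rates would have been higher than this average.

```python
REQUESTS_PER_DAY = 1_600_000_000   # approximate figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400 seconds


def average_rps(requests_per_day):
    """Average requests per second, assuming a perfectly even load."""
    return requests_per_day / SECONDS_PER_DAY


avg = average_rps(REQUESTS_PER_DAY)
print(f"{avg:,.0f} requests/second on average")  # prints "18,519 requests/second on average"
```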