What is our primary use case?
I use IBM Spectrum Scale to improve performance in data processing tasks. I have employed it in various projects, including a major project at Leading German Bank focused on the improvement of the FDW (Finance Data Warehouse). The primary goals were to reduce elapsed time and costs for regulatory reporting.
What is most valuable?
I find IBM Spectrum Scale to be an excellent product known for its fast parallel file system. It achieved the best results when integrated with IBM hardware. Even though it is complex, it provided significant performance advantages in large-scale data management. Its fault tolerance mechanisms and integration capabilities make it popular for extensive AI initiatives and data processing tasks for organizations like Daimler Benz and Bosch.
What needs improvement?
There can be improvements in fault tolerance and making erasure coding faster.
Additionally, it would be beneficial to have example configurations or applications that assist in setup and configuration.
For how long have I used the solution?
I have used it for nearly four years in various projects.
What do I think about the stability of the solution?
I find it stable, however, erasure coding can slow down the file system.
What do I think about the scalability of the solution?
It scales well, however, it requires significant resources to avoid slowdowns from erasure coding.
How are customer service and support?
I received excellent technical support. Customer service in my projects, particularly with the Leading German Bank project, was excellent.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
Previously, I evaluated and used Hadoop with Leading German Bank from 2016 to 2019, along with other products like Apache Spark.
However, I found IBM Spectrum Scale gives better performance results.
How was the initial setup?
The initial setup is complex, especially if trying to avoid erasure coding, as it requires more discs. Avoiding erasure coding can significantly increase costs.
What's my experience with pricing, setup cost, and licensing?
IBM Spectrum Scale is very expensive with complex pricing models usually based on the amount of storage used or the number of servers. Using Excelero for mirroring storage helped reduce Spectrum Scale licensing costs by about half during the Leading German Bank project.
Which other solutions did I evaluate?
I evaluated Weka and Apache Spark, however, I found IBM Spectrum Scale to offer better performance for large-scale installations.
What other advice do I have?
Although I used to be a big fan of Weka, due to recent limitations and lack of product improvements, I prefer to recommend IBM Spectrum Scale over others.
I rate this product ten out of ten.
Which deployment model are you using for this solution?
On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.