Our primary use case is analytics.
We are putting fewer than 10 machine learning models into production, and do not currently run our models on a cloud environment.
It minimizes coding.
Our go-live process has been slightly enhanced compared to the previous programmatic process. There is now a faster time to production from the business end. We have C&DS, so we are able to drop the model streams in C&DS, then deploy them through there.
The visual modeling capability is one of its attractive features.
The biggest issue with the visual modeling capability is that we can't extract the SQL code under the hood. We have a lot of non-technical analysts that develop streams, then when we want to translate it to native SQL, we can't extract it without opening up each node.
We would like to see better visualizations and easier integration with Cognos Analytics for reporting.
It is not consistently stable. I hope they plan on improving it. C&DS is not stable at all.
SPSS Modeler should meet our needs going forward. It is very scalable for non-technical people. The challenge is for the very technical data scientists: they find it constraining.
C&DS will not meet our scalability needs.
I would not rate the technical support very well. The technicians can be hard to understand, and when you do reach someone, it is very hard to find somebody who can answer the technical questions.
It is very easy to set up. Once we deployed it and got the license code registered, it was fine.
We looked into SnapLogic, SaaS offerings, and open source. We chose SPSS Modeler because of its drag-and-drop capabilities; most of our business analysts are non-technical, so this was attractive to them.
We use it for data modeling like arithmetic modeling, bank modeling. We have different models such as loan models. We use three products, SAS, R, and SPSS Modeler to do predictive modeling. We are a big IBM shop.
I'm not sure how many machine-learning models we are putting into production. I'm new, I've been at the company for five months, but I would say this year there should be at least five or six models. We do a PoC on modeling and, based on what fits better, that's what we go with. So the bottom line is that a handful of models will go live but we'll be trying 10 to 15 models to do the predictions and see what best suits the company.
This is batch. We do monthly modeling, we do weekly modeling. It's not daily. We run weekly model reports too. We also change the parameters that we enter based on the industry, as things change.
We don't have cloud, it's all on-prem.
Our go-live process has changed compared to the previous programmatic, code-based process. It's not just the time to go live; the process itself, the improvement in performance, and maintenance are also important. I would say it has saved us a lot of time, about 20 or 30 percent of our time. I don't have the numbers in front of me but I think something along those lines.
We are big-time into data analytics. AI is another area we want to start looking at. Digital banking is important; we are looking more into digital banking and trying to put some features in there. I think the trend is more in the area of data analytics and digital.
I can't comment on our use of SPSS Modeler for governance and security issues.
We use the visual modeling capability in our analytics to drive productivity improvements.
New features are always welcome, but I’m not the core person. A separate team can comment on this, but not me.
There are issues; we try to mitigate them. There are always issues. We're trying to be stable but there are a few areas...
It's definitely scalable, it's all on the same platform, and it's well integrated. I think the integration is important in terms of scalability because, essentially, having the entire suite helps a lot to scale it and market it. Even in terms of processing, it's easier.
I personally have not had experience with IBM technical support, but the group has worked with them. I haven't heard anything negative from them, so I think it's okay.
We already had SAS, we had R. It's all legacy and it's all homegrown. But we were also an IBM shop.
I would say, look through every product in the market, like we do, and try to pick what works best.
I use it for my classes. One of the classes I teach is Advanced Analytics for students in the actuarial sciences area. My students are also using it for projects that they have to do as part of the process leading toward their degrees.
Before that, I was using it when I worked for IBM, as a consultant. I was doing a project for IBM in their analytics.
The main benefit is it makes things a little easier to do. If you want to solve a problem with R, for example, that's a lot more of a struggle. Essentially, R is a programming language. This package makes it more user-friendly, particularly for people who do not have a background in programming.
It's very easy to use. The drag and drop feature makes it very easy when you are building and testing streams. That's very useful.
I understand that it takes some time to incorporate some of the new algorithms that have come out in the literature in the last few months. For example, there is an algorithm based on how ants search for food (ant colony optimization). And there are some algorithms that have now been developed to complement rules. So that's one of the things we need to have incorporated into it.
Stability is very good.
I have no issues with scalability. It's pretty scalable. It makes pretty good use of memory. There are algorithms that take a long time to run in R, and somehow they run more efficiently in Modeler.
One of the things that I have not done in Modeler, and I'm not sure if the capability is there, is to run things in parallel. I'm pretty sure they have it but I haven't used it.
In a certain sense, I was involved in the initial setup. When I joined the university, I started to try to develop a joint agreement between IBM and the university. Because even two or three years ago, IBM was very reluctant to have universities use Modeler at no cost to the faculty or students. Now, fortunately, that has changed. Now our students can have a six-month license. That is very good. I was pushing for that when I was at IBM and then finished pushing for it when I joined the university.
Weigh the pros and cons. A lot of companies do not want to go with SPSS Modeler because of cost. What I have told some of my customers - I do some consulting as part of my job at the university - is, don't look just at the dollars and cents, look at benefits in your use case.
In terms of selecting a vendor, the most important thing to me is the availability of support.
Maybe I'm biased because I used it for a long time at IBM, but I would give it a 10 out of 10.
Building predictive models, including customer churn and lead generation.
Performance has been great. I've used it for about eight years or so, lots of flexibility. It continues to be a very flexible platform, so that it handles R and Python and other types of technology. It seems to be growing with additional open-source movement out there on different platforms.
We aren't putting that many machine-learning models into production. This is not the primary tool we use. This is more for me in terms of data exploration and knowledge discovery, that kind of thing. I really haven't done any production models in my current role. In previous roles I have.
In terms of cloud environments, it's actually a combination. Long story, but it's a combination of different things.
It's more for data, as a data repository.
My experience so far using Modeler is good. I haven't noticed any issues with our current solution.
I don't use it for governance and security issues or for visual modeling. For data visualization we use ThoughtSpot, Tableau, Power BI. In terms of the graphic capability, those are existing platforms that have a larger user base, so it's unlikely that we'll use Modeler exclusively for data visualization.
I really can't think of anything off the top of my head because I feel like I'm underutilizing it as it is, because we're doing specific things. Two or three years ago, I would've said R and Python integration, but they've done that.
I've been using it since it was called Clementine. Every version seems to be better than the previous, but I don't think I've ever had any catastrophic failures, or any bugs that were significant enough to not have a work-around available.
The scalability was kind of limited by our ability to get other people licenses, and that was usually more of a financial constraint. It's expensive, but it's a good tool.
I haven't used tech support recently. We used IBM designates for things like training and the like, which has always been very good, but I can't really think of any issue that required any technical support.
It's a solution that was available when I entered the role. I have heard from others who were in the process of trying to start from ground-zero, and the tendency for them is to go with open-source because of the revenue model, obviously.
I would say, if you're considering an open-source solution, definitely consider Modeler as well. Put together some kind of proposal that allows you to figure out how much time it's going to take individual people to create those models, versus being able to have an out-of-the-box solution that gets your team going more immediately.
Support is another benefit of going with Modeler over open-source. SPSS has been around for a long time. IBM acquired them, and they've added functionality and features to meet the needs of growing data science populations.
The primary use case is to augment our sales processes, to help our call center determine which customers to call, which products to push to those customers.
Thus far it's been pretty effective. In a recent sample that I pulled, it successfully predicted two-thirds of our sales in a given week.
We're running batch, overnight, and I believe we have three machine-learning models in production at the moment.
We have separate models for our US call center and our UK call center. Each one is designed to do a customer recommendation, where it determines which customers should be ready to buy today, based on the recency of their last purchase, how frequently they purchase. And then it scores the opportunity with that customer, based on how much money they spend with us. It gives the salesmen a ranking of which customers are their biggest opportunity on that day, and they just go down that list and call them. It generates pretty good sales.
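Conceptually, the daily ranking resembles the following SQL. This is only a sketch under assumed names: the customers and customer_sales tables, their columns, and the 12-month window are all illustrative, and the real scoring is done by the Modeler stream rather than hand-written SQL:

    -- Recency, frequency and monetary value per customer, ranked
    -- per sales rep so each rep gets a prioritized call list.
    -- Date-function syntax (DATEDIFF/DATEADD) varies by database.
    SELECT c.customer_id,
           c.sales_rep,
           DATEDIFF(day, MAX(s.order_date), CURRENT_DATE) AS recency_days,
           COUNT(s.order_id)                              AS frequency_12m,
           SUM(s.order_total)                             AS monetary_12m,
           RANK() OVER (PARTITION BY c.sales_rep
                        ORDER BY SUM(s.order_total) DESC) AS call_priority
    FROM   customers c
    JOIN   customer_sales s ON s.customer_id = c.customer_id
    WHERE  s.order_date >= DATEADD(month, -12, CURRENT_DATE)
    GROUP  BY c.customer_id, c.sales_rep
    ORDER  BY c.sales_rep, call_priority;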
And then we have a second model that does item recommendations, based on some association modeling. The association model tells the sales rep what product that customer should be buying, based on their sales purchase history.
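At call time, applying the association model can amount to a lookup against a rules table. A minimal sketch, assuming a hypothetical assoc_rules table (antecedent_item, consequent_item, confidence) exported from the model and a recent_purchases table for the customer being called:

    -- Recommend items implied by what this customer already buys,
    -- strongest rule first. All names here are illustrative.
    SELECT r.consequent_item,
           MAX(r.confidence) AS best_confidence
    FROM   assoc_rules r
    JOIN   recent_purchases p
           ON p.item_id = r.antecedent_item
    WHERE  p.customer_id = :customer_id   -- parameter for the customer
    GROUP  BY r.consequent_item
    ORDER  BY best_confidence DESC;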
We're on-prem. I find the on-prem to be a pretty seamless experience. Data flows directly from our data warehouse into the Analytics Server, and then we're able to push the scores back to the data warehouse for deployment into our CRM system.
The benefits are that this product makes us a more efficient sales staff. We're reducing the inefficiencies in the buying patterns of our customers, by calling them when we know they're ready to order, instead of waiting for them to call us. It makes us more effective in our calling practices as well. We're not just cold-calling anymore, we're actually calling customers we know are ready to buy.
In terms of our go-live process changing, I believe we're following some pretty standard practices there. I don't think we've changed too much, other than which servers we were using as production servers.
I think the ease of use in the user interface is the best part of it. The ability to customize some of my streams with R and Python has been very useful to me, I've automated a few things with that.
We don't use SPSS Modeler for governance or security issues.
Regarding visual modeling, it is not the biggest strength of the product, although from what I hear it's going to be a lot stronger in the latest release. I'm excited to see what they have coming down the line, because I know that's an area they've focused on in the most recent release, and I'm not on the recent release yet. I haven't really been able to leverage it to make any productivity improvements with our data science or analytics teams. Most of my visualization gets done through Cognos.
Like I said, I'm really excited about the enhanced visualization that I know is coming down the pipeline. That, to me, has always been the biggest flaw in using this. It's very difficult to get good visualization.
I think mapping for geographic data would also be a really great thing to be able to use.
Also, I think it could be marketed better, actually. I think there's a lot of confusion among customers about whether they should be using SPSS Modeler, or DSX. And even some of the partners I've spoken to about it, they've given me some conflicting opinions on which one I should be using at my level of scale.
I haven't had many issues with stability. The only stability concern I ever had was with certain credentials: if a job failed multiple times, it deactivated the credentials, and it then became a whole process with IT to get the credentials reactivated and the stream running again.
Scalability is infinite, because it can just spit out straight to our enterprise data warehouse, and we can use that to deploy anywhere.
I haven't needed technical support. The product works pretty well.
I came to the World of Watson Conference in 2015, and when I saw SPSS Modeler and what it could do, I just sampled it, and it really, to me, spoke volumes about some of the inefficiencies in the way we were doing business. And, as a brand new BI practice at a company that never had one before, I was just trying to build my practice from the ground up, and I didn't want to limit it to just BI reporting, so I took on the challenge of bringing in this new software, and staking my reputation on it, and it's paying off.
The reasons we eventually chose this solution were that we were offered a very good deal on the Gold package, which gave us more capability. I think without Collaboration and Deployment Services it wouldn't have been a worthwhile investment for us; it would have failed on the deployment. So that deal on the Gold package really sealed it for us.
What's most important when selecting a vendor is the proven practice of the product. Knowing that the product has had success for numerous other customers in the past for similar use cases, for similar types of customers. I think knowing that there are a variety of partners out there with expertise in the product is a very strong selling point for me. I don't like going to things where I can't get help, if I get stuck.
It was a little complex, but the person we work with, Chris Thomas, did a fantastic job walking us through it.
There were just a lot of steps and components to it. We bought the Modeler Gold package, so we had to consider C&DS and ADM; we had a whole bunch of different components that had to be set up simultaneously. And when upgrading, we have to upgrade all of those components simultaneously in order to keep using it.
We ended up working directly with an IBM partner, but we also worked with Revelwood and LPA.
I'd give it a nine out of 10. I really think that for someone who is not the strongest programmer on the planet, but is trying to learn and trying to put together some of these basic data science projects, it's a really valuable tool, the UI is very user friendly. So, it definitely launched my journey into becoming a data scientist, and three years later I'm becoming a lot stronger with it.
In terms of advice, the right partner can make all the difference. You need somebody who you can bounce questions off of when you get stuck, because you're going to get stuck, it's just inevitable. If you haven't implemented data science and predictive modeling before, you're always going to hit a challenge that is unique to your data, or to your process, and you need somebody who can lend the weight of experience to just talk you through it.
People data, survey insights, HR analytics, nominal data, relational data, structural equation modeling (SEM), logistic regression using nominal or ordinal groups.
Quickness and ease of use with the guarantee of robust modeling techniques and trustworthy accuracy.
Quick insights.
Easier coding language that is more flexible with other platforms. More server capabilities. More graphics.
Customer segmentation and churn analytics.
We get the best results in customer segmentation and churn analytics, and we have retained our customers. Our retention score has improved as a result of these projects.
We haven't used machine learning solutions yet.
Our business units' capability with SPSS Modeler is high. They no longer waste time on modeling and algorithms, meaning they are not coding anymore. For example, segmentation projects now take one to three months, rather than six months to a year, as before.
In the future, SPSS and Cognos Analytics will be integrated. We will be using the two products together.
We have not yet used IBM SPSS Modeler for governance and security issues.
It would be helpful if SPSS supported open-source features, for example, embedding R or Python scripts in SPSS Modeler. We don't need that now, but in the future it may be useful.
We haven't suffered from any stability issues. It's a stable product.
We haven't had any performance problems. The product handles every data volume we give it with good performance and produces results.
We are doing our solutions in-house, but sometimes we require local support from IBM partners, though not too often. We are happy with the support the partners provide.
We have SPSS know-how in our company, and other products are not as stable as SPSS. Also, we have local support in Turkey.
Straightforward. It was not complex.
Oracle and SAP. SPSS, however, is widely known and widely used in Turkey. University students learn it, so it's easy to find professionals to work with it.
You should analyze your needs and your data, your projects. There is a lot of choice in data analytics. Which one is best depends on your needs and your budget. It depends on what you are looking to achieve.
IBM SPSS has been supporting in-database analytic modeling for a while now. Their objective is to make it possible for analysts to run the complete data mining process end-to-end in-database – from accessing the data to data transformation and model building/scoring. In particular, they try to enable analysts to push data transformation and data preparation into the database as these are typically a big part of data mining projects. To achieve in-database execution they provide three main features – SQL Pushback, direct access to a database’s own analytic modeling routines and model deployment/scoring options.
To build a predictive analytic model in IBM SPSS Modeler, an analyst creates an analytic workflow. This consists of multiple tasks or nodes to read, merge or transform data; split data into different test sets; apply modeling algorithms and more. SQL Pushback takes the nodes in this workflow that relate to data access and transformation and pushes them to the database. The tool generates the SQL you need for these steps and executes that SQL on the database from which you sourced the data. This SQL is specific to the database concerned for the main supported databases (IBM DB2, Microsoft SQL Server, Netezza, Oracle, Teradata) and generic SQL is available for many nodes for other databases.
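To make the idea concrete, here is a hedged illustration (not Modeler's actual generated output) of what a Derive node, a Select node and an Aggregate node might collapse into when pushed back; the accounts table and its columns are hypothetical:

    -- Derive node:    utilization = balance / credit_limit
    -- Select node:    keep rows where utilization > 0.8
    -- Aggregate node: average utilization per region
    SELECT region,
           AVG(balance / NULLIF(credit_limit, 0)) AS avg_utilization
    FROM   accounts
    WHERE  balance / NULLIF(credit_limit, 0) > 0.8
    GROUP  BY region;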
IBM SPSS Modeler also reorders work streams to maximize the effectiveness of this SQL, particularly in terms of keeping the data in the database. For instance, if multiple nodes that can be executed in-database are separated by one that cannot be, the nodes will be reordered to group the in-database nodes where this is possible.
When in-database execution is possible for a node in the workflow, it is color-coded purple to show this – modelers with strong database servers will try to turn “all the nodes purple” so that everything is being done in-database. Some customers write raw SQL to use more extended functions, such as statistical functions, that would not automatically be pushed back. SQL Pushback can be turned off so that high-load production environments don't get slowed by in-database analytics, and users can decide to cache intermediate results in a database table simply by selecting a node and asking for caching.
The second element of in-database analytic modeling is to build the model itself in-database. For this, IBM SPSS Modeler uses the analytic routines in Oracle (the ODM algorithms), Microsoft SQL Server, DB2 and InfoSphere Warehouse, as well as (since the 14.2 release in June) Netezza. These in-database algorithms are presented as new node types in the workflow, allowing a modeler to simply select them as part of their usual workflow. In addition, IBM SPSS Modeler has its own algorithms that can be used on the modeling server. The in-database algorithms allow data in the database to be scored, and some allow the model to be calculated live when the record with which it is associated is retrieved. These in-database algorithms are typically parallelized by the database vendor, and IBM SPSS Modeler inherently takes advantage of this.
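As a concrete illustration, once an in-database model exists in Oracle, it can be invoked from ordinary SQL with Oracle's PREDICTION functions. This is a minimal sketch; the model name churn_model and the customers table are hypothetical:

    -- Score every row with a hypothetical in-database ODM model.
    -- PREDICTION and PREDICTION_PROBABILITY are Oracle SQL functions;
    -- USING * feeds all columns of the row to the model.
    SELECT cust_id,
           PREDICTION(churn_model USING *)             AS churn_flag,
           PREDICTION_PROBABILITY(churn_model USING *) AS churn_prob
    FROM   customers
    ORDER  BY churn_prob DESC;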
IBM SPSS Modeler supports a number of other deployment options besides the use of these in-database routines. A number of the standard IBM SPSS routines can generate SQL for scoring in-database – the model is built outside the database but the SQL allows the scoring to be done in-database once the model is built. Several of these routines support parallel execution on the modeling server. Models, no matter how they were built, can also be deployed using Scoring Services and made available using a web services interface for live scoring. Models can also be deployed using IBM SPSS Decision Management.
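The generated scoring SQL for such a model is essentially an expression over the input columns. Purely as a sketch of the idea (not Modeler's actual output), a small decision tree built outside the database might score in-database as something like:

    -- Illustrative only: a tiny decision tree rendered as a CASE
    -- expression. Table, columns and split values are made up.
    SELECT account_id,
           CASE
               WHEN tenure_months < 6 AND num_complaints >= 2 THEN 'churn'
               WHEN tenure_months < 6                         THEN 'at_risk'
               WHEN monthly_spend  < 20                       THEN 'at_risk'
               ELSE 'retain'
           END AS predicted_segment
    FROM   accounts;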