I have wide experience with SPSS Statistics. I have even written books about it.
Statistics is part of analysis, but SPSS Statistics goes beyond straight statistical analysis. It also has modeling features and some automation, and it can build models such as linear regression. Using a dedicated modeler for your analysis is a completely different experience. A modeler is designed specifically for building models of data in an automated way and using those models for prediction and forecasting. It can help you see the data and its meaning in different ways.
As a statistician, I was originally trained in both modeling and non-parametric analysis. Most modeling tools are completely different. They have no non-parametric analysis because they are meant for large samples or big data analysis.
The primary use for SPSS Statistics for me is, of course, data analysis.
IBM SPSS Modeler is a set of data mining tools that enable us to develop predictive models. It has both statistical techniques (like regression) and non-statistical techniques (like classification trees, neural networks, etc.). It mostly leaves out the non-parametric tests, maybe because those tests are meant for small samples.
It has automation nodes that help non-experts pick the best technique.
On the other hand, IBM SPSS Statistics is used mainly to analyze data, whether it is a small or a large sample. If the sample is small (and drawn from a non-normal population), then we must use the non-parametric tests. It has “automated” non-parametric tests to be used by both experts and non-experts.
Besides its main use for data analysis, it also has some modeling techniques like automatic linear modeling, classification trees, and neural networks.
This product provides the opportunity to explore data better and to produce a complete data analysis because it works with both modeling and non-parametric analysis.
I like SPSS too much to want to make changes to the product in a major way. There are some minor things. The one thing I can think of that will be useful in a broader sense is that it may be nice to be able to add color to some of the data in sheets or reports. Sometimes it is easier for you to visualize results if you can use color inside the datasheets.
The original concern or intent of products like this is to analyze data, not to manipulate the data. But there are times when you need to manipulate data. This is something that would help more in data entry because there should be ways to fix errors in manual entry, et cetera. Addition of some data manipulation capabilities may be good for some things, but often you would do that outside of SPSS as it is not the purpose of the tool.
SPSS Statistics would be more efficient if it used additional automation. The advantage of automation in SPSS is obvious, and it is exemplified by the Modeler, the non-parametric tests, and the linear regression. If they add more automation in other areas, I think the product will be more powerful. For example, there could be additional automation for serious data science, or for something like EMA (Ecological Momentary Assessment). This would be a very interesting addition to SPSS Statistics.
They could also add some automation to regression itself, such as logistic regression and other types of regression. Some people do not know the difference between binary logistic and multinomial logistic analysis. This is an advantage of SPSS Modeler: it is automatic and can detect which type of model to choose. If the target variable is binary, it uses binary logistic regression. If the target has more than two categories, it uses multinomial logistic regression. It detects this and makes the choice automatically.
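The kind of automatic choice described above can be illustrated with a minimal Python sketch. This is not SPSS Modeler's actual logic, which is not documented here; the function name `choose_logistic_model` and the rule of counting distinct outcome categories are assumptions made purely for illustration.

```python
from typing import Hashable, Iterable


def choose_logistic_model(target: Iterable[Hashable]) -> str:
    """Pick a logistic regression type from the number of outcome categories.

    Hypothetical helper for illustration only; not an SPSS API.
    """
    # Collect the distinct outcome values, ignoring missing ones.
    levels = {value for value in target if value is not None}
    if len(levels) < 2:
        raise ValueError("target needs at least two distinct categories")
    if len(levels) == 2:
        return "binary logistic regression"
    return "multinomial logistic regression"


print(choose_logistic_model(["yes", "no", "yes"]))
print(choose_logistic_model(["low", "medium", "high", "low"]))
```

A real tool would likely also check the measurement level of the target (for example, an ordered outcome may call for ordinal regression), which this sketch leaves out.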
So, if SPSS Statistics starts to use some additional automation, it will be a step between statistics and more serious data science. Those powers would be remarkable. Data science is the future, in my estimation. More of the world of business, finance, and other areas is starting to think about the applications of data science. And data science needs some data engineering, which can be realized through automation. This is especially true for things like data tracing.
If they take those steps to add some automation to SPSS Statistics, it will push more people to understand the importance of analytics and advanced data science.
But along with automation, I believe that IBM has to do more, within the application and as a company, to stress the importance of learning statistics before learning SPSS itself. If SPSS added some short notes about how to use statistics, in the product or through an integration, that would help users. It is not enough to be able to push buttons to get an analysis. That gives you a result that you will not understand.
SPSS could offer guidance that goes beyond just running the analysis. For instance, if you choose to do a logistic regression, you would find a short note about what logistic regression is and then how to use it within SPSS. It should not be integrated with the menu itself; it should be inside the statistical analysis area. Most of the people using SPSS Statistics know which model they should use, but that is not always the case. It could be a learning tool as well as a productive one.
A few years ago, SPSS made something that could be useful. It is called the Statistics Coach. It asks you some questions, and based on your answers, the coach guides you as to what to use. It was a good statistical coach. If they can summarize the most important tests, I believe it would not be too hard to make something very useful inside the application and the Modeler itself.
If you have experience with data science, you will know how to do what you want to do using different tools, even products that compete with SPSS. I can understand if somebody has to ask me how to do a two-way ANOVA in SPSS, for example, as it is not an obvious thing. I will guide them through it if they need to do it. SPSS could clarify how to do things in the application, especially analyses that are in common use. In a way, the tool itself could provide more training for the people using it.
I have been working with IBM SPSS Modeler since its name was Clementine. I think that was version 11. Now it is version 18. I think I have been working with it for approximately eight years.
I have not used IBM technical support for SPSS Statistics directly. I have been working with SPSS since version eight, back in the eighties or early nineties. It is now version 27, so I have been working with the product for about 20 versions. I have become very familiar with this product and all its changes over that period of time. Because of that, it is not necessary for me to contact customer support.
I believe that the initial setup is simple. The problem with SPSS Statistics is not clicking the icon to do the installation. The problem is knowing which statistical analysis you have to use.
Some people say that they know how to use SPSS Statistics, but they do not know which type of analysis to use. The manufacturer of the tool may want to include some training in the initial setup so that users know more about the tool that they are using.
Compared with the price of other products, SPSS Statistics is too expensive. Even though most of the universities in the Middle East have licenses for SPSS Statistics, they do not have licenses for the Modeler because of its price. This reduces the utility of the product.
I have had the opportunity to evaluate some tools. The differences are not always obvious in casual use.
For example, there are some small differences between SPSS and Minitab. If you want to run a two-way ANOVA (Analysis of Variance), you can find it directly and clearly in the Minitab interface. It is right in the menu, and you can pick two-way ANOVA. But if you look for two-way ANOVA in SPSS, you will not find it. You will find only one-way ANOVA. To run a two-way ANOVA, you have to go to the general linear model and pick the univariate option. The analysis is there and is done a different way; it is just not called a two-way ANOVA. It is known as a general linear model. The only problem is the labeling.
So the difference is not that this type of analysis is possible in one product and impossible in the other. It is only a difference in where the analysis sits and what it is called.
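To make concrete what a two-way ANOVA computes, whichever menu it sits under, here is a minimal Python sketch for a balanced design with an interaction term. It is an illustration under stated assumptions, not how SPSS or Minitab implement the analysis; the function name `two_way_anova` and the example data are hypothetical.

```python
from statistics import mean


def two_way_anova(cells):
    """Sums of squares, degrees of freedom, and F ratios for a balanced design.

    `cells` maps (level_of_A, level_of_B) -> list of observations. Every
    combination of levels must be present with the same number (>= 2) of
    observations. Hypothetical helper for illustration only.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    balanced = all(len(obs) == n for obs in cells.values())
    if not balanced or n < 2 or len(cells) != len(a_levels) * len(b_levels):
        raise ValueError("a complete, balanced design with n >= 2 is required")

    grand = mean(x for obs in cells.values() for x in obs)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    cell_mean = {key: mean(obs) for key, obs in cells.items()}

    # Main effects, interaction, and within-cell (error) sums of squares.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    ss_error = sum((x - cell_mean[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_error = len(a_levels) * len(b_levels) * (n - 1)
    ms_error = ss_error / df_error

    return {
        "A": {"ss": ss_a, "df": df_a, "f": (ss_a / df_a) / ms_error},
        "B": {"ss": ss_b, "df": df_b, "f": (ss_b / df_b) / ms_error},
        "A*B": {"ss": ss_ab, "df": df_ab, "f": (ss_ab / df_ab) / ms_error},
        "Error": {"ss": ss_error, "df": df_error},
    }


# Made-up example: two factors with two levels each, two observations per cell.
example = {
    ("a1", "b1"): [1, 3],
    ("a1", "b2"): [5, 7],
    ("a2", "b1"): [2, 4],
    ("a2", "b2"): [10, 12],
}
for source, row in two_way_anova(example).items():
    print(source, row)
```

The sketch handles only the balanced case. A general linear model procedure also covers unbalanced designs and covariates, which is one reason a package might file the analysis under that broader name.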
On a scale from one to ten, where one is the worst and ten is the best, I would rate IBM SPSS Statistics between eight and nine. If they add some automation for time series and regression analysis, exactly as they did for linear regression, it would be a much better tool. If you open the regression menu, the first item you will find is the automatic linear model. It uses the best technique, which is either forward or stepwise selection. It can anticipate the user's needs and train the user at the same time. Additional capabilities of this sort will push the tool forward as well as empower the users.