Dear Dr. Jay,
We want to assess the importance of fixing some of our customer touchpoints, what would you recommend as a modeling tool?
There are a variety of tools we use to determine the relative importance of key variables on an outcome (dependent variable). Here’s the first question we need to address: are we trying to predict the actual value of the dependent variable or just assess the importance of any given independent variable in the equation? Most of the time, the goal is the latter.
Once we know the primary objective, there are three key criteria we need to address. The first is the amount of multicollinearity in our data. The more independent variables we have, the bigger problem this presents. The second is the stability in the model over time. In tracking studies, we want to believe that the differences between waves are due to actual differences in the market and not artifacts of the algorithm used to compute the importance scores. Finally, we need to understand the impact of sample size on the models.
How big a sample do you need? Typically, in consumer research, we see results stabilize with n=200. Some tools will do a better job with smaller samples than others. You should also consider the number of parameters you are trying to model. A grad school rule of thumb is that you need 4 observations for each parameter in the model, so if you have 25 independent variables, you’d need at least 100 respondents in your sample.
There are several tools to consider using to estimate relative importance: Bivariate Correlations, OLS, Shapley Value Regression (or Kruskal’s Relative Importance), TreeNet, and Bayesian Networks are all options. All of these tools will let you understand the relative importance of the independent variables in predicting your key measure. One think to note is that none of the tools specifically model causation. You would need some sort of experimental design to address that issue. Let’s break down the advantages and disadvantages of each.
Bivariate Correlations (measures the strength of the relationship between two variables)
- Advantages: Works with small samples. Relatively stable wave to wave. Easy to execute. Ignores multicollinearity.
- Disadvantages: Only estimates the impact of one attribute at a time. Ignores any possible interactions. Doesn’t provide an “importance” score, but a “strength of relationship” value. Assumes a linear relationship among the attributes.
Ordinary Least Squares regression (OLS) (method for estimating the unknown parameters in a linear regression model)
- Advantages: Easy to execute. Provides an equation to predict the change in the dependent variable based on changes in the independent variable (predictive analytics).
- Disadvantages: Highly susceptible to multicollinearity, causing changes in key drivers in tracking studies. If the goal is a predictive model, this isn’t a serious problem. If your goal is to prioritize areas of improvement, this is a challenge. Assumes a linear relationship among the attributes.
Shapley Value Regression or Kruskal’s Relative Importance
These are a couple of approaches that consider all possible combinations of explanatory variables. Unlike traditional regression tools, these techniques are not used for forecasting. In OLS, we predict the change in overall satisfaction for any given change in the independent variables. These tools are used to determine how much better the model is if we include any specific independent variable versus models that do not include that measure. The conclusions we draw from these models refer to the usefulness of including any measure in the model and not its specific impact on improving measures like overall satisfaction.
- Advantages: Works with smaller samples. Does a better job of dealing with multicollinearity. Very stable in predicting the impact of attributes between waves.
- Disadvantages: Ignores interactions. Assumes a linear relationship among the attributes.
TreeNet (a tree-based data mining tool)
- Advantages: Does a better job of dealing with multicollinearity than most linear models. Very stable in predicting the impact of attributes between waves. Can identify interactions. Does not assume a linear relationship among the attributes.
- Disadvantages: Requires a larger sample size—usually n=200 or more.
Bayesian Networks (a graphical representation of the joint probabilities among key measures)
- Advantages: Does a better job of dealing with multicollinearity than most linear models. Very stable in predicting the impact of attributes between waves. Can identify interactions. Does not assume a linear relationship among the attributes. Works with smaller samples. While a typical Bayes Net does not provide a system of equations, it is possible to simulate changes in the dependent variable based on changes to the independent variables.
- Disadvantages: Can be more time-consuming and difficult to execute than the others listed here.
Dr. Jay Weiner is CMB’s senior methodologist and VP of Advanced Analytics. Jay earned his Ph.D. in Marketing/Research from the University of Texas at Arlington and regularly publishes and presents on topics, including conjoint, choice, and pricing.