WELCOME TO OUR BLOG!

The posts here represent the opinions of CMB employees and guests—not necessarily the company as a whole. 


Does your metric have a home(plate)?

Posted by Youme Yai

Thu, Sep 28, 2017


Last month I attended a Red Sox/Yankees matchup at Fenway Park. By the seventh inning, the Sox had already cycled through seven pitchers. Fans were starting to lose patience and one guy even jumped on the field for entertainment. While others were losing interest, I stayed engaged in the game—not because of the action that was (not) unfolding, but because of the game statistics.

Statistics have been at the heart of baseball for as long as the sport’s been around. Few other sports track individual and team stats with such precision and detail (I suggest reading Michael Lewis’ Moneyball if you haven’t already). As a spectator, you know exactly what’s happening at all times, and this is one of my favorite things about baseball. As much as I enjoy watching the hits, runs, steals, strikes, etc., unfold on the field, it’s equally fun to watch those plays translate into statistics—witnessing the rise and fall of individual players and teams.

Traditionally, batting average (# of hits divided by # of at-bats) and earned run average (# of earned runs allowed by a pitcher per nine innings) have dominated the statistical world of baseball, but there are SO many others recorded. There's RBI (runs batted in), OPS (on-base plus slugging), ISO (isolated power: the raw power of a hitter, counting only extra-base hits and the type of hit), FIP (fielding independent pitching: similar to ERA but focused solely on pitching, removing results on balls hit into the field of play), and even xFIP (expected fielding independent pitching; or in layman's terms: how a pitcher performs independent of how his teammates perform once the ball is in play, while also accounting for home runs given up vs. the league's home run average). And that's just the tip of the iceberg.

With all this data, sabermetrics can yield some unwieldy metrics that have little applicability or predictive power. Sometimes we see this happen in market research, too: there are times when we are asked to collect hard-to-justify variables in our studies. While it seems sensible to gather as much information as possible, there's such a thing as "too much," at which point the extra data starts to dilute the goal and clarity of the project.

So, I’ll take off my baseball cap and put on my researcher’s hat for this: as you develop your questionnaire, evaluate whether a metric is a “nice to have” or a “need to have.” Here are some things to keep in mind as you evaluate your metrics:

  1. Determine the overall business objective: What is the business question I am looking to answer based on this research? Keep reminding yourself of this objective.
  2. Identify the hypothesis (or hypotheses) that make up the objective: What are the preconceived notions that will lead to an informed business decision?
  3. Establish the pieces of information to prove or disprove the hypothesis: What data do I need to verify the assumption, or invalidate it?
  4. Assess if your metrics align to the information necessary to prove or disprove one or more of your identified hypotheses.

If your metric doesn't have a home (plate) in one of the hypotheses, then discard it or turn it into one that does. Following this list can make the difference between accumulating a lot of data that produces no actionable results and collecting data that meets your initial business goal.

Combing through unnecessary data points is cumbersome and costly, so be judicious with your red pen in striking out useless questions. Don’t get bogged down with information if it isn’t directly helping achieve your business goal. Here at CMB, we partner with clients to minimize this effect and help meet study objectives starting well before the data collection stage.

Youme Yai is a Project Manager at CMB who believes a summer evening at the ballpark is second to none.

 

Topics: advanced analytics, data collection, predictive analytics

Sugar Overload: Dashboards that Yield Insights Not Headaches

Posted by Blair Bailey

Thu, Jun 29, 2017


Back in the old days (2002?), if you wanted a frozen treat, you ordered from the nice person at the TCBY counter, paid your money, and went on your way. Then Red Mango came to town and it was a game-changer. Now, instead of someone else building my treat, I had total control—if I wanted to mix mango and coffee and throw some gummy bears on top, I could. (I didn't though, I'm not a monster.)

Of course, there was a downside—sometimes I’d walk away with a $15 froyo. Sometimes, there is such a thing as “too much of a good thing”. As a data manager, knee-deep in interactive data viz, I know this applies to dashboards as well as dessert. 

When starting a dashboard from scratch, there's the same potential to go overboard, but for different reasons. Like flavors and toppings, there are many design and build directions I could take. Will the dashboard be one centralized page or span multiple pages? What types of charts and tables should I use? Which cuts should be columns and which should be filters?

Tableau, the popular platform, has so many options that it can often feel overwhelming. And beyond design, Tableau lets users dive deep into data like never before. With so many build options and data mining capabilities at our fingertips, what's a designer to do?

Forget the gumdrops and jalapeño flavored yogurt—I encourage our clients to go back to basics and ask:

Who is the dashboard for? The content and design of a well-made dashboard should depend on its purpose and end user. The dashboards I create in my spare time (yes, it's also a hobby!) are very different from the ones I build for clients. For example, a deep-in-the-weeds analyst will need (and appreciate) very different functionality and design than a C-suite level user would. An analyst interested in deep dives may need multiple filters and complex tables to cut the data every which way and investigate multiple scenarios, whereas a C-suite user needs a dashboard that answers their questions quickly and directly so they can move forward with business decisions.

It may be tempting to add flashy charts and lots of filters, but is it necessary? Will adding features help answer key business questions and empower the end user, or will it overwhelm and confuse them?

Here's a snippet from a dashboard that an executive could glean a good amount of insight from without feeling overwhelmed:

[Dashboard snippet: AffinID sample]

What will they use it for? Depending on what business questions the client is trying to answer, the design around specific types of dashboards may vary. For example, a brand health tracker dashboard could be a simple set of trending line charts and callouts for KPIs. But it’s rare that we only want to monitor brand health. Maybe the client is also interested in reaching a particular audience. So as the designer, I'll consider building the audiences in as a filter. Perhaps they want to expand into a new market. Divide your line charts by region and track performance across markets. Or maybe they need to track several measures over time across multiple brands, so rather than clog up the dashboard with lots of charts or tabs, you could use parameters to allow the user to toggle the main metric shown.

When in doubt, ask. When I plan to build and ultimately publish a dashboard to Tableau Public, I consider what elements will keep the user engaged and interested. If I'm not sure of the answers, I politely ask my friends, family, or co-workers to test out my dashboards and provide honest feedback. If my dashboard is confusing, boring, too simple, too convoluted, awesome, or just lame, I want to know. The same goes for client-facing dashboards.

As a data manager, my goal is to create engaging, useful data visualizations. But without considering who my end user is and what their goal is, that's nearly impossible. Tableau can build Pareto charts, heat maps, and filters, but if the result doesn't help answer key business questions in an intuitive and useful way, then what's the point of having the data viz?

Just because you can mix mango and coffee together (and even add those gummy bears on top), doesn’t mean you should. Like TCBY and Red Mango with their flavors and toppings, Tableau offers infinite data viz possibilities—the key is to use the right ingredients so you aren’t left with a stomachache (or a headache).

Blair Bailey is a Data Manager at CMB with a focus on building engaging dashboards to inform key business decisions and empower stakeholders. Her personal dashboards? Less so.

Topics: advanced analytics, integrated data, data visualization

Dear Dr. Jay: How To Predict Customer Turnover When Transactions are Anonymous

Posted by Dr. Jay Weiner

Wed, Apr 26, 2017

Dear Dr. Jay:

What's the best way to estimate customer turnover for a service business whose customer transactions are usually anonymous?

-Ian S.


Dear Ian,

You have posed an interesting question. My first response was, "you can't." But as I think about it some more, you might already have some data in-house that could be helpful in addressing the issue.

It appears you are in the mass transit industry. Most transit companies offer single-ride fares and monthly passes, while companies in college towns often offer semester-long passes. Since the passes (monthly, semester, etc.) are often sold at a discounted rate, we might conclude that all the single-fare revenues are turnover transactions.

This assumption is a small leap of faith as I’m sure some folks just pay the single fare price and ride regularly. Let’s consider my boss. He travels a fair amount and even with the discounted monthly pass, it’s often cheaper for him to pay the single ride fare. Me, I like the convenience of not having to make sure I have the correct fare in my pocket so I just pay the monthly rate, even if I don’t use it every day. We both might be candidates for weekly pass sales if we planned for those weeks when we know we’d be commuting every day versus working from home or traveling. I suspect the only way to get at that dimension would be to conduct some primary research to determine the frequency of ridership and how folks pay.

For your student passes, you probably have enough historic data in-house to compare your average semester pass sales to the population of students using them and can figure out if you see turnover in those sales. That leaves you needing to estimate the turnover on your monthly pass sales.

You also may have corporate sales that you could look at. For example, here at CMB, employees can purchase their monthly transit passes through our human resources department. Each month our cards are automatically updated so that we don't have to worry about renewing them every few weeks. I suspect if we analyzed the monthly sales from our transit system (MBTA) to CMB, we could determine the turnover rate.

As you can see, you could already have valuable data in-house that can help shed light on customer turnover. I’m happy to look at any information you have and let you know what options you might have in trying to answer your question.

Dr. Jay is CMB’s Chief Methodologist and VP of Advanced Analytics and holds a Zone 3 monthly pass to the MBTA. If it weren’t for the engineer, he wouldn’t make it to South Station every morning.

Keep those questions coming! Ask Dr. Jay directly at DearDrJay@cmbinfo.com or submit your question anonymously by clicking below:

Ask Dr. Jay!

Topics: advanced analytics, data collection, Dear Dr. Jay

A Year in Review: Our Favorite Blogs from 2016

Posted by Savannah House

Thu, Dec 29, 2016


What a year 2016 was.

In a year characterized by disruption, one constant is how we approach our blog: each CMBer contributes at least one post per year. And while asking each employee to write may seem cumbersome, it’s our way of ensuring that we provide you with a variety of perspectives, experiences, and insights into the ever-evolving world of market research, analytics, and consulting.

Before the clock strikes midnight and we bid adieu to this year, let’s take a moment to reflect on some favorite blogs we published over the last twelve months:

    1. When you think of a Porsche driver, who comes to mind? How old is he? What’s she like? Whoever it is, along with that image comes a perceived favored 2016 presidential candidate. Harnessing AffinID℠ and the results of our 2016 Consumer Identity Research, we found a skew towards one of the candidates for nearly every one of the 90 brands we tested. Read Erica Carranza’s post and check out the brands yourself with our interactive dashboard. Interested in learning more? Join Erica for our upcoming webinar: The Key to Consumer-Centricity: Your Brand User Image
    2. During introspection, it’s easy to focus on our weaknesses. But what if we put all that energy towards our strengths? Blair Bailey discusses the benefits of Strength-Based Leadership—realizing growth potential in developing our strengths rather than focusing on our weaknesses. In 2017, let’s all take a page from Blair’s book and concentrate on what we’re good at instead of what we aren’t.
    3. Did you attend a conference in 2016? Going to any in 2017? CMB’s Business Development Lead, Julie Kurd, maps out a game plan to get the most ROI from attending a conference. Though this post is specific to TMRE, these recommendations could be applied to any industry conference where you’re aiming to garner leads and build relationships. 
    4. In 2016 we released the results of our Social Currency research – a five-industry, 90-brand study to identify which consumer behaviors drive equity and Social Currency. Of the industry reports, one of our favorites is the beer edition. So pull up a stool, grab a pint, and learn from Ed Loessi, Director of Product Development and Innovation, how Social Currency helps insights pros and marketers create content and messaging that supports consumer identity.
    5. It’s a mobile world and we’re just living in it. Today we (yes, we) expect to use our smartphones with ease and have little patience for poor design. And as market researchers who depend on a quality pool of human respondents, the trend towards mobile is a reality we can’t ignore. CMB’s Director of Field Services, Jared Huizenga, weighs in on how we can adapt to keep our smart(phone) respondents happy – at least long enough for them to “complete” the study. 
    6. When you think of “innovation,” what comes to mind? The next generation iPhone? A self-driving car? While there are obvious tangible examples of innovation, professional service agencies like CMB are innovating, too. In fact, earlier this year we hired Ed Loessi to spearhead our Product Development and Innovation team. Sr. Research Associate, Lauren Sears, sat down with Ed to learn more about what it means for an agency like CMB to be “innovative.” 
    7. There’s something to be said for “too much of a good thing” – information being one of those things. To help manage the data overload we (and our clients) are often exposed to, Project Manager, Jen Golden, discusses the merits of focusing on one thing at a time (or one research objective), keeping a clear space (or questionnaire), and avoiding trending topics (or looking at every single data point in a report).
    8. According to our 2016 study on millennials and money, women ages 21-30 are driven, idealistic, and feel they budget and plan well enough. However, there’s a disparity when it comes to confidence in investing: nearly twice as many young women don’t feel confident in their investing decisions compared to their male counterparts. Lori Vellucci discusses how financial service providers have a lot of work to do to educate, motivate and inspire millennial women investors. 
    9. Admit it, you can’t get enough of Prince William and Princess Kate. The British Royals are more than a family – they’re a brand that’s embedded itself into the bedrock of American pop culture. So if the Royals can do it, why can’t other British brands infiltrate the coveted American marketplace, too? Before a brand enters a new international market, British native and CMB Project Manager, Josh Fortey, contends, the decision should be based on a solid foundation of research.
    10. We round out our list with a favorite from our “Dear Dr. Jay Series.” When considering a product, we often focus on its functional benefits. But as Dr. Jay, our VP of Advanced Analytics and Chief Methodologist, explains, the emotional attributes (how the brand/product makes us feel) are about as predictive of future behaviors as the functional benefits of the product. So brands, let's spread the love!

We thank you for being a loyal reader throughout 2016. Stay tuned because we’ve got some pretty cool content for 2017 that you won’t want to miss.

From everyone at CMB, we wish you much health and success in 2017 and beyond.

PS - There’s still time to make your New Year’s Resolution! Become a better marketer in 2017 and sign up for our upcoming webinar on consumer identity:

Register Now!

 

Savannah House is a Senior Marketing Coordinator at CMB. A lifelong aspiration of hers is to own a pet sloth, but since the Boston rental market isn’t so keen on exotic animals, she’d settle for a visit to the Sloth Sanctuary in Costa Rica.

 

Topics: strategy consulting, advanced analytics, methodology, consumer insights

Dear Dr. Jay: When To Stat Test?

Posted by Dr. Jay Weiner

Wed, Oct 26, 2016

Dear Dr. Jay,

The debate over how and when to test for statistical significance comes up in nearly every engagement. Why wouldn’t we just test everything?

-M.O. in Chicago


Hi M.O.-

You’re not alone. Many clients want all sorts of things stat tested. Some things can be tested while others can’t. But for what can be tested, as market researchers we need to be mindful of two potential errors in hypothesis testing. A Type I error occurs when we reject a true null hypothesis. For example, concluding that Coke tastes better than Pepsi when, in fact, there is no real difference.

A Type II error occurs when we accept the null hypothesis when in fact it is false: think of deciding a part is safe to install, and then the plane crashes. We choose the probability of committing a Type I error when we choose alpha (say, .05). The probability of a Type II error is a function of power. We seldom take this side of the equation into account, for good reason: most decisions we make in market research don’t come with a huge price tag if we’re wrong. Hardly anyone ever dies if the results of the study are wrong. The goal in any research is to minimize both types of errors, and the best way to do that is to use a larger sample.

This conundrum perfectly illustrates my “Life is a conjoint” mantra: we’re always trading off the accuracy of the results against the cost of executing a study with a larger sample. Further, we also tend to violate the true nature of hypothesis testing. More often than not, we don’t formally state a hypothesis; rather, we statistically test everything and then report the statistical differences.

Consider this: when we compare two scores, we accept that we might get a statistical difference 5% of the time simply by chance (α = .05). This could be the difference in concept acceptance between males and females.

In fact, that’s not really what we do: we perform hundreds of tests in almost every study. Let’s say we have five segments and we want to test them for differences in concept acceptance. That’s 10 t-tests, which gives us roughly a 40% chance (1 – .95^10) of flagging at least one difference simply due to chance. And that’s in every row of our tables. The better approach would be to run an analysis of variance on the table to determine whether any cell might be different, then build hypotheses and test them one at a time. But we don’t do this because it takes too much time. I realize I’m not going to change the way our industry does things (I’ve been trying for years), but maybe, just maybe, you’ll pause for a moment when looking at your tables to decide if this “statistical” significance is really worth reporting—are the results valid and are they useful?
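For the curious, here’s a minimal sketch of both the arithmetic and the ANOVA-first workflow described above, using simulated (entirely hypothetical) concept-acceptance scores and scipy’s standard tests:

```python
import numpy as np
from scipy import stats

alpha, n_tests = 0.05, 10                    # 5 segments -> C(5,2) = 10 pairwise t-tests
fwer = 1 - (1 - alpha) ** n_tests            # chance of at least one false positive
print(f"Familywise error rate: {fwer:.0%}")  # ~40%

# ANOVA first: only drill into pairs if the omnibus test is significant
rng = np.random.default_rng(42)
segments = [rng.normal(loc=5.0, scale=1.5, size=100) for _ in range(5)]  # hypothetical scores

f_stat, p_omnibus = stats.f_oneway(*segments)
if p_omnibus < alpha:
    # ideally, test one pre-stated hypothesis -- or at least correct for
    # multiple comparisons (e.g., Bonferroni: compare p against alpha / n_tests)
    t_stat, p_pair = stats.ttest_ind(segments[0], segments[1])
    print(f"Segment 1 vs. 2: p = {p_pair:.3f}")
```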

Dr. Jay loves designing really big, complex choice models.  With over 20 years of DCM experience, he’s never met a design challenge he couldn’t solve. 

Got a burning research question? You can send your questions to DearDrJay@cmbinfo.com or submit anonymously here:

Ask Dr. Jay!

 

 

Topics: advanced analytics, Dear Dr. Jay

Passive Mobile Behavioral Data – Part Deux

Posted by Chris Neal

Wed, Aug 10, 2016

Over the past two years, we've embarked on a quest to help the insights industry get better at harnessing passive mobile behavioral data. In 2015, we partnered with Research Now for an analysis of mobile wallet usage, using unlinked passive and survey-based data. This year, we teamed up with Research Now once again for research-on-research directly linking actual mobile traffic and app data to consumers’ self-reported online shopper journey behavior.

We asked over 1,000 shoppers, across a variety of Black Friday/Cyber Monday categories, a standard set of purchase journey survey questions immediately after the event, then again after 30 days, 60 days, and 90 days. We then compared their self-reported online and mobile behavior to the actual mobile app and website usage data from their smartphones. 

The results deepened our understanding of how best to use (and not use) each respective data source, and how combining both can help our clients get closer to the truth than they could using any single source of information.

Here are a few things to consider if you find yourself tasked with a purchase journey project that uses one or both of these data sources as fuel for insights and recommendations:

  1. Most people use multiple devices for a major purchase journey, and here’s why you should care:
    • Any device tracking platform (even one claiming a 360° view) is likely missing some online behavior relevant to a given shopper journey. In our study, we captured behavior from respondents’ primary smartphones, but many of these consumers reported visiting websites we had no record of in our tracking data. Although they reported visiting these websites on their smartphones, it is likely that some of these visits happened on a personal computer, a tablet, a computer at work, etc.
  2. Not all mobile usage is related to the purchase journey you care about:
    • We saw cases of consumers whose behavioral data showed they’d visited big retail websites and mobile apps during the purchase journey but who did not report using these sites/apps as part of the journey we asked them about. This is a bigger problem with larger, more generalist mobile websites and apps (like Amazon, for this particular project, or like PayPal when we did the earlier Mobile Wallet study with a similar methodological exercise).
  3. Human recall ain’t perfect. We all know this, but it’s important to understand when and where it’s less perfect, and where it’s actually sufficient for our purposes. Using survey sampling to analyze behaviors can be enormously valuable in a lot of situations, but understand the limitations and recognize when you’re expecting more detail than somebody can accurately recall. Here are a few situations to consider:
    • Asking whether a given retailer, brand, or major web property figured into the purchase journey at all will give you pretty good survey data to work with. Smaller retailers, websites, and apps will get more misses/lack of recall, but accurate recall is a proxy for influence, and if you’re ultimately trying to figure out how best to influence a consumer’s purchase journey, self-reported recall of visits is a good proxy, whereas relying on behavioral data alone may inflate the apparent impact of smaller properties on the final purchase journey.
    • Asking people to remember whether they used the mobile app vs. the mobile website introduces more error into your data. Most websites are now mobile optimized and look/feel like mobile apps, or will switch users to the native mobile app on their phone automatically if possible.
      • In this particular project, we saw evidence of a 35-50% improvement in survey-behavior match rates if we did not require respondents to differentiate the mobile website from the mobile app for the same retailer.
  4. Does time-lapse matter? It depends.
    • For certain activities (e.g., making minor purchases in a grocery store, or a TV viewing occasion), capturing in-the-moment feedback from consumers is critical for accuracy.
    • In other situations where the process is bigger, involves more research, or is more memorable in general (e.g., buying a car, having a wedding, or making a planned-for purchase based on a Black Friday or Cyber Monday deal): you can get away with asking people about it further out from the actual event.
      • In this particular project, we actually found no systematic evidence of recall deterioration when we ran the survey immediately after Black Friday/Cyber Monday vs. running it 30 days, 60 days, and 90 days after.

Working with passive mobile behavioral data (or any digital passive data) is challenging, no doubt. Trying to make hay by combining these data with primary research survey sampling, customer databases, transactional data, etc., can be even more challenging. But, like it or not, that’s where Insights is headed. We’ll continue to push the envelope in terms of best practices for navigating these types of engagements as Analytics teams, Insights departments, and Financial Planning and Strategy groups work together more seamlessly to provide senior executives with a “single version of the truth”—one that is more accurate than any previously siloed version.

Chris Neal leads CMB’s Tech Practice. He knows full well that data scientists and programmatic ad buying bots are analyzing his every click on every computing device and is perfectly OK with that as long as they serve up relevant ads. Nothing to hide!

Don't miss out on the latest research, insights and conference recaps. Subscribe to our monthly eZine.

Subscribe Here!

Topics: advanced analytics, mobile, passive data, integrated data

Big Data Killed the Radio Star

Posted by Mark Doherty

Wed, Jun 29, 2016

It’s an amazing time to be a music fan (especially if you have all those Ticketmaster vouchers and a love of '90s music). While music production and distribution were once controlled by record label and radio station conglomerates, technology has “freed” music in almost every way. It’s now easy to hear nearly any song ever recorded thanks to YouTube, iTunes, and a range of streaming sources. While these new options appear to be manna from heaven for music lovers, they can actually create more problems than you’d expect. The never-ending flow of music options can make it harder to decide what might be good or what to play next. In the old days (way back in 2010 :)), your music choices were limited by record companies and by radio station programmers. While these “corporate suits” may have prevented you from hearing that great underground indie band, they also “saved” you from thousands of options that you would probably hate.

That same challenge is happening right now with marketers’ use of data. Back in the day (also around 2010), there was a limited number of data sets and sources to leverage in decisions relating to building/strengthening a brand. Now, that same marketer has access to a seemingly endless flow of data: from web analytics, third-party providers, primary research, and their own CRM systems. While most market information was previously collected and “curated” through the insights department, marketing managers are often now left to their own devices to sift through and determine how useful each set of data is to their business. And it’s not easy for a non-expert to do due diligence on each data source to establish its legitimacy and usefulness. As a result, many marketers are paralyzed by a firehose of data and/or end up trying to use lots of not-so-great data to make business decisions.

So, how do managers make use of all this data? It’s partly the same way streaming sources help music listeners decide what song to play next: predictive analytics. Predictive analytics is changing how companies use data to get, keep, and grow their most profitable customers. It helps managers “cut through the clutter” and analyze a wide range of data to make better decisions about the future of their business. It’s similarly being used in the music industry to help music lovers cut through the clutter of their myriad song choices to find their next favorite song. Pandora’s Musical Genome Project is doing just that by developing a recommendation algorithm that serves up choices based on the attributes of the music you have listened to in the past. Similarly, Spotify’s Discover Weekly playlist is a huge hit with music lovers, who appreciate Spotify’s assistance in identifying new songs they may love.
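To make the idea concrete, here’s a minimal content-based recommendation sketch in the spirit of attribute matching. This is emphatically not Pandora’s actual algorithm, and the “genome” scores below are made up:

```python
import numpy as np

# Toy "genome": each song scored on a few attributes (tempo, acousticness, vocals)
songs = {
    "Song A": np.array([0.9, 0.2, 0.7]),
    "Song B": np.array([0.8, 0.3, 0.6]),
    "Song C": np.array([0.1, 0.9, 0.2]),
}

def cosine(u, v):
    """Similarity between two attribute vectors (1.0 = identical direction)."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Recommend the unheard song most similar to one the listener already loves
liked = "Song A"
scores = {name: cosine(songs[liked], vec) for name, vec in songs.items() if name != liked}
print(max(scores, key=scores.get))  # -> "Song B"
```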

So, the next time you need to figure out how to best leverage the range of data you have—or find a new summer jam—consider predictive analytics.

Mark is a Vice President at CMB, and he’s fully embracing his reputation around the office as the DJ of the Digital Age.

Did you miss our recent webinar on the power of Social Currency measurement to help brands activate the 7 levers that encourage consumers to advocate, engage, and gain real value? You're not out of luck:

 Watch Here

 

Topics: advanced analytics, big data, data integration, predictive analytics

Dear Dr. Jay: Driver Modeling

Posted by Dr. Jay Weiner

Thu, Jun 23, 2016

Dear Dr. Jay,

We want to assess the importance of fixing some of our customer touchpoints. What would you recommend as a modeling tool?

 -Alicia


Hi Alicia,

There are a variety of tools we use to determine the relative importance of key variables on an outcome (dependent variable). Here’s the first question we need to address: are we trying to predict the actual value of the dependent variable or just assess the importance of any given independent variable in the equation? Most of the time, the goal is the latter.

Once we know the primary objective, there are three key criteria we need to address. The first is the amount of multicollinearity in our data. The more independent variables we have, the bigger problem this presents. The second is the stability in the model over time. In tracking studies, we want to believe that the differences between waves are due to actual differences in the market and not artifacts of the algorithm used to compute the importance scores. Finally, we need to understand the impact of sample size on the models.

How big a sample do you need? Typically, in consumer research, we see results stabilize with n=200. Some tools will do a better job with smaller samples than others. You should also consider the number of parameters you are trying to model. A grad school rule of thumb is that you need 4 observations for each parameter in the model, so if you have 25 independent variables, you’d need at least 100 respondents in your sample.

There are several tools to consider for estimating relative importance: Bivariate Correlations, OLS, Shapley Value Regression (or Kruskal’s Relative Importance), TreeNet, and Bayesian Networks are all options. All of these tools will let you understand the relative importance of the independent variables in predicting your key measure. One thing to note: none of these tools specifically models causation. You would need some sort of experimental design to address that issue. Let’s break down the advantages and disadvantages of each.

Bivariate Correlations (measures the strength of the relationship between two variables)

  • Advantages: Works with small samples. Relatively stable wave to wave. Easy to execute. Unaffected by multicollinearity.
  • Disadvantages: Only estimates the impact of one attribute at a time. Ignores any possible interactions. Doesn’t provide an “importance” score, but a “strength of relationship” value. Assumes a linear relationship among the attributes.

Ordinary Least Squares regression (OLS) (method for estimating the unknown parameters in a linear regression model)

  • Advantages: Easy to execute. Provides an equation to predict the change in the dependent variable based on changes in the independent variable (predictive analytics).
  • Disadvantages: Highly susceptible to multicollinearity, causing changes in key drivers in tracking studies. If the goal is a predictive model, this isn’t a serious problem. If your goal is to prioritize areas of improvement, this is a challenge. Assumes a linear relationship among the attributes. 

Shapley Value Regression or Kruskal’s Relative Importance

These are a couple of approaches that consider all possible combinations of explanatory variables. Unlike traditional regression tools, these techniques are not used for forecasting: in OLS, we predict the change in overall satisfaction for any given change in the independent variables, whereas these tools determine how much better the model is if we include any specific independent variable versus models that do not include it. The conclusions we draw from these models refer to the usefulness of including any measure in the model, not its specific impact on improving measures like overall satisfaction (a minimal computational sketch follows this list).

  • Advantages: Works with smaller samples. Does a better job of dealing with multicollinearity. Very stable in predicting the impact of attributes between waves.
  • Disadvantages: Ignores interactions. Assumes a linear relationship among the attributes.

TreeNet (a tree-based data mining tool)

  • Advantages: Does a better job of dealing with multicollinearity than most linear models. Very stable in predicting the impact of attributes between waves. Can identify interactions. Does not assume a linear relationship among the attributes.
  • Disadvantages: Requires a larger sample size—usually n=200 or more. 

Bayesian Networks (a graphical representation of the joint probabilities among key measures)

  • Advantages: Does a better job of dealing with multicollinearity than most linear models. Very stable in predicting the impact of attributes between waves. Can identify interactions. Does not assume a linear relationship among the attributes. Works with smaller samples. While a typical Bayes Net does not provide a system of equations, it is possible to simulate changes in the dependent variable based on changes to the independent variables.
  • Disadvantages: Can be more time-consuming and difficult to execute than the others listed here.
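As promised, here’s a minimal sketch of the Shapley idea: each driver’s importance is its average marginal contribution to R² across all orderings of the predictors. The data are simulated, and the exact enumeration shown here is practical only for a handful of drivers (production implementations sample or approximate):

```python
import numpy as np
from itertools import permutations
from sklearn.linear_model import LinearRegression

def r_squared(X, y, cols):
    """R^2 of an OLS model restricted to the predictors in `cols` (empty set -> 0)."""
    cols = list(cols)
    if not cols:
        return 0.0
    return LinearRegression().fit(X[:, cols], y).score(X[:, cols], y)

def shapley_importance(X, y):
    """Each driver's average marginal contribution to R^2 across all orderings.
    Exact enumeration (k! orderings), so only feasible for a small set of drivers."""
    k = X.shape[1]
    importance = np.zeros(k)
    orders = list(permutations(range(k)))
    for order in orders:
        included = set()
        for var in order:
            before = r_squared(X, y, included)
            included.add(var)
            importance[var] += r_squared(X, y, included) - before
    return importance / len(orders)

# Example: three drivers of an overall-satisfaction measure, one deliberately collinear
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 3))                           # n=200, per the stabilization rule of thumb
X[:, 2] = 0.8 * X[:, 0] + 0.2 * rng.normal(size=200)    # collinear with the first driver
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.5, size=200)
print(shapley_importance(X, y))                          # the scores sum to the full model's R^2
```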

Got a burning research question? You can send your questions to DearDrJay@cmbinfo.com or submit anonymously here.

Dr. Jay Weiner is CMB’s senior methodologist and VP of Advanced Analytics. Jay earned his Ph.D. in Marketing/Research from the University of Texas at Arlington and regularly publishes and presents on topics including conjoint, choice, and pricing.

Topics: advanced analytics, Dear Dr. Jay

How I Used Conjoint Analysis to Plan My Wedding

Posted by Alyse Dunn

Tue, Jun 14, 2016

I’m getting married in August, and the past year and a half of planning has been a whirlwind of fabrics, colors, and decisions. The number of options you have for any given item is immense, and, as a market researcher, I began to consider the choices I had and how I would make them.

Let’s talk about cake. We tried 15 flavors of cake, and we knew that we could combine any four of them: all the same, four different flavors, or anything in between. Effectively, we had 3,060 possible combinations for cake (15 flavors taken 4 at a time, with repeats allowed). Now, that could be very overwhelming, but, to me, it was just a giant Conjoint Analysis exercise.

Conjoint Analysis is a trade-off technique that market researchers use to estimate consumer preferences for products with multiple features. The beauty of Conjoint Analysis is that it allows a researcher to predict preferences for huge numbers of possible product combinations without testing each combination explicitly. The secret is in attaching a value to each level (chocolate, vanilla, strawberry, etc.) of each attribute (flavor) and making the assumption that the value of the whole is equal to the sum of its parts. For our wedding cake, we had 2 attributes: Flavor and Number of Flavor Repeats.

For this Self-Explicated Conjoint exercise, I listed out the 15 possible flavors and number of possible repeated flavors. I then rated them on a 1-10 scale based on how attractive they were to me. Additionally, I rated each attribute based on how important it was to the final decision. In the case below, the number of repeated flavors was a more important attribute than flavor (60% of my decision). Finally, I multiplied the level and attribute values together to get a utility score.

[Table: flavor and repeat-count ratings, attribute importance weights, and resulting utility scores]

From there, it’s math! With these scores, I could simulate all 3,060 cake combinations and their values (that’s a lot of frosting). To determine the “BEST CAKE,” you add the utilities together and look for the highest total utility. In our case, it was 2 White Chocolate tiers, 1 Lavender, and 1 Italian Crème, with a total utility of 2,060. This very narrowly beat out 4 independent flavors (White Chocolate, Lavender, Italian Crème, and Chocolate) because of the high value for White Chocolate.
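For the mathematically curious, here’s a minimal sketch of that enumeration. The ratings below are illustrative stand-ins (only 6 of the 15 flavors, with numbers I’ve invented), but the mechanics are the same: level ratings times attribute importances, summed and maximized over every combination:

```python
from itertools import combinations_with_replacement

# Illustrative 1-10 level ratings (stand-ins; the post's actual table isn't reproduced here)
flavor_rating = {"white chocolate": 10, "lavender": 9, "italian creme": 9,
                 "chocolate": 8, "vanilla": 5, "strawberry": 4}
repeat_rating = {1: 2, 2: 8, 3: 9, 4: 8}  # rating by number of distinct flavors
FLAVOR_WT, REPEAT_WT = 0.4, 0.6           # attribute importances (repeats = 60% of the decision)

def utility(tiers):
    """Value of the whole = sum of its parts: weighted flavor ratings + weighted repeat structure."""
    return (sum(flavor_rating[f] for f in tiers) * FLAVOR_WT
            + repeat_rating[len(set(tiers))] * REPEAT_WT)

# Enumerate every possible 4-tier cake and keep the highest total utility
best = max(combinations_with_replacement(flavor_rating, 4), key=utility)
print(best, round(utility(best), 1))
# -> ('white chocolate', 'white chocolate', 'lavender', 'italian creme') 20.6
```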

Conjoint Analysis is helpful for numerous research needs (wedding planning included). Presenting individuals with various combinations of attributes helps determine how each attribute is valued, which can be projected to the larger population. By making tradeoffs when comparing different combinations, I was able to choose a cake that worked for our event. For organizations, Conjoint Analysis can help determine which new product features will perform the best, which hotel packages offer the biggest bang for the buck, or which insurance items will be most desirable to individuals. Conjoint is applicable across any organization and is a valuable analytical tool to help determine which combinations of attributes perform best. 

Learn more about avoiding common pitfalls in Conjoint Analysis. 

Alyse Dunn is a Data Manager at CMB, and she looks forward to seeing how her Conjoint Analysis exercises in wedding planning will pay off (and thanks our Senior Analyst Liz White for socializing this example).

Topics: advanced analytics, research design

Dear Dr. Jay: Discrete Choice—How Many Is Too Many Features?

Posted by Dr. Jay Weiner

Wed, Mar 23, 2016

Dear Dr. Jay,

I’m interested in testing a large number of features for inclusion in the next version of my product. My team is suggesting that we need to cull the list down to a smaller set of items to run a choice model. Are there ways to test a large set of attributes in a choice model?

-Nick


Hi Nick –

There are a number of ways to test a large set of attributes in choice modeling. Most of the time, when we test a large number of features, many are simply binary attributes (included/not included). While this makes the experimental design larger, it’s not quite as bad as having ten six-level attributes. If the descriptions are short enough, you might go ahead and just include all of them. If you’re concerned about how much reading a respondent will need to do—or you really wouldn’t offer a respondent 12 additional perks for choosing your credit card—you could put a cap on the number of additional features any specific offer includes. For example, you could test 15 new features in a single model, but respondents would only see up to 5 at any one time (a quick sketch of this follows). This is actually better than using a partial profile design, as all respondents would see all offers.
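A minimal sketch of what that cap might look like when generating profiles (random draws for illustration only; a real study would use a balanced experimental design, not pure randomization):

```python
import random

N_FEATURES, MAX_SHOWN = 15, 5   # test 15 binary features, show no more than 5 per offer

def make_offer(rng: random.Random) -> list[int]:
    """One alternative in a choice task: a random subset of at most MAX_SHOWN features."""
    k = rng.randint(1, MAX_SHOWN)
    shown = set(rng.sample(range(N_FEATURES), k))
    return [1 if i in shown else 0 for i in range(N_FEATURES)]

rng = random.Random(1)
choice_task = [make_offer(rng) for _ in range(3)]   # e.g., three offers per screen
```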

Another option is to do some sort of bridging study where you test all of the features using a max diff task. You can include a subset of the features in a DCM and then use the max diff utilities to compute utilities for the full list of features on the DCM scale. This allows you to include the full set of features in your simulation tool.
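Here’s a minimal sketch of one common way to do that bridging step, assuming the two sets of utilities are linearly related; all the numbers are hypothetical:

```python
import numpy as np

# Utilities for the features that appeared in BOTH exercises (hypothetical values)
dcm_common     = np.array([0.8, 0.3, -0.2, 0.5])   # from the choice model
maxdiff_common = np.array([1.6, 0.7, -0.3, 1.1])   # same features, max diff scale

# Fit the rescaling from max diff units to DCM units on the common items...
slope, intercept = np.polyfit(maxdiff_common, dcm_common, 1)

# ...then apply it to the full max diff list so every feature gets a DCM-scale utility
maxdiff_all = np.array([1.6, 0.7, -0.3, 1.1, 0.2, 1.4, -0.8])
dcm_scale_all = slope * maxdiff_all + intercept
```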

Dr. Jay loves designing really big, complex choice models.  With over 20 years of DCM experience, he’s never met a design challenge he couldn’t solve. 

Topics: advanced analytics, product development, Dear Dr. Jay