WELCOME TO OUR BLOG!

The posts here represent the opinions of CMB employees and guests—not necessarily the company as a whole. 


Spring into Data Cleaning

Posted by Nicole Battaglia

Tue, Apr 04, 2017

When someone hears “spring cleaning” they probably think of organizing their garage, purging clothes from their closet, and decluttering their workspace. For many, spring is a chance to refresh and rejuvenate after a long winter (fortunately ours in Boston was pretty mild).

This may be my inner market researcher talking, but when I think of spring cleaning, the first thing that comes to mind is data cleaning. Like cleaning and organizing your home, data cleaning is a detailed and lengthy process that matters to researchers and their clients alike.

Data cleaning is an arduous task. Each completed questionnaire must be checked to ensure that it's been answered correctly, clearly, truthfully, and consistently. Here’s what we typically clean:

  • We’ll look at each open-ended response in a survey to make sure respondents’ answers are coherent and appropriate. Sometimes respondents will curse, other times they'll write outrageously irrelevant answers like what they’re having for dinner, so we monitor these closely. We do the same for open-ended numeric responses; there’s always that one respondent who enters ‘50’ when asked how many siblings they have.
  • We also check for outliers in open-ended numeric responses. Whether it’s false data or an exceptional respondent (e.g., Bill Gates), outliers can skew our data and lead us to draw the wrong conclusions and make misguided recommendations to clients. For example, I worked on a survey that asked respondents how many cars they own. Anyone who provided a number more than three standard deviations above the mean was flagged as an outlier, because such answers would’ve significantly distorted our interpretation of average car ownership; the reality is the average household owns two cars, not six. (A rough sketch of these checks in code appears after this list.)
  • Straightliners are respondents who answer a battery of questions on the same scale with the same response. Because of this, sometimes we’ll see someone who strongly agrees or disagrees with two completely opposing statements—making it difficult to trust these answers reflect the respondent’s real opinion.
  • We often insert a Red Herring Fail into our questionnaires to help identify and weed out distracted respondents. A Red Herring Fail is a 10-point scale question usually placed around the halfway mark of a questionnaire that simply asks respondents to select the number “3” on the scale. If they select a number other than “3”, we flag them for removal.
  • If there’s incentive to participate in a questionnaire, someone may feel inclined to participate more than once. So to ensure our completed surveys are from unique individuals, we check for duplicate IP addresses and respondent IDs.
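
To make a few of these checks concrete, here is a minimal sketch of how they might be automated in Python with pandas. The column names (respondent_id, ip, cars_owned, red_herring, and the grid_q items) are hypothetical placeholders rather than an actual CMB survey layout, and anything flagged would still be reviewed by a researcher before removal.

    # A minimal sketch of automated cleaning flags on a hypothetical survey file.
    # Column names below are illustrative assumptions, not a real questionnaire.
    import pandas as pd

    def flag_suspect_rows(df: pd.DataFrame) -> pd.DataFrame:
        flags = pd.DataFrame(index=df.index)

        # Outliers: numeric answers more than three standard deviations above the mean
        mean, sd = df["cars_owned"].mean(), df["cars_owned"].std()
        flags["outlier"] = df["cars_owned"] > mean + 3 * sd

        # Straightliners: identical answers across an entire battery of scale questions
        grid_cols = ["grid_q1", "grid_q2", "grid_q3", "grid_q4", "grid_q5"]
        flags["straightliner"] = df[grid_cols].nunique(axis=1) == 1

        # Red Herring Fail: respondent did not select "3" as instructed
        flags["red_herring_fail"] = df["red_herring"] != 3

        # Duplicates: repeated respondent IDs or IP addresses
        flags["duplicate"] = (
            df.duplicated(subset="respondent_id", keep="first")
            | df.duplicated(subset="ip", keep="first")
        )
        return flags

    # Usage: pull flagged rows for manual review rather than deleting them outright
    # flags = flag_suspect_rows(responses)
    # suspect = responses[flags.any(axis=1)]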

There are a lot of variables that can skew our data, so our cleaning process is thorough and thoughtful. And while the process may be cumbersome, here’s why we clean data: 

  • Impression on the client: Following a detailed data cleaning process helps show that your team is careful, thoughtful, and able to accurately dissect and digest large amounts of data. This demonstration of thoroughness and competency goes a long way toward building trust in the researcher/client relationship because the client will see their researchers are working to present the best data possible.
  • Helps tell a better story: We pride ourselves on storytelling, using insights from data and turning them into strong deliverables to help our clients make strategic business decisions. If we didn’t have accurate and clean data, we wouldn’t be able to tell a good story!
  • Overall, ensures high-quality, precise data: At CMB, typically two or more researchers work on the same data file to mitigate the chance of error. The data undergoes such scrutiny so that any issues or mistakes can be noted and rectified, ensuring the integrity of the report.

The benefits of taking the time to clean our data far outweigh the risks of skipping it. Data cleaning keeps false or unrepresentative information from influencing our analyses or recommendations to a client and ensures our sample accurately reflects the population of interest.

So this spring, while you’re finally putting away those holiday decorations, remember that data cleaning is an essential step in maintaining the integrity of your work.

Nicole Battaglia is an Associate Researcher at CMB who prefers cleaning data over cleaning her bedroom.

Topics: data collection, quantitative research

A Lesson in Storytelling from the NFL MVP Race

Posted by Jen Golden

Thu, Feb 02, 2017

There’s always a lot of debate in the weeks leading up to the NFL’s announcement of its regular season MVP. While the recipient is often from a team with a strong regular season record, it’s not always that simple. Of course the MVP's season stats are an important factor in who comes out on top, but a good story also influences the outcome. 

Take this year: we have a few excellent contenders for the crown, including…

  • Ezekiel Elliott, the rookie running back for the Dallas Cowboys
  • Tom Brady, the NE Patriots QB coming back from a four-game “Deflategate” suspension
  • Matt Ryan, the Atlanta Falcons veteran “nice-guy” QB having a career year

Ultimately, deciding the winner is a mix of art and science. And while you’re probably wondering what this has to do with market research, the NFL regular season MVP selection process has a few important things in common with the creation of a good report.

First, make a framework: Having a framework for your research project can help keep you from feeling overwhelmed by the amount of data in front of you. In the MVP race, for example, voters should start by listing attributes they think make an MVP: team record, individual record, strength of schedule, etc. These attributes are a good way to narrow down potential candidates. In research, the framework might include laying out the business objectives and the data available for each. This outline helps focus the narrative and guide the story’s structure.

Then, look at the whole picture: Once the data is compiled, take a step back and think about how the pieces relate to one another and the context of each. Let’s look at Tom Brady’s regular season stats as an example. He lags behind league leaders on total passing yards and TDs, but remember that he missed four games with a suspension. When the regular season is only 16 games long, sitting out a quarter of them was a missed opportunity to pile up stats, so you can’t help but wonder whether a straight comparison is fair. Here’s where it’s important to look at the whole picture (whether we’re talking about research or MVP picks). If you don’t have the entire context, you could dismiss Brady altogether. In research, a meaningful story builds on all the primary data within larger social, political, and/or business contexts.

Finally, back it up with facts: Once the pieces have come together, you need to back up your key storyline (or MVP pick) with facts to prove your credibility. For example, someone could vote for Giants wide receiver Odell Beckham Jr. because of an impressive once-in-a-lifetime catch he made during the regular season. But beyond the catch there wouldn’t be much data to support that he was more deserving than the other candidates. In a research report, you must support your story with solid data and evidence.

The predictions will continue until the 2016 regular season MVP is named, but whoever that ends up being, he will have a strong story and the stats to back it up.

Jen is a Senior Project Manager on the Technology/E-commerce team. She hopes Tom Brady will take the MVP crown to silence his “Deflategate” critics – what a story that would be.

Topics: data collection, storytelling, marketing science

Dear Dr. Jay: HOW can we trust predictive models after the 2016 election?

Posted by Dr. Jay Weiner

Thu, Jan 12, 2017

Dear Dr. Jay,

After the 2016 election, how will I ever be able to trust predictive models again?

Alyssa


Dear Alyssa,

Data Happens!

Whether we’re talking about political polling or market research, to build good models, we need good inputs. Or as the old saying goes: “garbage in, garbage out.” Let’s look at all the sources of error in the data itself:

  • First, we make it too easy for respondents to say “yes” and “no,” and they try to help us by guessing what answer we want to hear. For example, we ask for purchase intent for a new product idea, and respondents often overstate their true likelihood of buying the product.
  • Second, we give respondents perfect information. We create 100% awareness when we show the respondent a new product concept. In reality, we know we will never achieve 100% awareness in the market. There are some folks who live under a rock, and of course, the client will never really spend enough money on advertising to even get close.
  • Third, the sample frame may not be truly representative of the population we hope to project to. This is one of the key issues in political polling because the population is composed of those who actually voted (not registered voters). For models to be correct, we need to predict which voters will actually show up to the polls and how they will vote. The good news in market research is that the population is usually not a moving target.

Now, let’s consider the sources of error in building predictive models. The first step in building a predictive model is to specify the model. If you’re a purist, you begin with a hypothesis, collect the data, test the hypothesis, and draw conclusions. If we fail to reject the null hypothesis, we should formulate a new hypothesis and collect new data. What do we actually do? We mine the data until we get significant results. Why? Because data collection is expensive. One possible outcome of continuing to mine the data in search of a better model is a model that is only good at predicting the data you already have and not very accurate at predicting results from new inputs.
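
One common guard against that trap is to set aside a holdout sample before any mining begins and to judge the final model only on that untouched data. Here is a minimal sketch of the idea using scikit-learn; the file name, predictor columns, and outcome variable are invented for illustration, not drawn from an actual CMB study.

    # A minimal holdout-validation sketch on a hypothetical dataset.
    # "survey_responses.csv" and the "purchased" outcome are placeholder names.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("survey_responses.csv")
    X = df.drop(columns="purchased")
    y = df["purchased"]

    # Reserve 30% of respondents before any model mining starts
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # A large gap between these two numbers is the classic sign of a model that
    # only predicts the data you already have
    print("accuracy on the data used to fit the model:", model.score(X_train, y_train))
    print("accuracy on held-out respondents:", model.score(X_test, y_test))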

It is up to the analyst to decide what is statistically meaningful versus what is managerially meaningful. There are a number of websites where you can find “interesting” relationships in data. Some examples of spurious correlations include (see the illustration after this list):

  • Divorce rate in Maine and the per capita consumption of margarine
  • Number of people who die by becoming entangled in their bedsheets and the total revenue of US ski resorts
  • Per capita consumption of mozzarella cheese (US) and the number of civil engineering doctorates awarded (US)
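
To see how easily “relationships” like these appear by chance, here is a small, purely synthetic illustration: scan enough unrelated series against a target and one of them will usually look impressively correlated, even though every series is random noise.

    # Purely synthetic illustration of chance correlations; nothing here is real data.
    import numpy as np

    rng = np.random.default_rng(0)
    n_years = 12                          # e.g., twelve annual observations
    target = rng.normal(size=n_years)     # stand-in for "divorce rate in Maine"

    # 500 unrelated candidate series ("margarine consumption", "ski revenue", ...)
    candidates = rng.normal(size=(500, n_years))

    # Correlation of each candidate series with the target
    corrs = np.array([np.corrcoef(target, c)[0, 1] for c in candidates])

    best = int(np.abs(corrs).argmax())
    print(f"Best-looking 'predictor' out of 500: series {best}, r = {corrs[best]:.2f}")
    # With this many tries on only 12 data points, a correlation near 0.8 is
    # common even though every series is pure noise.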

In short, you can build a model that’s accurate but still wouldn’t be of any use (or make any sense) to your client. And the fact is, there’s always a certain amount of error in any model we build—we could be wrong, just by chance.  Ultimately, it’s up to the analyst to understand not only the tools and inputs they’re using but the business (or political) context.

Dr. Jay loves designing really big, complex choice models.  With over 20 years of DCM experience, he’s never met a design challenge he couldn’t solve. 

PS – Have you registered for our webinar yet? Join Dr. Erica Carranza as she explains why, to change what consumers think of your brand, you must change their image of the people who use it.

What: The Key to Consumer-Centricity: Your Brand User Image

When: February 1, 2017 @ 1PM EST

Register Now!

Topics: methodology, data collection, Dear Dr. Jay, predictive analytics

A New Year’s Resolution: Closing the Gap Between Intent and Action

Posted by Indra Chapman

Wed, Jan 04, 2017


Are you one of the more than 100 million adults in the U.S. who made a New Year’s resolution? Do you resolve to lose weight, exercise more, spend less and save more, or just be a better person?

Top 10 New Year's Resolutions for 2016:

  • Lose Weight
  • Getting Organized
  • Spend less, save more
  • Enjoy Life to the Fullest
  • Staying Fit and Healthy
  • Learn Something Exciting
  • Quit Smoking
  • Help Others in Their Dreams
  • Fall in Love
  • Spend More Time with Family
[Source: StatisticBrain.com]

The actual number varies from year to year, but generally more than four out of 10 of us make some type of resolution for the New Year. And now that we’re a few days into 2017, we’re seeing the impact of those New Year resolutions. Gyms and fitness classes are crowded (Pilates anyone?), and self-improvement and diet book sales are up.

But… (there’s that inevitable but!), despite the best of intentions, within a week, at least a quarter of us have abandoned that resolution, and by the end of the month, more than a third of us have dropped out of the race. In fact, several studies suggest that only 8% of us actually go on to achieve our resolutions. Alas, we see that behavior no longer follows intention.

It’s not so different in market research because we see the same gap between consumer intention and behavior. Sometimes the gap is fairly small, and other times it’s substantial. Consumers (with the best of intentions) tell us what they plan to do, but their follow through is not always consistent. This, as you might imagine, can lead to bad data.

So what does this mean?

To help close the gap and gather more accurate data, ask yourself the following questions when designing your next study:

  • What are the barriers to adoption or the path to behavior? Are there other factors or elements within the customer journey to consider?
  • Are you assessing the non-rational components? Are there social, psychological, or economic implications to following through with that rational selection? After all, many of us know that exercising daily is good for us – but few of us follow through.
  • Are there other real life factors that you should consider in analysis of the survey? Does the respondent’s financial situation make that preference more aspirational than intentional?

So what are your best practices for closing the gap between consumer intent and action? If you don’t already have a New Year’s resolution (or if you do, add this one!), why not resolve to make every effort to connect consumer intent to behavior in your studies during 2017?

Another great resolution is to become a better marketer!  How?

Register for our upcoming webinar with Dr. Erica Carranza on consumer identity and the power of measuring brand user image to help create meaningful and relevant messaging for your customers and prospects:

Register Now!

Indra Chapman is a Senior Project Manager at CMB, who has resolved to set goals in lieu of New Year’s resolutions this year. In the words of Brad Paisley, the first day of the new year “is the first blank page of a 365-page book. Write a good one.”

Topics: data collection, research design

What We’ve Got Here Is a Respondent Experience Problem

Posted by Jared Huizenga

Thu, Apr 14, 2016

A couple of weeks ago, I was traveling to Austin for CASRO’s Digital Research Conference, and I had an interesting conversation while boarding the plane. [Insert Road Trip joke here.]

Stranger: First time traveling to Austin?

Me: Yeah, I’m going to a market research conference.

Stranger: [blank stare]

Me: It’s a really good conference. I go every year.

Stranger: So, what does your company do?

Me: We gather information from people—usually by having them take an online survey, and—

Stranger: I took one of those. Never again.

Me: Yeah? It was that bad?

Stranger: It was [expletive] horrible. They said it would take ten minutes, and I quit after spending twice that long on it. I got nothing for my time. They basically lied to me.

Me: I’m sorry you had that experience. Not all surveys are like that, but I totally understand why you wouldn’t want to take another one.

Thank goodness the plane started boarding before he could say anything else. Double thank goodness that I wasn’t sitting next to him during the flight.

I’ve been a proud member of the market research industry since 1998. I feel like it’s often the Rodney Dangerfield of professional services, but I’ve always preached about how important the industry is. Unfortunately, I’m finding it harder and harder to convince the general population. The experience my fellow traveler had with his survey points to a major theme of this year’s CASRO Digital Research Conference. Either directly or indirectly, many of the presentations this year were about the respondent experience. It’s become increasingly clear to me that the market research industry has no choice other than to address the respondent experience “problem.”

There were also two related sub-themes—generational differences and living in a digital world—that go hand-in-hand with the respondent experience theme. Fewer people are taking questionnaires on their desktop computers. Recent data suggests that, depending on the specific study, 20-30% of respondents are taking questionnaires on their smartphones. Not surprisingly, this skews towards younger respondents. Also not surprisingly, the percentage of smartphone survey takers is increasing at a rapid pace. Within the next two years, I predict the percent of smartphone respondents will be 35-40%. As researchers, we have to consider the mobile respondent when designing questionnaires.

From a practical standpoint, what does all this mean for researchers like me who are focused on data collection?

  1. I made a bold—and somewhat unpopular—prediction a few years ago that the method of using a single “panel” for market research sample is dying a slow death and that these panels would eventually become obsolete. We may not be quite at that point yet, but we’re getting closer. In my experience, being able to use a single sample source today is very rare except for the simplest of populations.

Action: Understand your sample source options. Have candid conversations with your data collection partners and only work with ones that are 100% transparent. Learn how to smell BS from a mile away, and stay away from those people.

  2. As researchers, part of our job should be to understand how the world around us is changing. So, why do we turn a blind eye to the poor experiences our respondents are having? According to CASRO’s Code of Standards and Ethics, “research participants are the lifeblood of the research industry.” The people taking our questionnaires aren’t just “completes.” They’re people. They have jobs, spouses, children, and a million other things going on in their lives at any given time, so they often don’t have time for your 30-minute questionnaire with ten scrolling grid questions.

Action: Take the questionnaires yourself so you can fully understand what you’re asking your respondents to do. Then take that same questionnaire on a smartphone. It might be an eye opener.

  3. It’s important to educate colleagues, peers, and clients regarding the pitfalls of poor data collection methods. Not only does a poorly designed 30-minute survey frustrate respondents, it also leads to speeding, straightlining, and just not caring. Most importantly, it leads to bad data. It’s not the respondent’s fault—it’s ours. One company stood up at the conference and stated that it won’t take a client project if the survey is too long. But for every company that does this, there are many others that will take that project.

Action: Educate your clients about the potential consequences of poorly designed, lengthy questionnaires. Market research industry leaders as a whole need to do this for it to have a large impact.

Change is a good thing, and there’s no need to panic. Most of you are probably aware of the issues I’ve outlined above. There are no big shocks here. But, being cognizant of a problem and acting to fix the problem are two entirely different things. I challenge everyone in the market research industry to take some action. In fact, you don’t have much of a choice.

Jared is CMB’s Field Services Director and has been in the market research industry for eighteen years. When he isn’t enjoying the exciting world of data collection, he can be found competing at barbecue contests as the pitmaster of the team Insane Swine BBQ.

Topics: data collection, mobile, research design, conference recap