WELCOME TO OUR BLOG!

The posts here represent the opinions of CMB employees and guests—not necessarily the company as a whole. 


Does your metric have a home(plate)?

Posted by Youme Yai

Thu, Sep 28, 2017


Last month I attended a Red Sox/Yankees matchup at Fenway Park. By the seventh inning, the Sox had already cycled through seven pitchers. Fans were starting to lose patience and one guy even jumped on the field for entertainment. While others were losing interest, I stayed engaged in the game—not because of the action that was (not) unfolding, but because of the game statistics.

Statistics have been at the heart of baseball for as long as the sport’s been around. Few other sports track individual and team stats with such precision and detail (I suggest reading Michael Lewis’ Moneyball if you haven’t already). As a spectator, you know exactly what’s happening at all times, and this is one of my favorite things about baseball. As much as I enjoy watching the hits, runs, steals, strikes, etc., unfold on the field, it’s equally fun to watch those plays translate into statistics—witnessing the rise and fall of individual players and teams.

Traditionally, batting average (number of hits divided by number of at bats) and earned run average (number of earned runs allowed by a pitcher per nine innings) have dominated the statistical world of baseball, but there are SO many others recorded. There’s RBI (runs batted in), OPS (on-base plus slugging), ISO (isolated power: the raw power of a hitter, counting only extra-base hits and the type of hit), FIP (fielding independent pitching: similar to ERA but focused solely on pitching, removing results on balls hit into the field of play), and even xFIP (expected fielding independent pitching; or in layman’s terms: how a pitcher performs independent of how his teammates perform once the ball is in play, while also accounting for home runs given up vs. the league’s home run average). And that's just the tip of the iceberg.
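
For readers who like to see the arithmetic, here is a minimal sketch of the traditional formulas named above; the counting stats plugged in are invented for illustration:

```python
# A minimal sketch of the traditional formulas named above, with invented
# counting stats plugged in for illustration.

def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at bats."""
    return hits / at_bats

def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings_pitched

def isolated_power(slugging_pct: float, batting_avg: float) -> float:
    """ISO: slugging percentage minus batting average, i.e. extra bases per at bat."""
    return slugging_pct - batting_avg

print(round(batting_average(180, 550), 3))       # 0.327
print(round(earned_run_average(65, 190.0), 2))   # 3.08
print(round(isolated_power(0.510, 0.290), 3))    # 0.22
```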

With all this data, sabermetrics can yield some unwieldy metrics that have little applicability or predictive power. And sometimes we see this happen in market research. There are times when we are asked to collect hard-to-justify variables in our studies. While it seems sensible to gather as much information as possible, there’s such a thing as “too much” where it starts to dilute the goal and clarity of the project.  

So, I’ll take off my baseball cap and put on my researcher’s hat for this: as you develop your questionnaire, evaluate whether a metric is a “nice to have” or a “need to have.” Here are some things to keep in mind as you evaluate your metrics:

  1. Determine the overall business objective: What is the business question I am looking to answer based on this research? Keep reminding yourself of this objective.
  2. Identify the hypothesis (or hypotheses) that make up the objective: What are the preconceived notions that will lead to an informed business decision?
  3. Establish the pieces of information to prove or disprove the hypothesis: What data do I need to verify the assumption, or invalidate it?
  4. Assess if your metrics align to the information necessary to prove or disprove one or more of your identified hypotheses.

If your metric doesn’t have a home (plate) in one of the hypotheses, then discard it or turn it into one that does. Following this list can make the difference between accumulating a lot of data that produces no actionable results and collecting data that meets your initial business goal.
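
As a quick illustration of that “home (plate)” check, here is a hedged sketch; the objective, hypotheses, and metrics below are hypothetical, not drawn from any actual study:

```python
# A hedged sketch of the "home (plate)" check described above: keep a metric only
# if it maps to at least one hypothesis behind the business objective.
# The objective, hypotheses, and metrics here are hypothetical examples.

objective = "Should we launch a premium subscription tier?"

hypotheses = {
    "H1: Heavy users will pay more for added features",
    "H2: Price is the main barrier for light users",
}

# Each proposed survey metric, mapped to the hypotheses it helps prove or disprove.
proposed_metrics = {
    "willingness to pay, by usage segment": {"H1: Heavy users will pay more for added features"},
    "price sensitivity (Van Westendorp)": {"H2: Price is the main barrier for light users"},
    "favorite brand mascot": set(),  # a "nice to have" with no hypothesis behind it
}

for metric, homes in proposed_metrics.items():
    if homes & hypotheses:
        print(f"KEEP:    {metric}")
    else:
        print(f"DISCARD: {metric} (no home plate -- rework it or cut it)")
```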

Combing through unnecessary data points is cumbersome and costly, so be judicious with your red pen in striking out useless questions. Don’t get bogged down with information if it isn’t directly helping achieve your business goal. Here at CMB, we partner with clients to minimize this effect and help meet study objectives starting well before the data collection stage.

Youme Yai is a Project Manager at CMB who believes a summer evening at the ballpark is second to none.

 

Topics: advanced analytics, data collection, predictive analytics

Dear Dr. Jay: How To Predict Customer Turnover When Transactions are Anonymous

Posted by Dr. Jay Weiner

Wed, Apr 26, 2017

Dear Dr. Jay:

What's the best way to estimate customer turnover for a service business whose customer transactions are usually anonymous?

-Ian S.


Dear Ian,

You have posed an interesting question. My first response was, “You can’t.” But as I think about it some more, you might already have some data in-house that could be helpful in addressing the issue.

It appears you are in the mass transit industry. Most transit companies offer single-ride fares and monthly passes, while companies in college towns often offer semester-long passes as well. Since the passes (monthly, semester, etc.) are often sold at a discounted rate, we might conclude that all single-fare revenues are turnover transactions.

This assumption is a small leap of faith as I’m sure some folks just pay the single fare price and ride regularly. Let’s consider my boss. He travels a fair amount and even with the discounted monthly pass, it’s often cheaper for him to pay the single ride fare. Me, I like the convenience of not having to make sure I have the correct fare in my pocket so I just pay the monthly rate, even if I don’t use it every day. We both might be candidates for weekly pass sales if we planned for those weeks when we know we’d be commuting every day versus working from home or traveling. I suspect the only way to get at that dimension would be to conduct some primary research to determine the frequency of ridership and how folks pay.

For your student passes, you probably have enough historic data in-house to compare your average semester pass sales to the population of students using them and can figure out if you see turnover in those sales. That leaves you needing to estimate the turnover on your monthly pass sales.
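
To make that concrete, here is a rough sketch of the semester-pass comparison, assuming pass sales can be tied to anonymized pass IDs from one term to the next; the IDs and counts below are invented for illustration:

```python
# A rough sketch of the semester-pass comparison, assuming sales can be tied to
# anonymized pass IDs from one term to the next. IDs and counts are invented.

fall_pass_holders = {"a1", "a2", "a3", "a4", "a5", "a6"}
spring_pass_holders = {"a2", "a3", "a5", "b1", "b2"}

retained = fall_pass_holders & spring_pass_holders
turnover_rate = 1 - len(retained) / len(fall_pass_holders)

print(f"Retained: {len(retained)} of {len(fall_pass_holders)} fall pass holders")
print(f"Estimated semester-to-semester turnover: {turnover_rate:.0%}")  # 50%
```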

You also may have corporate sales that you could look at. For example, here at CMB, employees can purchase their monthly transit passes through our human resources department. Each month our cards are automatically updated so that we don’t have to worry about renewing them every few weeks. I suspect if we analyzed the monthly sales from our transit system (the MBTA) to CMB, we could determine the turnover rate.

As you can see, you could already have valuable data in-house that can help shed light on customer turnover. I’m happy to look at any information you have and let you know what options you might have in trying to answer your question.

Dr. Jay is CMB’s Chief Methodologist and VP of Advanced Analytics and holds a Zone 3 monthly pass to the MBTA. If it wasn’t for the engineer, he wouldn’t make it to South Station every morning.

Keep those questions coming! Ask Dr. Jay directly at DearDrJay@cmbinfo.com or submit your question anonymously by clicking below:

Ask Dr. Jay!

Topics: advanced analytics, data collection, Dear Dr. Jay

If you can’t trust your sample sources, you can’t trust your data

Posted by Jared Huizenga

Wed, Apr 19, 2017

During a recent data collection orientation for new CMB employees, someone asked me how we select the online sample providers we work with on a regular basis. Each week, my Field Services team receives multiple requests from sample providers—some we know from conferences, others from what we’ve read in industry publications, and some that are entirely new to us.

When vetting new sample providers, a good place to start is the ESOMAR 28 Questions to Help Buyers of Online Samples. Per the site, these questions “help research buyers think about issues related to online samples.”

An online sample provider should be able to answer the ESOMAR 28 questions; consider red flagging any that won’t. If their answers are too brief and don’t provide much insight into their procedures, it’s okay to ask them for more information, or just move along to the next. 

While all 28 questions are valuable, here are a few that I pay close attention to:

Please describe and explain the type(s) of online sample sources from which you get respondents. Are these databases?  Actively managed research panels?  Direct marketing lists?  Social networks?  Web intercept (also known as river) samples?  

Many online sample providers use multiple methods, so these options aren’t always exclusive. I’m a firm believer in knowing where the sample is coming from, but there isn’t necessarily one “right” answer to this question. Depending on the project and the population you are looking for, different methods may need to be used to get the desired results.

Are your sample source(s) used solely for market research? If not, what other purposes are they used for? 

Beware of providers that use sample sources for non-research purposes. If a provider states that they are using their sample for something other than research, at the very least you should probe for more details so that you feel comfortable with what those other purposes are. Otherwise, pass on the provider.

Do you employ a survey router? 

A survey router is software that directs potential respondents to a questionnaire for which they may qualify. There are pros and cons to survey routers, and they have become such a touchy subject that several of the ESOMAR 28 questions are devoted to the topic of routers. I’m not a big fan of survey routers, since they can be easily abused by dishonest respondents. If a company uses a survey router as part of their standard practice, be sure you have a very clear understanding of how the router is used as well as any restrictions they place on router usage.

You should also be wary of any sample provider who tells you that your quality control (QC) measures are too strict. This happened to me a few years ago and, needless to say, it ended our relationship with the company. That’s not to say QC measures can never be too restrictive; when they are, you can actually end up throwing out good data.

At CMB, we did a lot of research prior to implementing our QC standards.  We consulted peers and sample providers to get a good understanding of what was fair and reasonable in the market. We investigated speeding criteria, red herring options, and how to look at open-ended responses. We revisit these standards on a regular basis to make sure they are still relevant. 
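
To illustrate what one of those checks can look like in practice, here is a hypothetical sketch of a speeding criterion; the 40%-of-median cutoff and the times below are assumptions for the example, not an actual CMB standard:

```python
# A hypothetical speeding check: flag anyone who finished in less than 40% of the
# median completion time. The cutoff and the times below are illustrative only.

from statistics import median

completion_minutes = {
    "r001": 14.2, "r002": 12.8, "r003": 3.1,   # r003 finished suspiciously fast
    "r004": 15.5, "r005": 13.0, "r006": 11.9,
}

cutoff = 0.4 * median(completion_minutes.values())
speeders = [rid for rid, minutes in completion_minutes.items() if minutes < cutoff]

print(f"Speeding cutoff: {cutoff:.1f} minutes")
print(f"Flagged for review: {speeders}")  # ['r003']
```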

Since each of our tried-and-true providers supports our QC standards, when a new (to us) sample provider tells us we’re rejecting too many of their panelists due to poor quality, you can understand how that raises a red flag. Legitimate sample providers will appreciate the feedback on “bad” respondents because it helps them improve the quality of their sample.

There are tons of online sample providers in the marketplace, but not every partner is a good fit for everyone. While I won’t make specific recommendations, I urge you to consider the three questions I referenced above when selecting your partner.

At Chadwick Martin Bailey, we’ve worked hard to establish trusted relationships with a handful of online sample providers. They’re dedicated to delivering high quality sample and have a true “partnership” mentality. 

In my world of data collection, recommending the best sample providers to my internal clients is extremely important. This is key to providing our clients with sound insights and recommendations that support confident, strategic decision-making. 

Jared Huizenga is CMB’s Field Services Director and has been in the market research industry for nineteen years. When he isn’t enjoying the exciting world of data collection, he can be found competing at barbecue contests as the pitmaster of the team Insane Swine BBQ.

 

 

Topics: methodology, data collection

Spring into Data Cleaning

Posted by Nicole Battaglia

Tue, Apr 04, 2017

When someone hears “spring cleaning” they probably think of organizing their garage, purging clothes from their closet, and decluttering their workspace. For many, spring is a chance to refresh and rejuvenate after a long winter (fortunately ours in Boston was pretty mild).

This may be my inner market researcher talking, but when I think of spring cleaning, the first thing that comes to mind is data cleaning. Like cleaning and organizing your home, data cleaning is a detailed and lengthy process that is relevant to researchers and their clients.

Data cleaning is an arduous task. Each completed questionnaire must be checked to ensure that it's been answered correctly, clearly, truthfully, and consistently. Here’s what we typically clean (a small code sketch of several of these checks follows the list):

  • We’ll look at each open-ended response in a survey to make sure respondents’ answers are coherent and appropriate. Sometimes respondents will curse, other times they'll write outrageously irrelevant answers like what they’re having for dinner, so we monitor these closely. We do the same for open-ended numeric responses: there’s always that one respondent who enters ‘50’ when asked how many siblings they have.
  • We also check for outliers in open-ended numeric responses. Whether it’s false data or an exceptional respondent (e.g., Bill Gates), outliers can skew our data and lead us to draw the wrong conclusions and make misguided recommendations to clients. For example, I worked on a survey that asked respondents how many cars they own. Anyone who provided a number that was three standard deviations above the mean was flagged as an outlier, because their answers would’ve significantly impacted our interpretation of average car ownership: the reality is the average household owns two cars, not six.
  • Straightliners are respondents who answer a battery of questions on the same scale with the same response. As a result, we’ll sometimes see someone who strongly agrees (or disagrees) with two completely opposing statements, making it difficult to trust that these answers reflect the respondent’s real opinion.
  • We often insert a Red Herring Fail into our questionnaires to help identify and weed out distracted respondents. A Red Herring Fail is a 10-point scale question usually placed around the halfway mark of a questionnaire that simply asks respondents to select the number “3” on the scale. If they select a number other than “3”, we flag them for removal.
  • If there’s an incentive to participate in a questionnaire, someone may feel inclined to participate more than once. So, to ensure our completed surveys are from unique individuals, we check for duplicate IP addresses and respondent IDs.
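
Here is the small code sketch promised above: a hypothetical pass over a toy data file that applies several of these checks (straightlining, the red herring question, duplicate IDs, and the three-standard-deviation outlier rule). The column names and data are invented for illustration:

```python
# A small, hypothetical pass over a toy data file applying several of the checks
# above: straightlining, the red herring question, duplicate IDs, and the
# three-standard-deviation outlier rule. Column names and data are invented.

import pandas as pd

df = pd.DataFrame({
    "respondent_id": ["r1", "r2", "r3", "r3", "r4", "r5"],
    "grid_q1":       [5, 7, 7, 7, 2, 6],   # battery of same-scale questions
    "grid_q2":       [4, 7, 7, 7, 6, 5],
    "grid_q3":       [6, 7, 7, 7, 3, 6],
    "red_herring":   [3, 3, 5, 5, 3, 3],   # "please select the number 3"
})

grid = ["grid_q1", "grid_q2", "grid_q3"]
flags = pd.DataFrame({
    "straightliner":    df[grid].nunique(axis=1) == 1,     # same answer across the battery
    "red_herring_fail": df["red_herring"] != 3,
    "duplicate_id":     df["respondent_id"].duplicated(),  # keeps the first occurrence
})
df["remove"] = flags.any(axis=1)
print(df[["respondent_id", "remove"]])

# Outlier rule for open-ended numerics: flag anything more than three standard
# deviations above the mean (only meaningful on a realistically sized file).
cars_owned = pd.Series([2] * 12 + [3] * 7 + [50])
outliers = cars_owned[cars_owned > cars_owned.mean() + 3 * cars_owned.std()]
print(outliers)  # the lone "50" is flagged
```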

There are a lot of variables that can skew our data, so our cleaning process is thorough and thoughtful. And while the process may be cumbersome, here’s why we clean data: 

  • Impression on the client: Following a detailed data cleaning process helps show that your team is cautious, thoughtful, and able to accurately dissect and digest large amounts of data. This demonstration of thoroughness and competency goes a long way toward building trust in the researcher/client relationship because the client will see their researchers are working to present the best data possible.
  • Helps tell a better story: We pride ourselves on storytelling, using insights from data and turning them into strong deliverables to help our clients make strategic business decisions. If we didn’t have accurate and clean data, we wouldn’t be able to tell a good story!
  • Overall, ensures high-quality and precise data: At CMB, typically two or more researchers work on the same data file to mitigate the chance of error. The data undergoes such scrutiny so that any issues or mistakes can be noted and rectified, ensuring the integrity of the report.

The benefits of taking the time to clean our data far outweigh the risks of skipping it. Data cleaning keeps false or unrepresentative information from influencing our analyses or recommendations to a client and ensures our sample accurately reflects the population of interest.

So this spring, while you’re finally putting away those holiday decorations, remember that data cleaning is an essential step in maintaining the integrity of your work.

Nicole Battaglia is an Associate Researcher at CMB who prefers cleaning data over cleaning her bedroom.

Topics: data collection, quantitative research

A Lesson in Storytelling from the NFL MVP Race

Posted by Jen Golden

Thu, Feb 02, 2017


There’s always a lot of debate in the weeks leading up to the NFL’s announcement of its regular season MVP. While the recipient is often from a team with a strong regular season record, it’s not always that simple. Of course the MVP's season stats are an important factor in who comes out on top, but a good story also influences the outcome. 

Take this year: we have a few excellent contenders for the crown, including…

  • Ezekiel Elliott, the rookie running back on the Dallas Cowboys
  • Tom Brady, the NE Patriots QB coming back from a four-game “Deflategate” suspension
  • Matt Ryan, the Atlanta Falcons veteran “nice-guy” QB having a career year

Ultimately, deciding the winner is a mix of art and science. And while you’re probably wondering what this has to do with market research, the NFL regular season MVP selection process has a few important things in common with the creation of a good report.

First, make a framework: Having a framework for your research project can help keep you from feeling overwhelmed by the amount of data in front of you. In the MVP race, for example, voters should start by listing attributes they think make an MVP: team record, individual record, strength of schedule, etc. These attributes are a good way to narrow down potential candidates. In research, the framework might include laying out the business objectives and the data available for each. This outline helps focus the narrative and guide the story’s structure.

Then, look at the whole picture: Once the data is compiled, take a step back and think about how the pieces relate to one another and the context of each. Let’s look at Tom Brady’s regular season stats as an example. He lags behind league leaders on total passing yards and TDs, but remember that he missed four games with a suspension. When the regular season is only 16 games, missing a quarter of them was a missed opportunity to put up numbers, so you can’t help but wonder if it’s a fair comparison to make. Here’s where it’s important to look at the whole picture (whether we’re talking about research or MVP picks). If you don’t have the entire context, you could dismiss Brady altogether. In research, a meaningful story builds on all the primary data within larger social, political, and/or business contexts.

Finally, back it up with facts: Once the pieces have come together, you need to back up your key storyline (or MVP pick) with facts to prove your credibility. For example, someone could vote for Giants wide receiver Odell Beckham Jr. because of an impressive once-in-a-lifetime catch he made during the regular season. But beyond the catch there wouldn’t be much data to support that he was more deserving than the other candidates. In a research report, you must support your story with solid data and evidence.

The predictions will continue until the 2016 regular season MVP is named, but whoever that ends up being, he will have a strong story and the stats to back it up.

 Jen is a Sr. PM on the Technology/E-commerce team. She hopes Tom Brady will take the MVP crown to silence his “Deflategate” critics – what a story that would be.

Topics: data collection, storytelling, marketing science