WELCOME TO OUR BLOG!

The posts here represent the opinions of CMB employees and guests—not necessarily the company as a whole. 


Why Standing up for the Census Still Counts

Posted by Athena Rodriguez

Wed, Nov 07, 2018


Over a year ago, I wrote about the critical state of the U.S. Census. To recap: to stay within budget, the U.S. Census Bureau planned to add online and phone data collection to the traditional mail and face-to-face fielding. As any good researchers would, they planned to test this new mix of methodologies using a series of field tests and an end-to-end test.

After cancelling several field tests earlier this year, last month the bureau completed an end-to-end test in Providence County, RI, and says it is “ready to transition from a paper-based census to one where people can respond online using personal computers or mobile devices, by telephone through our questionnaire assistance centers or by using the traditional paper-response option.”

Whew, right?  Not so fast—there’s still another problem. A big one.

Against the recommendation of the Census Bureau, the Secretary of Commerce, Wilbur Ross, is fighting to add a citizenship question to the 2020 Census.

In a memo sent to the DOJ, the bureau’s Chief Scientist and Associate Director for Research and Methodology, John Abowd, wrote that the inclusion of a citizenship question would be “very costly, harms the quality of the census count, and would use substantially less accurate citizenship status data than are available from administrative sources.”

In response, opponents of the question, including the states of California and New York and the American Civil Liberties Union, have filed lawsuits against the Federal Government—echoing Abowd’s fears that the citizenship question would discourage participation and compromise the integrity of the census.

Despite a request by the Federal Government to postpone, the trial began on Monday, November 5, in New York City, and is expected to last two weeks.

As I wrote in my earlier blog, the US Census is critical to market research. It serves as the foundation for things like sampling plans, weighting data, sizing audiences, and determining who to target.

If the citizenship question goes through, it may deter non-citizens from participating. This would seriously harm the quality of the data and pose a threat to the integrity of our industry—not to mention impact federal budgeting and the number of House seats.

As market researchers, it’s our duty to preserve the integrity of the US Census. Whether you support or oppose the citizenship question, I encourage you to pay close attention to how the decision plays out. We’re still a year away from the census, but what’s decided now could have far-reaching ramifications for our industry and country.

Athena is a Project Director at CMB who really hopes the next time she blogs it will be about a satisfactory resolution to this ongoing issue.

Topics: Market research, data collection

Does your metric have a home(plate)?

Posted by Youme Yai

Thu, Sep 28, 2017


Last month I attended a Red Sox/Yankees matchup at Fenway Park. By the seventh inning, the Sox had already cycled through seven pitchers. Fans were starting to lose patience and one guy even jumped on the field for entertainment. While others were losing interest, I stayed engaged in the game—not because of the action that was (not) unfolding, but because of the game statistics.

Statistics have been at the heart of baseball for as long as the sport’s been around. Few other sports track individual and team stats with such precision and detail (I suggest reading Michael Lewis’ Moneyball if you haven’t already). As a spectator, you know exactly what’s happening at all times, and this is one of my favorite things about baseball. As much as I enjoy watching the hits, runs, steals, strikes, etc., unfold on the field, it’s equally fun to watch those plays translate into statistics—witnessing the rise and fall of individual players and teams.

Traditionally, batting average (# of hits divided by # of at bats) and earned run average (# of earned runs allowed by a pitcher per nine innings) have dominated the statistical world of baseball, but there are SO many others recorded. There’s RBI (runs batted in), OPS (on-base plus slugging), ISO (isolated power: the raw power of a hitter, counting only extra-base hits and the type of hit), FIP (fielding independent pitching: similar to ERA, but focused solely on pitching, removing the results of balls hit into the field of play), and even xFIP (expected fielding independent pitching; in layman’s terms: how a pitcher performs independent of how his teammates perform once the ball is in play, while also accounting for home runs given up vs. the league’s home run average). And that's just the tip of the iceberg.
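For the formula-curious, here’s what the two traditional stats look like in Python. This is a minimal sketch, and the season line is invented for illustration:

```python
# Two classic baseball stats, computed from raw counting numbers.

def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at bats."""
    return hits / at_bats

def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings_pitched

# Hypothetical season line: 180 hits in 550 at bats; 70 earned runs in 200 innings.
print(f"AVG: {batting_average(180, 550):.3f}")    # 0.327
print(f"ERA: {earned_run_average(70, 200):.2f}")  # 3.15
```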

With all this data, sabermetrics can yield some unwieldy metrics that have little applicability or predictive power. And sometimes we see this happen in market research. There are times when we’re asked to collect hard-to-justify variables in our studies. While it seems sensible to gather as much information as possible, there’s such a thing as “too much”: the point at which extra data starts to dilute the goal and clarity of the project.

So, I’ll take off my baseball cap and put on my researcher’s hat for this: as you develop your questionnaire, evaluate whether a metric is a “nice to have” or a “need to have.” Here are some things to keep in mind as you evaluate your metrics:

  1. Determine the overall business objective: What is the business question I am looking to answer based on this research? Keep reminding yourself of this objective.
  2. Identify the hypothesis (or hypotheses) that make up the objective: What are the preconceived notions that will lead to an informed business decision?
  3. Establish the pieces of information to prove or disprove the hypothesis: What data do I need to verify the assumption, or invalidate it?
  4. Assess if your metrics align to the information necessary to prove or disprove one or more of your identified hypotheses.

If your metric doesn’t have a home (plate) in one of the hypotheses, then discard it or turn it into one that does. Following this list can make the difference between accumulating a lot of data that produces no actionable results and gathering data that meets your initial business goal. The sketch below shows one way to apply the test.
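To make the “home plate” test concrete, here’s a minimal sketch; the metric and hypothesis names are hypothetical, and the point is simply that every metric should map to a hypothesis:

```python
# Map each proposed questionnaire metric to the hypothesis it helps
# prove or disprove; None means the metric has no home (plate).
metric_to_hypothesis = {
    "purchase_intent": "A lower price point will drive trial among lapsed buyers",
    "brand_awareness": "An awareness gap explains the regional sales difference",
    "favorite_color": None,  # a "nice to have" that proves nothing
}

orphans = [m for m, h in metric_to_hypothesis.items() if h is None]
for metric in orphans:
    print(f"Cut or rework {metric!r}: it supports no hypothesis")
```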

Combing through unnecessary data points is cumbersome and costly, so be judicious with your red pen in striking out useless questions. Don’t get bogged down with information that isn’t directly helping you achieve your business goal. Here at CMB, we partner with clients to avoid this trap, working to meet study objectives starting well before the data collection stage.

Youme Yai is a Project Manager at CMB who believes a summer evening at the ballpark is second to none.


Topics: advanced analytics, predictive analytics, data collection

Dear Dr. Jay: How To Predict Customer Turnover When Transactions are Anonymous

Posted by Dr. Jay Weiner

Wed, Apr 26, 2017

Dear Dr. Jay:

What's the best way to estimate customer turnover for a service business whose customer transactions are usually anonymous?

-Ian S.


Dear Ian,

You have posed an interesting question. My first response was, “You can’t.” But as I think about it some more, you might already have some data in-house that could be helpful in addressing the issue.

It appears you are in the mass transit industry. Most transit companies offer single-ride fares and monthly passes, while companies in college towns often offer semester-long passes. Since the passes (monthly, semester, etc.) are often sold at a discounted rate, we might conclude that all the single-fare revenues are turnover transactions.

This assumption is a small leap of faith, as I’m sure some folks just pay the single-fare price and ride regularly. Let’s consider my boss. He travels a fair amount, and even with the discounted monthly pass, it’s often cheaper for him to pay the single-ride fare. Me, I like the convenience of not having to make sure I have the correct fare in my pocket, so I just pay the monthly rate, even if I don’t use it every day. We both might be candidates for weekly pass sales if we planned ahead for the weeks when we knew we’d be commuting every day versus working from home or traveling. I suspect the only way to get at that dimension would be to conduct some primary research to determine the frequency of ridership and how folks pay.

For your student passes, you probably have enough historical data in-house to compare average semester pass sales to the population of students using them, and to figure out whether you see turnover in those sales. That leaves you needing to estimate the turnover on your monthly pass sales.

You also may have corporate sales that you could look at. For example, here at CMB, employees can purchase their monthly transit passes through our human resources department. Each month our cards are automatically updated so that we don’t have to worry about renewing them every few weeks. I suspect that if we analyzed the monthly sales from our transit system (the MBTA) to CMB, we could determine the turnover rate.
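As a rough illustration of that last idea, suppose you could see which corporate passes renew from one month to the next (the pass IDs below are invented); the share that fails to renew is a simple proxy for turnover:

```python
# Monthly pass holders by pass ID -- hypothetical corporate-account data.
march = {"P001", "P002", "P003", "P004", "P005"}
april = {"P001", "P002", "P004", "P006"}

renewed = march & april  # pass IDs active in both months
turnover_rate = 1 - len(renewed) / len(march)
print(f"March-to-April turnover: {turnover_rate:.0%}")  # 40%
```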

As you can see, you could already have valuable data in-house that can help shed light on customer turnover. I’m happy to look at any information you have and let you know what options you might have in trying to answer your question.

Dr. Jay is CMB’s Chief Methodologist and VP of Advanced Analytics and holds a Zone 3 monthly pass to the MBTA. If it weren’t for the engineer, he wouldn’t make it to South Station every morning.

Keep those questions coming! Ask Dr. Jay directly at DearDrJay@cmbinfo.com or submit your question anonymously by clicking below:

Ask Dr. Jay!

Topics: Dear Dr. Jay, data collection, advanced analytics

If you can’t trust your sample sources, you can’t trust your data

Posted by Jared Huizenga

Wed, Apr 19, 2017

During a recent data collection orientation for new CMB employees, someone asked me how we select the online sample providers we work with on a regular basis. Each week, my Field Services team receives multiple requests from sample providers—some we know from conferences, others from what we’ve read in industry publications, and some that are entirely new to us.

When vetting new sample providers, a good place to start is the ESOMAR 28 Questions to Help Buyers of Online Samples. Per the site, these questions “help research buyers think about issues related to online samples.”

An online sample provider should be able to answer the ESOMAR 28 questions; consider red-flagging any that won’t. If their answers are too brief and don’t provide much insight into their procedures, it’s okay to ask them for more information, or just move along to the next.

While all 28 questions are valuable, here are a few that I pay close attention to:

Please describe and explain the type(s) of online sample sources from which you get respondents. Are these databases?  Actively managed research panels?  Direct marketing lists?  Social networks?  Web intercept (also known as river) samples?  

Many online sample providers use multiple methods, so these options aren’t always exclusive. I’m a firm believer in knowing where the sample is coming from, but there isn’t necessarily one “right” answer to this question. Depending on the project and the population you are looking for, different methods may need to be used to get the desired results.

Are your sample source(s) used solely for market research? If not, what other purposes are they used for? 

Beware of providers that use sample sources for non-research purposes. If a provider states that they’re using their sample for something other than research, at the very least you should probe for more details so that you’re comfortable with what those other purposes are. Otherwise, pass on the provider.

Do you employ a survey router? 

A survey router is software that directs potential respondents to a questionnaire for which they may qualify. There are pros and cons to survey routers, and they have become such a touchy subject that several of the ESOMAR 28 questions are devoted to the topic of routers. I’m not a big fan of survey routers, since they can be easily abused by dishonest respondents. If a company uses a survey router as part of their standard practice, be sure you have a very clear understanding of how the router is used as well as any restrictions they place on router usage.
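For readers who haven’t encountered one, here’s a toy sketch of the routing idea; the studies, qualification rules, and quotas are all invented, and real routers are far more elaborate:

```python
# A toy survey router: each study declares a qualification rule and a quota,
# and a respondent goes to the first open study they qualify for.
studies = [
    {"name": "auto_study",   "qualifies": lambda r: r["owns_car"],  "quota": 2},
    {"name": "travel_study", "qualifies": lambda r: r["age"] >= 25, "quota": 1},
]

def route(respondent: dict) -> str | None:
    for study in studies:
        if study["quota"] > 0 and study["qualifies"](respondent):
            study["quota"] -= 1
            return study["name"]
    return None  # screened out of everything

print(route({"owns_car": False, "age": 30}))  # travel_study
print(route({"owns_car": False, "age": 30}))  # None -- quota exhausted
```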

You should also be wary of any sample provider who tells you that your quality control (QC) measures are too strict. This happened to me a few years ago and, needless to say, it ended our relationship with the company. That’s not to say QC measures can never be too restrictive; when they are, you can actually end up throwing out good data.

At CMB, we did a lot of research prior to implementing our QC standards. We consulted peers and sample providers to get a good understanding of what was fair and reasonable in the market. We investigated speeding criteria, red herring options, and how to look at open-ended responses. We revisit these standards on a regular basis to make sure they’re still relevant.
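To give a flavor of what a speeding criterion can look like (this is one common convention, not necessarily CMB’s actual standard), here’s a minimal pandas sketch that flags anyone finishing in under half the median completion time:

```python
import pandas as pd

# Hypothetical survey completion times, in minutes.
df = pd.DataFrame({"resp_id": [1, 2, 3, 4, 5],
                   "minutes": [12.0, 3.1, 11.5, 10.8, 2.4]})

cutoff = 0.5 * df["minutes"].median()  # half the median: 5.4 minutes here
df["speeder"] = df["minutes"] < cutoff
print(df[df["speeder"]])               # respondents 2 and 5 get flagged
```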

Since each of our tried-and-true providers supports our QC standards, when a new (to us) sample provider tells us we’re rejecting too many of their panelists due to poor quality, you can understand how that raises a red flag. Legitimate sample providers appreciate feedback on “bad” respondents because it helps them improve the quality of their sample.

There are tons of online sample providers in the marketplace, but not every partner is a good fit for everyone. While I won’t make specific recommendations, I urge you to consider the three questions I referenced above when selecting your partner.

At Chadwick Martin Bailey, we’ve worked hard to establish trusted relationships with a handful of online sample providers. They’re dedicated to delivering high quality sample and have a true “partnership” mentality. 

In my world of data collection, recommending the best sample providers to my internal clients is extremely important. This is key to providing our clients with sound insights and recommendations that support confident, strategic decision-making. 

Jared Huizenga is CMB’s Field Services Director and has been in the market research industry for nineteen years. When he isn’t enjoying the exciting world of data collection, he can be found competing at barbecue contests as the pitmaster of the team Insane Swine BBQ.


Topics: data collection, methodology

Spring into Data Cleaning

Posted by Nicole Battaglia

Tue, Apr 04, 2017

When someone hears “spring cleaning” they probably think of organizing their garage, purging clothes from their closet, and decluttering their workspace. For many, spring is a chance to refresh and rejuvenate after a long winter (fortunately ours in Boston was pretty mild).

This may be my inner market researcher talking, but when I think of spring cleaning, the first thing that comes to mind is data cleaning. Like cleaning and organizing your home, data cleaning is a detailed and lengthy process that matters to researchers and their clients alike.

Data cleaning is an arduous task. Each completed questionnaire must be checked to ensure that it's been answered correctly, clearly, truthfully, and consistently. Here’s what we typically clean:

  • We’ll look at each open-ended response in a survey to make sure respondents’ answers are coherent and appropriate. Sometimes respondents will curse; other times they'll write outrageously irrelevant answers, like what they’re having for dinner, so we monitor these closely. We do the same for open-ended numeric responses; there’s always that one respondent who enters ‘50’ when asked how many siblings they have.
  • We also check for outliers in open-ended numeric responses. Whether it’s false data or an exceptional respondent (e.g., Bill Gates), outliers can skew our data, leading us to draw the wrong conclusions and make misinformed recommendations to clients. For example, I worked on a survey that asked respondents how many cars they own. Anyone who provided a number more than three standard deviations above the mean was flagged as an outlier, because their answer would’ve significantly skewed our interpretation of average car ownership; the reality is the average household owns two cars, not six.
  • Straightliners are respondents who answer a battery of scale questions with the same response every time. As a result, we’ll sometimes see someone who strongly agrees or disagrees with two completely opposing statements—making it difficult to trust that these answers reflect the respondent’s real opinion.
  • We often insert a Red Herring Fail into our questionnaires to help identify and weed out distracted respondents. A Red Herring Fail is a 10-point scale question usually placed around the halfway mark of a questionnaire that simply asks respondents to select the number “3” on the scale. If they select a number other than “3”, we flag them for removal.
  • If there’s an incentive to participate in a questionnaire, someone may feel inclined to participate more than once. So, to ensure our completed surveys come from unique individuals, we check for duplicate IP addresses and respondent IDs. (A sketch of several of these checks follows this list.)
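Here’s a minimal pandas sketch of several of these checks; the column names, toy data, and thresholds are illustrative, not CMB’s actual spec:

```python
import pandas as pd

# Hypothetical responses: a three-item agreement grid, a numeric open end,
# the red herring item, and each respondent's IP address.
df = pd.DataFrame({
    "q1": [5, 3, 1, 5], "q2": [5, 4, 1, 2], "q3": [5, 2, 1, 4],
    "num_cars": [2, 1, 3, 40],
    "red_herring": [3, 3, 7, 3],
    "ip": ["1.1.1.1", "2.2.2.2", "1.1.1.1", "4.4.4.4"],
})

grid = df[["q1", "q2", "q3"]]
flags = pd.DataFrame({
    # same answer across the whole grid
    "straightliner": grid.nunique(axis=1) == 1,
    # more than three standard deviations from the mean
    # (on a four-row toy sample this will rarely fire; it needs real sample sizes)
    "outlier": (df["num_cars"] - df["num_cars"].mean()).abs()
               > 3 * df["num_cars"].std(),
    # failed the "select 3" trap question
    "red_herring_fail": df["red_herring"] != 3,
    # same IP address appears more than once
    "dupe_ip": df["ip"].duplicated(keep=False),
})
print(df[flags.any(axis=1)])  # rows flagged for removal or review
```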

There are a lot of variables that can skew our data, so our cleaning process is thorough and thoughtful. And while the process may be cumbersome, here’s why we clean data: 

  • Impression on the client: Following a detailed data cleaning process helps show that your team is cautious, thoughtful, and able to accurately dissect and digest large amounts of data. This demonstration of thoroughness and competency goes a long way toward building trust in the researcher/client relationship, because the client will see their researchers are working to present the best data possible.
  • Helps tell a better story: We pride ourselves on storytelling, using insights from data and turning them into strong deliverables, to help our clients make strategic business decisions. If we didn’t have accurate, clean data, we wouldn’t be able to tell a good story!
  • Overall, ensures high-quality, precise data: At CMB, two or more researchers typically work on the same data file to mitigate the chance of error. The data undergoes such scrutiny so that any issues or mistakes can be caught and rectified, ensuring the integrity of the report.

The benefits of taking the time to clean our data far outweigh the risks of skipping it. Data cleaning keeps false or unrepresentative information from influencing our analyses or recommendations to a client and ensures our sample accurately reflects the population of interest.

So this spring, while you’re finally putting away those holiday decorations, remember that data cleaning is an essential step in maintaining the integrity of your work.

Nicole Battaglia is an Associate Researcher at CMB who prefers cleaning data over cleaning her bedroom.

Topics: data collection, quantitative research