
My Data Quality Obsession

Posted by Laurie McCarthy on Tue, Jan 12, 2016

Yesterday I got at least 50 emails, not counting what went to my spam folder, and at least half of those went straight into the trash. So, I know what a challenge it is to get a potential respondent to even open an email that contains a questionnaire link. We’re always striving to discover and implement new ways to reach respondents and keep them engaged: mobile optimization is key, but we also consider incentive levels and types, subject lines, and, of course, better ways to ask questions, like highlighter exercises, sliding scales, interactive web simulations, and heat maps. This project customization also gives us the flexibility needed to communicate with respondents in hard-to-reach groups.

Once we’ve got those precious respondents, the question remains: are we reaching the RIGHT respondents and keeping them engaged? How can we evaluate the data efficiently prior to any analysis?

Even with these added protections against “bad”/professional respondents in place, the data quality control process remains an important aspect of each project. We have set standards, starting in the programming phase and continuing through the final review of the data, to identify and eliminate “bad” respondents before conducting any analysis.

We start from a conservative standpoint during programming, flagging respondents who fail any of the criteria in the list below. These respondents are not permanently removed from the data at this point; instead, they are categorized as incompletes and can be reviewed later if we feel they provide value to the study. A sketch of how these checks might be coded follows the list:

  • “Speedsters”: Respondents who complete the questionnaire in 1/5 of the overall median time or less. This check is applied after approximately the first 20% of completes or the first 100, whichever comes first, so the median is reasonably stable.
  • “Grid Speedsters”: When applicable, respondents who, for two or more grids of ten or more items, have a grid completion time more than 2 standard deviations below the mean for that grid. Again, this is applied after approximately the first 20% or first 100 completes, whichever comes first.
  • “Red-Herrings”: We incorporate a standard scale question (0-10), programmed at or around the estimated 10-minute mark in the questionnaire, asking the respondent to select a specific number on the scale. Respondents who do not select that number are flagged.
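
To make these timing rules concrete, here is a minimal sketch of how the two speed checks might be implemented in Python with pandas. The DataFrame layout, the column names (total_seconds and the per-grid timing columns), and the function names are hypothetical illustrations of the rules above, not our production system.

```python
import pandas as pd

# Illustrative sketch only: column names and layout are assumptions,
# chosen to mirror the flagging rules described in the list above.

def flag_speedsters(df: pd.DataFrame, target_n: int) -> pd.Series:
    """Flag completes at or below 1/5 of the overall median time.

    Only applied once ~20% of the target (or 100 completes, whichever
    comes first) is in, so the median is reasonably stable.
    """
    min_completes = min(round(0.20 * target_n), 100)
    if len(df) < min_completes:
        return pd.Series(False, index=df.index)  # too early to evaluate
    cutoff = df["total_seconds"].median() / 5
    return df["total_seconds"] <= cutoff

def flag_grid_speedsters(df: pd.DataFrame, grid_time_cols: list[str]) -> pd.Series:
    """Flag respondents more than 2 SD faster than the mean on 2+ grids."""
    too_fast = pd.DataFrame({
        col: df[col] < df[col].mean() - 2 * df[col].std()
        for col in grid_time_cols
    })
    return too_fast.sum(axis=1) >= 2
```

In practice, anyone these functions flag would be reclassified as an incomplete for later review, not deleted outright.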

This process allows us to begin the data quality review during fielding, so that the blatantly “bad” respondents are removed prior to the close of data collection.

However, our process extends to the final data as well. After fielding is complete, we review the data for the following (a code sketch of the duplicate and straight-lining checks follows the list):

  • Duplicate respondents: Even with unique links and passwords (for online surveys), we review the data based on the email/phone number provided and the IP address to remove respondents who do not appear to be unique.
  • Additional speedsters: Respondents who completed the questionnaire in an implausibly short amount of time. We take any brand/product rotation into consideration as well (evaluating one brand/product takes less time than evaluating several).
  • Straight-liners: Similar to the grid speedsters above, we review respondents who have selected only one value for each attribute in a grid. We flag respondents who have straight-lined each grid to create a sum of “straight-liners.” We review this metric on its own as well as in conjunction with overall completion time; the rationale is that respondents who select only one value throughout the questionnaire will typically also have sped through it.
  • Inconsistent response patterns: Grids sometimes include reverse-scaled attributes, and we review those for contradictory responses. Another example might be a respondent who indicates he/she uses a specific brand and, later in the study, indicates he/she is not aware of that brand.
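
The post-field checks for duplicates and straight-lining are just as mechanical. A minimal sketch, again with hypothetical column names and grid layout:

```python
import pandas as pd

def count_straightlined_grids(df: pd.DataFrame,
                              grids: dict[str, list[str]]) -> pd.Series:
    """Per respondent, count grids answered with a single repeated value.

    `grids` maps a grid name to its item columns (hypothetical layout).
    A grid is straight-lined when every item shares one value.
    """
    straightlined = pd.DataFrame({
        name: df[cols].nunique(axis=1) == 1
        for name, cols in grids.items()
    })
    return straightlined.sum(axis=1)  # the "sum of straight-liners" metric

def flag_duplicates(df: pd.DataFrame) -> pd.Series:
    """Flag later rows repeating an earlier email, phone, or IP address."""
    return (df.duplicated(subset=["email"], keep="first")
            | df.duplicated(subset=["phone"], keep="first")
            | df.duplicated(subset=["ip_address"], keep="first"))
```

As noted above, the straight-lining count is read alongside overall completion time before anyone is actually removed.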

While these checks alone may not eliminate respondents, we do examine other factors as a “common sense” review:

  • Gibberish verbatims: Random letters/symbols, or references that do not pertain to the study, across each open-ended response (a rough screening heuristic follows this list)
  • Demographic review: A review of the demographic information to ensure responses are reasonable and in line with the specifications of the study
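
Gibberish detection is ultimately a human judgment call, but a rough first-pass filter can queue suspicious verbatims for review. A heuristic sketch, with thresholds that are arbitrary illustrations rather than tested cutoffs:

```python
import re

def looks_like_gibberish(text: str) -> bool:
    """Rough first pass: flag open ends that are empty, mostly
    non-letters, built from one or two characters, or vowel-free."""
    stripped = text.strip()
    if not stripped:
        return True
    letters = re.sub(r"[^A-Za-z]", "", stripped)
    if len(letters) < 0.5 * len(stripped):
        return True   # mostly symbols/digits, e.g. "$%^&*"
    if len(set(stripped.lower())) <= 2:
        return True   # e.g. "aaaaa" or "asasas"
    if not re.search(r"[aeiou]", letters.lower()):
        return True   # e.g. "sdfkgh"
    return False
```

Anything this flags would still be read by a person; the goal is to prioritize review, not to auto-delete.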

As part of our continuing partnership with panel sample providers, we give them the panel IDs and information of respondents who fail our quality control process. In some instances, when the client or the analysis requires a certain sample size, this may also mean replacing bad respondents. This collaboration allows us to stand behind the quality of the respondents we provide for analysis and reporting, while also meeting the needs of our clients in a challenging environment.

Our clients rely on us to manage all aspects of data collection when we partner with them to develop a questionnaire, and our stringent data quality control process ensures that we can do just that while providing data that will support their business decisions.

Laurie McCarthy is a Senior Data Manager at CMB. Though an avid fan of Excel formulas and solving data problems, she has never seen Star Wars. Live long and prosper.

We recently did a webinar on research we conducted in partnership with venture capital firm Foundation Capital. This webinar will help you think about Millennials and their investing, including specific financial habits and the attitudinal drivers of their investing preferences.

Watch Here!

 

Topics: Chadwick Martin Bailey, methodology, data collection, quantitative research