WELCOME TO OUR BLOG!

The posts here represent the opinions of CMB employees and guests—not necessarily the company as a whole. 


How Advanced Analytics Saved My Commute

Posted by Laura Dulude

Wed, Aug 22, 2018


I don’t like commuting. Most people don’t. If you analyzed the emotions that commuting evokes, you’d probably hear commuters say it makes them feel frustrated, tired, and bored. To be fair, my commute experience isn’t as bad as it could be: I take a ~20-minute ride into Boston on the Orange Line, plus some walking before and after.

Still, wanting to minimize my discomfort during my time on the train and because I am who I am, I tracked my morning commute for about 10 months. I logged the number of other people waiting on the platform, number of minutes until the next train, time spent on the train, delays announced, the weather, and several other factors I thought might be related to a negative experience.

Ultimately, I decided the most frustrating part about my commute is how crowded the train is—the less crowded I am, the happier I feel. So, I decided to predict my subjective crowd rating for each day using other variables in my commuting dataset.

In this example, I’ve used a TreeNet analysis. TreeNet is the type of driver modeling we do most often at CMB because it’s flexible, allows you to include categorical predictors without creating dummy variables, handles missing data without much pre-processing, resists outliers, and does better with correlated independent variables than other techniques do.

TreeNet scores are shown relative to each other: the most important input always scores 100, and every other independent variable is scaled against that top variable. So, as you can see in Figure 1, the time I board the train and the day of the week are about half as important as the number of people on the platform when I board. As it turns out, I probably can’t do all that much to affect my commute, but I can at least know when it’ll be particularly unpleasant.

Figure 1: Importance to Crowding_commuter
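
If you want to tinker with this kind of driver analysis yourself, an open-source gradient boosting library can stand in for TreeNet (Salford Systems’ proprietary implementation). Below is a minimal sketch in Python using scikit-learn; the data file and column names are hypothetical, and note that unlike TreeNet, this stand-in needs dummy-coded categoricals and complete (non-missing) inputs.

```python
# Minimal sketch of a driver analysis using scikit-learn's gradient boosting
# as a stand-in for TreeNet (Salford Systems' proprietary implementation).
# The data file and column names are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

commute = pd.read_csv("commute_log.csv")  # ~10 months of logged morning commutes

predictors = ["platform_count", "boarding_time", "weekday",
              "minutes_to_next_train", "delay_announced", "weather"]
X = pd.get_dummies(commute[predictors])   # dummy-code categorical inputs (TreeNet wouldn't need this)
y = commute["crowding_rating"]            # my subjective crowd rating for each day

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Rescale importances so the top driver scores 100, as in Figure 1
importance = pd.Series(model.feature_importances_, index=X.columns)
relative = (importance / importance.max() * 100).sort_values(ascending=False)
print(relative.round(1))
```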

What this importance chart doesn’t tell you is the relationship each item has to the dependent variable. For example, which weekdays have lower vs. higher crowding? Per-variable charts give us more information:

Figure 2: Weekday and Crowding_commuter

Figure 2 indicates that crowding lessens as the week goes on. Perhaps people are switching to ride-sharing services or working from home those days.

For continuous variables, like boarding time, we can explore the relationships through line charts:

Figure 3: Boarding Time and Crowding_commuter
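
Sticking with the scikit-learn stand-in sketched above, a comparable per-variable view comes from partial dependence plots, which trace how the model’s predicted crowding rating changes across values of a single input (again, a hypothetical illustration rather than the TreeNet output itself).

```python
# Hypothetical continuation of the earlier sketch: a partial dependence plot
# traces how the predicted crowding rating changes with boarding time
# (assumed to be stored numerically, e.g., minutes after 7:00 AM).
import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

PartialDependenceDisplay.from_estimator(model, X, features=["boarding_time"])
plt.show()
```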

Looks like I should get up on the earlier side if I want to have the best commuting experience! Need to tackle a thornier issue than your morning commute? Our Advanced Analytics team is the best in the business—contact us and let’s talk about how we can help!

 Laura Dulude is a data nerd and a grumpy commuter who just wants to get to work.

Topics: advanced analytics, EMPACT, emotional measurement, data visualization

Predicting Olympic Gold

Posted by Jen Golden

Wed, Feb 21, 2018


From dangerous winds and curling scandals to wardrobe malfunctions, there’s been no shortage of attention-grabbing headlines at the 2018 Winter Olympics.

And for ardent supporters of Team USA, the big story is America’s lagging medal count. We’re over halfway through the games, and currently the US sits in fifth place behind Norway, Germany, Canada, and the Netherlands.

Based on last week’s performance (and Mikaela Shiffrin’s recent withdrawal from the women’s downhill event), it’s hard to know for sure how America will place. However, we can use predictive analytics to identify the main predictors of medal count and anticipate which countries will generally end up on the podium.

We’ll use TreeNet modeling to identify the main drivers of medal count based on previous Winter Olympics outcomes. For the sake of simplicity, we’ll focus on the 2014 Sochi Winter Games (excluding all Russia data, which would skew the model!). From there, we can infer similarities between medal drivers for Sochi and PyeongChang.

Please note that all of these results are hypothetical and not reflective of actual data!

To successfully run a TreeNet analysis, you need both a dependent variable (i.e., the outcome you are trying to predict) and independent variables (i.e., the inputs that could be possible predictors of the dependent variable). A minimal sketch of this setup follows the list below.

In this case…

Dependent variable: Total 2014 Sochi Winter Games medal count
Independent variables (including data both directly related to the Olympics and otherwise):

  • Medal count at the Vancouver Olympic games
  • Medal count at previous Winter Games (all time)
  • Number of athletes participating
  • Number of events participating in
  • Number of outdoor events participating in (e.g., downhill skiing, bobsled)
  • Number of indoor events participating in (e.g., figure skating, curling)
  • Average country temperature
  • Average country yearly snowfall
  • Country population
  • Country GDP per capita
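
Purely as an illustration, that setup could be sketched with the same scikit-learn stand-in used in the commute post; the file and column names below are invented to match the list above, and (as noted) the data and results are hypothetical.

```python
# Hypothetical sketch of the Olympics setup with a gradient boosting stand-in
# for TreeNet. File and column names are illustrative only.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

games = pd.read_csv("sochi_2014_by_country.csv")
games = games[games["country"] != "Russia"]     # excluded so it doesn't skew the model

y = games["medal_count_2014"]                    # dependent variable: Sochi medal count
X = games[["medal_count_2010", "medal_count_all_time",
           "n_athletes", "n_events", "n_outdoor_events", "n_indoor_events",
           "avg_temperature", "avg_yearly_snowfall",
           "population", "gdp_per_capita"]]      # independent variables

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Scale importances so the top predictor scores 100
importance = pd.Series(model.feature_importances_, index=X.columns)
print((importance / importance.max() * 100).sort_values(ascending=False).round(1))
```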

The Results!

Our model shows the relative importance of each variable calibrated to a 100-point scale. The most important variable is assigned a score of 100 while all other variables are scaled relative to that:

[Figure: Olympic Medal Predictors]

In this sample output, previous medal history is the top predictor of Olympic medal outcome with a score of 100, while the number of outdoor and indoor events a country participates in are the least predictive.

This is a fun and simple example of how we could use TreeNet to forecast the Winter Olympic medal count. But we also leverage this same technique to help clients predict the outcomes of some of their most complex and challenging questions. We can help predict things like consideration, satisfaction, or purchase intent, and use the model to point to which levers can be pulled to improve the outcome.

Jen is a Sr. Project Manager at CMB who was a spectator at the Sochi winter games and wishes she was in PyeongChang right now.

Topics: predictive analytics, advanced analytics

CMB's Advanced Analytics Team Receives Children's Trust Partnership Award

Posted by Megan McManaman

Wed, Nov 01, 2017


We're proud to announce that CMB’s VP of Advanced Analytics, Dr. Jay Weiner, and Senior Analyst Liz White were honored with the Children’s Trust’s Partnership Award. Presented annually, the award recognizes the organizations and people whose work directly impacts the organization's mission: stopping child abuse.

Jay and Liz were recognized for their work helping the Children’s Trust identify the messaging that resonated with potential donors and program users. Through two studies leveraging CMB’s emotional impact analysis (EMPACT), Max Diff scaling, concept testing, self-explicated conjoint, and a highlighter exercise, the CMB team helped the Children's Trust identify the most appealing and compelling messaging.

“There is no one more deserving of this award than the team at CMB,” said Children’s Trust’s Executive Director, Suzin Bartley. “The messaging guidance CMB provided has been invaluable in helping us realize our mission to prevent child abuse in Massachusetts.”

Giving back to our community is part of CMB's DNA, and we’re honored to support the Children’s Trust’s mission to stop child abuse in Massachusetts. Click here to learn more about how the Children’s Trust provides families with programs and services to help them build the skills and confidence they need to make sure kids have safe and healthy childhoods.

From partnering with the Children’s Trust and volunteering at Boston’s St. Francis House to participating in the Leukemia & Lymphoma Society’s annual Light the Night walk, we have a longstanding commitment to serving our community. Learn more about CMB in the community here.


Topics: Community Involvement, advanced analytics, predictive analytics

Does your metric have a home(plate)?

Posted by Youme Yai

Thu, Sep 28, 2017


Last month I attended a Red Sox/Yankees matchup at Fenway Park. By the seventh inning, the Sox had already cycled through seven pitchers. Fans were starting to lose patience and one guy even jumped on the field for entertainment. While others were losing interest, I stayed engaged in the game—not because of the action that was (not) unfolding, but because of the game statistics.

Statistics have been at the heart of baseball for as long as the sport’s been around. Few other sports track individual and team stats with such precision and detail (I suggest reading Michael Lewis’ Moneyball if you haven’t already). As a spectator, you know exactly what’s happening at all times, and this is one of my favorite things about baseball. As much as I enjoy watching the hits, runs, steals, strikes, etc., unfold on the field, it’s equally fun to watch those plays translate into statistics—witnessing the rise and fall of individual players and teams.

Traditionally, batting average (# of hits divided by # of at bats) and earned run average (# of earned runs allowed by a pitcher per nine innings) have dominated the statistical world of baseball, but there are SO many others recorded. There’s RBI (runs batted in), OPS (on-base plus slugging), ISO (isolated power: the raw power of a hitter, counting only extra-base hits and the type of hit), FIP (fielding independent pitching: similar to ERA but focused solely on pitching, removing results on balls hit into the field of play), and even xFIP (expected fielding independent pitching; or in layman’s terms: how a pitcher performs independent of how his teammates perform once the ball is in play, but also accounting for home runs given up vs. the league’s home run average). And that's just the tip of the iceberg.
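
To make those first two concrete, the arithmetic is simple; here’s a toy sketch with made-up numbers.

```python
# Toy arithmetic for the two classic stats defined above (made-up numbers).
hits, at_bats = 45, 150
batting_average = hits / at_bats             # hits divided by at bats -> 0.300

earned_runs, innings_pitched = 28, 90.0
era = earned_runs / innings_pitched * 9      # earned runs allowed per nine innings -> 2.80

print(f"AVG: {batting_average:.3f}  ERA: {era:.2f}")
```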

With all this data, sabermetrics can yield some unwieldy metrics that have little applicability or predictive power. And sometimes we see this happen in market research. There are times when we are asked to collect hard-to-justify variables in our studies. While it seems sensible to gather as much information as possible, there’s such a thing as “too much” where it starts to dilute the goal and clarity of the project.  

So, I’ll take off my baseball cap and put on my researcher’s hat for this: as you develop your questionnaire, evaluate whether a metric is a “nice to have” or a “need to have.” Here are some things to keep in mind as you evaluate your metrics:

  1. Determine the overall business objective: What is the business question I am looking to answer based on this research? Keep reminding yourself of this objective.
  2. Identify the hypothesis (or hypotheses) that make up the objective: What are the preconceived notions that will lead to an informed business decision?
  3. Establish the pieces of information to prove or disprove the hypothesis: What data do I need to verify the assumption, or invalidate it?
  4. Assess if your metrics align to the information necessary to prove or disprove one or more of your identified hypotheses.

If your metric doesn’t have a home (plate) in one of the hypotheses, then discard it or turn it into one that does. Following this list can make the difference between accumulating a lot of data that produces no actionable results and collecting data that meets your initial business goal.

Combing through unnecessary data points is cumbersome and costly, so be judicious with your red pen in striking out useless questions. Don’t get bogged down with information if it isn’t directly helping achieve your business goal. Here at CMB, we partner with clients to minimize this effect and help meet study objectives starting well before the data collection stage.

Youme Yai is a Project Manager at CMB who believes a summer evening at the ballpark is second to none.


Topics: advanced analytics, predictive analytics, data collection

Sugar Overload: Dashboards that Yield Insights Not Headaches

Posted by Blair Bailey

Thu, Jun 29, 2017


Back in the old days (2002?), if you wanted a frozen treat, you ordered from the nice person at the TCBY counter, paid your money, and went on your way. Then Red Mango came to town and it was a game-changer. Now, instead of someone else building my treat, I had total control: if I wanted to mix mango and coffee and throw some gummy bears on top, I could. (I didn’t, though; I’m not a monster.)

Of course, there was a downside—sometimes I’d walk away with a $15 froyo. Sometimes, there is such a thing as “too much of a good thing”. As a data manager, knee-deep in interactive data viz, I know this applies to dashboards as well as dessert. 

When starting a dashboard from scratch, there’s the same potential to go overboard, but for different reasons. Like flavors and toppings, there are many design and build directions I could take. Will the dashboard be one centralized page or spread across multiple pages? What types of charts and tables should I use? Which cuts should be columns and which should be filters?

The popular platform, Tableau, has so many options that it can often feel overwhelming. And aside from design, Tableau lets users deep dive into data like never before. With so many build options and data mining capabilities at our fingertips, what’s a designer to do?

Forget the gumdrops and jalapeño flavored yogurt—I encourage our clients to go back to basics and ask:

Who is the dashboard for? The content and design of a well-made dashboard should depend on its purpose and end user. The dashboards I create in my spare time (yes, it’s also a hobby!) are very different than the ones I build for clients. For example, a deep-in-the-weeds analyst will need (and appreciate) very different functionality and design than a C-suite user would. An analyst interested in deep dives may need multiple filters and complex tables to cut the data every which way and investigate multiple scenarios, whereas a C-suite user needs a dashboard that answers their questions quickly and directly so they can move forward with business decisions.

It may be tempting to add flashy charts and lots of filters, but is it necessary? Will adding features help answer key business questions and empower the end user, or will it overwhelm and confuse them?

Here's a snippet from a dashboard that an executive could glean a good amount of insight from without feeling overwhelmed:

[Figure: AffinID sample dashboard]

What will they use it for? Depending on what business questions the client is trying to answer, the design around specific types of dashboards may vary. For example, a brand health tracker dashboard could be a simple set of trending line charts and callouts for KPIs. But it’s rare that we only want to monitor brand health. Maybe the client is also interested in reaching a particular audience. So as the designer, I'll consider building the audiences in as a filter. Perhaps they want to expand into a new market. Divide your line charts by region and track performance across markets. Or maybe they need to track several measures over time across multiple brands, so rather than clog up the dashboard with lots of charts or tabs, you could use parameters to allow the user to toggle the main metric shown.

When in doubt, ask. When I plan to build and ultimately publish a dashboard to Tableau Public, I consider what elements will keep the user engaged and interested. If I’m not sure of the answers, I politely ask my friends, family, or co-workers to test out my dashboards and provide honest feedback. If my dashboard is confusing, boring, too simple, too convoluted, awesome, or just lame, I want to know. The same goes for client-facing dashboards.

As a data manager, my goal is to create engaging, useful data visualizations. But without considering who my end user is and what their goal is, this is nearly impossible. Tableau can build Pareto charts, heat maps, and filters, but if the result doesn’t help answer key business questions in an intuitive and useful way, then what’s the point of having the data viz?

Just because you can mix mango and coffee together (and even add those gummy bears on top), doesn’t mean you should. Like TCBY and Red Mango with their flavors and toppings, Tableau offers infinite data viz possibilities—the key is to use the right ingredients so you aren’t left with a stomachache (or a headache).

Blair Bailey is a Data Manager at CMB with a focus on building engaging dashboards to inform key business decisions and empower stakeholders. Her personal dashboards? Less so.

Topics: data visualization, advanced analytics, integrated data