The posts here represent the opinions of CMB employees and guests—not necessarily the company as a whole.

Relatability, Desirability and Finding the Perfect Match

Posted by Dr. Jay Weiner

Tue, Feb 13, 2018


Dear Dr. Jay:

It’s Valentine’s Day. How do I find the one (or ones) that are right for me?

-Allison N.


Dear Allison,

In our pursuit of love, we’re often reminded to keep an open mind and that looks aren’t everything.

This axiom also applies to lovestruck marketers looking for the perfect customer. Often, we focus on consumer demographics, but let’s see what happens when we dig below the surface.

For example, let’s consider two men who:

  • Were born in 1948
  • Grew up in England
  • Are on their second marriage
  • Have two children
  • Are successful in business
  • Are wealthy
  • Live in a castle
  • Winter in the Alps
  • Like dogs

On paper, these men sound like they’d have very similar tastes in products and services–they share the same age and nationality and have common interests. But when you learn who these men are, you might think differently.

The men I profiled are the Prince of Darkness, Ozzy Osbourne, and Charles, Prince of Wales. While both men sport regal titles and an affinity for canines, they are very different individuals.

Now let’s consider two restaurants. Based on proprietary self-funded research, we discovered that both restaurants’ typical customers are considered Sporty, Athletic, Confident, Self-assured, Social, Outgoing, Funny, Entertaining, Relaxed, Easy-going, Fun-loving, and Joyful. Their top interests include entertainment (e.g., movies, TV) and dining out. Demographically, their customers are predominantly single, middle-aged men.

One is Buffalo Wild Wings, the other, Hooters. Both seem to appeal to the same group of consumers and would potentially be good candidates for cross-promotions—maybe even an acquisition.

What could we have done to help distinguish between them? Perhaps a more robust attitudinal battery of items or interests would have helped. 

Or, we could look through a social identity lens.

We found that in addition to assessing customer clarity, measuring relatability and desirability can help differentiate brands:

  • Relatability: How much do you have in common with the kind of person who typically uses Brand X?
  • Social Desirability: How interested would you be in making friends with the kind of person who typically uses Brand X?

When we looked at the scores on these two dimensions, we saw that Buffalo Wild Wings scores higher than Hooters on both.

In other words, while the typical Buffalo Wild Wings customer is demographically similar to a typical Hooters customer, the typical Hooters customer is less relatable and less socially desirable. This isn’t necessarily bad news for Hooters–it simply means the brand has a more targeted, niche appeal than Buffalo Wild Wings.

The main point is that it helps to look beyond demographics and understand identity—who finds you relatable and desirable. As we see in the Buffalo Wild Wings and Hooters example, digging deeper into the dimensions of social identity can uncover more nuanced niches within a target audience—potentially uncovering your “perfect match”. 

Topics: Dear Dr. Jay, Identity, consumer psychology

'Twas the night before field close...

Posted by Dr. Jay Weiner

Wed, Dec 20, 2017

 


 

'Twas the night before field close and all through the internet,

not a respondent was responding, not even a female head of house.

The questionnaires were posted on the net

with care in hopes that data soon would be there. 

 

The clients were nestled all snug in their beds,

while visions of PowerPoint decks danced in their heads. 

And the project team had just submitted the model request form.

 

When out in the office there arose such a clatter,

I sprang from my cube to see what was the matter. 

Away to the PC, I flew like a flash. 

Tore open the zip file

and threw up the SPSS.

 

The mean PI of the new concept gave a glimmer of hope

to the client service folks below. 

When, what to my wondering eyes should appear,

but a little multicollinearity and 8 respondents with missing data.

 

With the drivers unclear,

I knew in a moment it must be a Treenet. 

More rapid than correlations the importance scores came. 

And he whistled and shouted and called them by name...

 

Functional benefits, EMPACT and AffinID. 

To the top of the stack, to the top of the pile,

now for importance we understand all.

 

As bivariate plots against dependent variables show

when they meet with a knot, they bend on the fly. 

Non-linear obstacles we decry,

the relationship is clear, and now we know why. 

 

With the click of a SEND, the report deck went out. 

The clients were happy, the results made sense.

But I heard them exclaim ere they stopped for the night,

“That’s just wave 1; wave 2 is coming, so just sit tight.”

 

Happy Holidays from all of us at CMB.

Dear Dr. Jay: How To Predict Customer Turnover When Transactions are Anonymous

Posted by Dr. Jay Weiner

Wed, Apr 26, 2017

Dear Dr. Jay:

What's the best way to estimate customer turnover for a service business whose customer transactions are usually anonymous?

-Ian S.


Dear Ian,

You have posed an interesting question. My first response was, “you can’t.” But as I think about it some more, you might already have some data in-house that could be helpful in addressing the issue.

It appears you are in the mass transit industry. Most transit companies offer single-ride fares and monthly passes, while companies in college towns often offer semester-long passes. Since the passes (monthly, semester, etc.) are often sold at a discounted rate, we might conclude that all the single-fare revenues are turnover transactions.

This assumption is a small leap of faith as I’m sure some folks just pay the single fare price and ride regularly. Let’s consider my boss. He travels a fair amount and even with the discounted monthly pass, it’s often cheaper for him to pay the single ride fare. Me, I like the convenience of not having to make sure I have the correct fare in my pocket so I just pay the monthly rate, even if I don’t use it every day. We both might be candidates for weekly pass sales if we planned for those weeks when we know we’d be commuting every day versus working from home or traveling. I suspect the only way to get at that dimension would be to conduct some primary research to determine the frequency of ridership and how folks pay.

For your student passes, you probably have enough historic data in-house to compare your average semester pass sales to the population of students using them and can figure out if you see turnover in those sales. That leaves you needing to estimate the turnover on your monthly pass sales.

You also may have corporate sales that you could look at. For example, here at CMB, employees can purchase their monthly transit passes through our human resources department. Each month, our cards are automatically updated so that we don’t have to worry about renewing them every few weeks. I suspect if we analyzed the monthly sales from our transit system (the MBTA) to CMB, we could determine the turnover rate.

As you can see, you could already have valuable data in-house that can help shed light on customer turnover. I’m happy to look at any information you have and let you know what options you might have in trying to answer your question.

Dr. Jay is CMB’s Chief Methodologist and VP of Advanced Analytics and holds a Zone 3 monthly pass to the MBTA. If it weren’t for the engineer, he wouldn’t make it to South Station every morning.

Keep those questions coming! Ask Dr. Jay directly at DearDrJay@cmbinfo.com or submit your question anonymously by clicking below:

Ask Dr. Jay!

Topics: advanced analytics, data collection, Dear Dr. Jay

Dear Dr. Jay: How Can We Trust Predictive Models After the 2016 Election?

Posted by Dr. Jay Weiner

Thu, Jan 12, 2017

Dear Dr. Jay,

After the 2016 election, how will I ever be able to trust predictive models again?

Alyssa


Dear Alyssa,

Data Happens!

Whether we’re talking about political polling or market research, to build good models, we need good inputs. Or as the old saying goes: “garbage in, garbage out.” Let’s look at all the sources of error in the data itself:

  • First, we make it too easy for respondents to say “yes” or “no,” and they try to help us by guessing what answer we want to hear. For example, when we ask purchase intent for a new product idea, respondents often overstate their true likelihood of buying the product.
  • Second, we give respondents perfect information. We create 100% awareness when we show the respondent a new product concept.  In reality, we know we will never achieve 100% awareness in the market.  There are some folks who live under a rock and of course, the client will never really spend enough money on advertising to even get close.
  • Third, the sample frame may not be truly representative of the population we hope to project to. This is one of the key issues in political polling because the population is composed of those who actually voted (not registered voters). For models to be correct, we need to predict which voters will actually show up to the polls and how they will vote. The good news in market research is that the population is usually not a moving target.

Now, let’s consider the sources of error in building predictive models. The first step in building a predictive model is to specify the model. If you’re a purist, you begin with a hypothesis, collect the data, test the hypothesis, and draw conclusions. If we fail to reject the null hypothesis, we should formulate a new hypothesis and collect new data. What do we actually do? We mine the data until we get significant results. Why? Because data collection is expensive. One possible outcome of continuing to mine the data for a better model is a model that is only good at predicting the data you have and not very accurate at predicting results from new inputs.
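That overfitting risk is easy to demonstrate with purely synthetic data. In the sketch below (all numbers invented for illustration), none of the 40 "predictors" has any real relationship to the outcome, yet a model with that many free parameters fits the sample it was mined from very well while failing on fresh data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Purely random data: none of the 40 "predictors" truly drives y.
n_train, n_test, p = 50, 50, 40
X_train, X_test = rng.normal(size=(n_train, p)), rng.normal(size=(n_test, p))
y_train, y_test = rng.normal(size=n_train), rng.normal(size=n_test)

def r_squared(X, y, beta):
    # Share of variance in y explained by the linear predictor X @ beta
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# "Mine" the data: fit all 40 coefficients at once.
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

print(f"train R^2: {r_squared(X_train, y_train, beta):.2f}")  # looks impressive
print(f"test  R^2: {r_squared(X_test, y_test, beta):.2f}")    # collapses on new inputs
```

The in-sample fit is an artifact of having nearly as many parameters as observations; the out-of-sample fit is what a client would actually experience.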

It is up to the analyst to decide what is statistically meaningful versus what is managerially meaningful.  There are a number of websites where you can find “interesting” relationships in data.  Some examples of spurious correlations include:

  • Divorce rate in Maine and the per capita consumption of margarine
  • Number of people who die by becoming entangled in their bedsheets and the total revenue of US ski resorts
  • Per capita consumption of mozzarella cheese (US) and the number of civil engineering doctorates awarded (US)
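The mechanism behind these spurious correlations is easy to reproduce: any two series that share a time trend correlate strongly even though neither drives the other. A sketch with made-up numbers loosely mimicking the margarine example (both series simply decline over the years):

```python
import numpy as np

rng = np.random.default_rng(42)
years = np.arange(2000, 2020)

# Two causally unrelated series that both happen to trend downward over time
margarine = 8.0 - 0.2 * (years - 2000) + rng.normal(0, 0.1, years.size)
divorce_rate = 5.0 - 0.1 * (years - 2000) + rng.normal(0, 0.05, years.size)

r = np.corrcoef(margarine, divorce_rate)[0, 1]
print(f"correlation: {r:.2f}")  # close to +1, yet neither causes the other
```

The shared trend, not any causal link, produces the near-perfect correlation, which is exactly why statistical significance alone can't tell you a model is managerially meaningful.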

In short, you can build a model that’s accurate but still wouldn’t be of any use (or make any sense) to your client. And the fact is, there’s always a certain amount of error in any model we build—we could be wrong, just by chance.  Ultimately, it’s up to the analyst to understand not only the tools and inputs they’re using but the business (or political) context.

Dr. Jay loves designing really big, complex choice models.  With over 20 years of DCM experience, he’s never met a design challenge he couldn’t solve. 

PS – Have you registered for our webinar yet? Join Dr. Erica Carranza as she explains why, to change what consumers think of your brand, you must change their image of the people who use it.

What: The Key to Consumer-Centricity: Your Brand User Image

When: February 1, 2017 @ 1PM EST

Register Now!

 

 

Topics: methodology, data collection, Dear Dr. Jay, predictive analytics

Dear Dr. Jay: Weighting Data?

Posted by Dr. Jay Weiner

Wed, Nov 16, 2016

Dear Dr. Jay:

How do I know if my weighting matrix is good? 

Dan


Dear Dan,

I’m excited you asked me this because it’s one of my favorite questions of all time.

First, we need to talk about why we weight data in the first place. We weight data because our ending sample is not truly representative of the general population. This misrepresentation can occur because of non-response bias, poor sample sourcing, or even bad sample design. In my opinion, if you go into a research study knowing that you’ll end up weighting the data, there may be a better way to plan your sample frame.

Case in point, many researchers intentionally over-quota certain segments and plan to weight these groups down in the final sample.  We do this because the incidence of some of these groups in the general population is small enough that if we rely on natural fallout we would not get a readable base without a very large sample.  Why wouldn’t you just pull a rep sample and then augment these subgroups?  The weight needed to add these augments into the rep sample is 0. 

Arguments for including these augments with a very small weight include the treatment of outliers. For example, if we were conducting a study of investors and wanted to include folks with more than $1,000,000 in assets, we might want to obtain insights from at least 100 of them. In a rep sample of 500, we might only have 25, which means I need to augment this group by 75 respondents. If somehow I manage to get Warren Buffett among my 25, he might skew the results of the sample. Weighting the full sample of 100 wealthier investors down to 25 will reduce the impact of any outlier.

A recent post by Nate Cohn in the New York Times suggested that weighting was significantly impacting analysts’ ability to predict the outcome of the 2016 presidential election.  In the article, Mr. Cohn points out, “there is a 19-year-old black man in Illinois who has no idea of the role he is playing in this election.”  This man carried a sample weight of 30.  In a sample of 3000 respondents, he now accounts for 1% of the popular vote.  In a close race, that might just be enough to tip the scale one way or the other.  Clearly, he showed up on November 8th and cast the deciding ballot.

This real-life example suggests that we might want to consider “capping” extreme weights to mitigate the potential for very small groups to influence overall results. But bear in mind that when we do this, our final sample profiles won’t be nationally representative, because capping the weight understates the size of the segment being capped. It’s a trade-off between a truly balanced sample and making sure the survey results aren’t swayed by a handful of heavily weighted respondents.
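To make the capping trade-off concrete, here is a minimal sketch on synthetic data. The weight of 30 mirrors the respondent in the NYT example; the cap of 5 and the sample of 1,000 are assumptions chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
weights = np.ones(1000)
votes = rng.choice([0, 1], size=1000)   # 0/1 = candidate A/B, roughly 50/50
weights[0] = 30.0                        # one under-represented respondent weighted up
votes[0] = 1                             # ...and he happens to favor candidate B

def weighted_share(votes, weights):
    # Weighted proportion voting for candidate B
    return (votes * weights).sum() / weights.sum()

print(f"uncapped: {weighted_share(votes, weights):.3f}")

# Cap extreme weights: the estimate shifts back toward the raw data,
# at the cost of under-representing the capped segment.
capped = np.minimum(weights, 5.0)
print(f"capped:   {weighted_share(votes, capped):.3f}")
```

One heavily weighted respondent moves the uncapped estimate; capping pulls it back, which is exactly the bias-versus-representativeness trade described above.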

Dr. Jay loves designing really big, complex choice models.  With over 20 years of DCM experience, he’s never met a design challenge he couldn’t solve. 

Keep the market research questions comin'! Ask Dr. Jay directly at DearDrJay@cmbinfo.com or submit yours anonymously by clicking below:

 Ask Dr. Jay!

Topics: methodology, Dear Dr. Jay

Dear Dr. Jay: When To Stat Test?

Posted by Dr. Jay Weiner

Wed, Oct 26, 2016

Dear Dr. Jay,

The debate over how and when to test for statistical significance comes up nearly every engagement. Why wouldn’t we just test everything?

-M.O. in Chicago


Hi M.O.,

You’re not alone. Many clients want all sorts of things stat tested. Some things can be tested while others can’t. But for what can be tested, as market researchers we need to be mindful of two potential errors in hypothesis testing. A type I error occurs when we reject a true null hypothesis. For example, we conclude that Coke tastes better than Pepsi when, in fact, there is no real difference.

A type II error occurs when we fail to reject the null hypothesis when in fact it is false: we conclude a part is safe to install, and then the plane crashes. We choose the probability of committing a type I error when we choose alpha (say, .05). The probability of a type II error is a function of statistical power. We seldom take this side of the equation into account, for good reason: most decisions we make in market research don’t come with a huge price tag if we’re wrong. Hardly anyone dies if the results of the study are wrong. The goal in any research is to minimize both types of errors, and the best way to do that is to use a larger sample.

This conundrum perfectly illustrates my “Life is a conjoint” mantra: we’re always trading off the accuracy of the results against the cost of executing a study with a larger sample. Further, we also tend to violate the true nature of hypothesis testing. More often than not, we don’t formally state a hypothesis; rather, we statistically test everything and then report the statistical differences.

Consider this: when we compare two scores, say, concept acceptance between males and females, we accept that we might flag a statistical difference 5% of the time simply by chance (α = .05).

In fact, that’s not really what we do: we perform hundreds of tests in almost every study. Let’s say we have five segments and we want to test them for differences in concept acceptance. That’s 10 t-tests, and now we have roughly a 40% chance of flagging a difference simply due to chance, in every row of our tables. The better approach would be to run an analysis of variance on the table to determine whether any cell might be different, then build a hypothesis and test it. But we don’t do this because it takes too much time. I realize I’m not going to change the way our industry does things (I’ve been trying for years), but maybe, just maybe, you’ll pause for a moment when looking at your tables to decide whether a “statistical” significance is really worth reporting: are the results valid, and are they useful?
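The inflation comes straight from the binomial: with m fully independent tests at α = .05, the chance of at least one false positive is 1 − (1 − α)^m (real t-tests across overlapping segments are correlated, so the exact figure differs, but the direction is the same):

```python
# Probability of at least one false positive across m independent tests at alpha = .05
alpha = 0.05

for m in (1, 10, 100):
    familywise = 1 - (1 - alpha) ** m
    print(f"{m:>3} tests -> {familywise:.0%} chance of a spurious 'significant' result")
```

Ten tests push the familywise error to about 40%, and a hundred make a spurious "finding" all but certain, which is why an omnibus test before pairwise comparisons is the safer habit.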

Dr. Jay loves designing really big, complex choice models.  With over 20 years of DCM experience, he’s never met a design challenge he couldn’t solve. 

Got a burning research question? You can send your questions to DearDrJay@cmbinfo.com or submit anonymously here:

Ask Dr. Jay!

 

 

Topics: advanced analytics, Dear Dr. Jay

Dear Dr. Jay: Driver Modeling

Posted by Dr. Jay Weiner

Thu, Jun 23, 2016

Dear Dr. Jay,

We want to assess the importance of fixing some of our customer touchpoints. What would you recommend as a modeling tool?

 -Alicia


Hi Alicia,

There are a variety of tools we use to determine the relative importance of key variables on an outcome (dependent variable). Here’s the first question we need to address: are we trying to predict the actual value of the dependent variable or just assess the importance of any given independent variable in the equation? Most of the time, the goal is the latter.

Once we know the primary objective, there are three key criteria we need to address. The first is the amount of multicollinearity in our data. The more independent variables we have, the bigger problem this presents. The second is the stability in the model over time. In tracking studies, we want to believe that the differences between waves are due to actual differences in the market and not artifacts of the algorithm used to compute the importance scores. Finally, we need to understand the impact of sample size on the models.

How big a sample do you need? Typically, in consumer research, we see results stabilize with n=200. Some tools will do a better job with smaller samples than others. You should also consider the number of parameters you are trying to model. A grad school rule of thumb is that you need 4 observations for each parameter in the model, so if you have 25 independent variables, you’d need at least 100 respondents in your sample.

There are several tools to consider for estimating relative importance: bivariate correlations, OLS, Shapley Value Regression (or Kruskal’s Relative Importance), TreeNet, and Bayesian Networks. All of these tools will let you understand the relative importance of the independent variables in predicting your key measure. One thing to note is that none of these tools specifically models causation; you would need some sort of experimental design to address that issue. Let’s break down the advantages and disadvantages of each.

Bivariate Correlations (measures the strength of the relationship between two variables)

  • Advantages: Works with small samples. Relatively stable wave to wave. Easy to execute. Unaffected by multicollinearity, since each attribute is assessed on its own.
  • Disadvantages: Only estimates the impact of one attribute at a time. Ignores any possible interactions. Doesn’t provide an “importance” score, but a “strength of relationship” value. Assumes a linear relationship among the attributes.

Ordinary Least Squares regression (OLS) (method for estimating the unknown parameters in a linear regression model)

  • Advantages: Easy to execute. Provides an equation to predict the change in the dependent variable based on changes in the independent variable (predictive analytics).
  • Disadvantages: Highly susceptible to multicollinearity, causing changes in key drivers in tracking studies. If the goal is a predictive model, this isn’t a serious problem. If your goal is to prioritize areas of improvement, this is a challenge. Assumes a linear relationship among the attributes. 

Shapley Value Regression or Kruskal’s Relative Importance

These are a couple of approaches that consider all possible combinations of explanatory variables. Unlike traditional regression tools, these techniques are not used for forecasting. In OLS, we predict the change in overall satisfaction for any given change in the independent variables. These tools are used to determine how much better the model is if we include any specific independent variable versus models that do not include that measure. The conclusions we draw from these models refer to the usefulness of including any measure in the model and not its specific impact on improving measures like overall satisfaction. 

  • Advantages: Works with smaller samples. Does a better job of dealing with multicollinearity. Very stable in predicting the impact of attributes between waves.
  • Disadvantages: Ignores interactions. Assumes a linear relationship among the attributes.

TreeNet (a tree-based data mining tool)

  • Advantages: Does a better job of dealing with multicollinearity than most linear models. Very stable in predicting the impact of attributes between waves. Can identify interactions. Does not assume a linear relationship among the attributes.
  • Disadvantages: Requires a larger sample size—usually n=200 or more. 

Bayesian Networks (a graphical representation of the joint probabilities among key measures)

  • Advantages: Does a better job of dealing with multicollinearity than most linear models. Very stable in predicting the impact of attributes between waves. Can identify interactions. Does not assume a linear relationship among the attributes. Works with smaller samples. While a typical Bayes Net does not provide a system of equations, it is possible to simulate changes in the dependent variable based on changes to the independent variables.
  • Disadvantages: Can be more time-consuming and difficult to execute than the others listed here.
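As an illustration of the Shapley approach described above, here is a minimal sketch that averages each predictor's marginal contribution to R² over all subsets of the other predictors. The data are synthetic, and the brute-force enumeration is only feasible for a handful of predictors:

```python
import numpy as np
from itertools import combinations
from math import factorial

def r2(X, y):
    # R-squared of an OLS fit with intercept
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def shapley_importance(X, y):
    p = X.shape[1]
    scores = np.zeros(p)
    for j in range(p):
        others = [k for k in range(p) if k != j]
        for size in range(p):
            for subset in combinations(others, size):
                # Marginal gain in R^2 from adding predictor j to this subset
                with_j = r2(X[:, list(subset) + [j]], y)
                without = r2(X[:, list(subset)], y) if subset else 0.0
                w = factorial(size) * factorial(p - size - 1) / factorial(p)
                scores[j] += w * (with_j - without)
    return scores  # scores sum to the full-model R^2

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=300)  # X[:,2] is pure noise

imp = shapley_importance(X, y)
print(np.round(imp, 3), "sum =", round(imp.sum(), 3))
```

Because each predictor's score is averaged over every possible model, correlated predictors share credit instead of one of them absorbing it all, which is why these importances are so much more stable wave to wave than raw OLS coefficients.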

Got a burning research question? You can send your questions to DearDrJay@cmbinfo.com or submit anonymously here.

Dr. Jay Weiner is CMB’s senior methodologist and VP of Advanced Analytics. Jay earned his Ph.D. in Marketing/Research from the University of Texas at Arlington and regularly publishes and presents on topics, including conjoint, choice, and pricing.

Topics: advanced analytics, Dear Dr. Jay

Dear Dr. Jay: Discrete Choice—How Many Is Too Many Features?

Posted by Dr. Jay Weiner

Wed, Mar 23, 2016

Dear Dr. Jay,

I’m interested in testing a large number of features for inclusion in the next version of my product. My team is suggesting that we need to cull the list down to a smaller set of items to run a choice model. Are there ways to test a large set of attributes in a choice model?

-Nick


Hi Nick –

There are a number of ways to test a large set of attributes in choice modeling. Most of the time, when we test a large number of features, many are simply binary attributes (included/not included). While this makes the experimental design larger, it’s not quite as bad as having ten six-level attributes. If the description is short enough, you might go ahead and just include all of them. If you’re concerned about how much reading a respondent will need to do—or you really wouldn’t offer a respondent 12 additional perks for choosing your credit card—you could put a cap on the number of additional features any specific offer includes. For example, you could test 15 new features in a single model, but respondents would only get up to 5 at any single time. This is actually better than using a partial profile design as all respondents would see all offers. 

Another option is to do some sort of bridging study where you test all of the features using a max diff task. You can include a subset of the factors in a DCM and then use the max diff utilities to compute the utility for the full list of features in the DCM. This allows you to include the full set of features in your simulation tool.

Dr. Jay loves designing really big, complex choice models.  With over 20 years of DCM experience, he’s never met a design challenge he couldn’t solve. 

Topics: advanced analytics, product development, Dear Dr. Jay

Dear Dr. Jay—Brands Ask: Let's Stay Together?

Posted by Dr. Jay Weiner

Thu, Feb 11, 2016

 Dear Dr. Jay,

 What’s love got to do with it?

 -Tina T. 


Hi Tina,

How timely.

The path to brand loyalty is often like the path to wedded bliss. You begin by evaluating tangible attributes to determine if the brand is the best fit for you. After repeated purchase occasions, you form an emotional bond to the brand that goes beyond those tangible attributes. As researchers, when we ask folks why they purchase a brand, they often reflect on performance attributes and mention those as drivers of purchase. But, to really understand the emotional bond, we need to ask how you feel when you interact with the brand.

We recently developed a way to measure this emotional bond: the Net Positive Emotion Score (NPES). By asking folks how they felt during their most recent interaction, we’re able to determine respondents’ emotional bond with products. Typical regression tools indicate that the emotional attributes are about as predictive of future behavior as the functional benefits of the product. This leads us to believe that at some point in your pattern of consumption, you become bonded to the product and begin to act on emotion rather than rational thought. Of course, that doesn’t mean you can’t rate the performance dimensions of the products you buy.

Loyalty is a behavior, and behaviors are often driven by underlying attitudinal measures. You might continue to purchase the same product over and over for a variety of reasons. In a perfect world, you not only create a behavioral commitment, but also an emotional bond with the brand and, ultimately, the company. Typically, we measure this path by looking at the various stages you go through when purchasing products. This path begins with awareness, evolves through familiarity and consideration, and ultimately ends with purchase. Once you’ve purchased a product, you begin to evaluate how well it delivers on the brand promise. At some point, the hope is that you become an advocate for the brand since advocacy is the pinnacle of the brand purchase hierarchy. 

As part of our Consumer Pulse program, we used our EMPACT℠: Emotional Impact Analysis tool to measure consumers’ emotional bond (NPES) with 30 brands across 6 categories. How well does this measure relate to other key metrics? On average, Promoters score almost 70 points higher on the NPES scale than Detractors. We see similar increases in likelihood to continue (or try), proud to use, willingness to pay more, and “I love this brand.”
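The post doesn't spell out the NPES formula, but the name suggests a net score in the spirit of NPS. The sketch below is purely hypothetical (the response coding and cut of the data are assumptions, not CMB's actual EMPACT methodology) and only illustrates the net-score idea:

```python
# Hypothetical sketch of a "net" emotion score, by analogy with NPS.
# This is NOT CMB's actual NPES/EMPACT formula -- just the net-score concept.
responses = ["positive", "positive", "neutral", "negative", "positive", "neutral"]

positive = sum(r == "positive" for r in responses) / len(responses)
negative = sum(r == "negative" for r in responses) / len(responses)
npes = 100 * (positive - negative)
print(f"net emotion score: {npes:.0f}")  # 3/6 positive, 1/6 negative -> 33
```

A net score like this compresses the emotional read on a brand into a single number that can be compared across brands and tracked against metrics like NPS.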


What does this mean? It means that measuring the emotional bond your customers have with your brand can provide key insights into the strength of that brand. Not only do you need to win on the performance attributes, but you also need to forge a deep bond with your buyers. That is a better way to brand loyalty, and it should positively influence your bottom line. You have to win their hearts—not just their minds.

Dr. Jay Weiner is CMB’s senior methodologist and VP of Advanced Analytics. He has a strong emotional bond with his wife of 25 years and several furry critters who let him sleep in their bed.

Learn More About EMPACT℠

Topics: NPS, path to purchase, Dear Dr. Jay, EMPACT, emotional measurement, brand health and positioning

Dear Dr. Jay: Can One Metric Rule Them All?

Posted by Dr. Jay Weiner

Wed, Dec 16, 2015

Hi Dr. Jay –

The city of Boston is trying to develop one key measure to help officials track and report how well the city is doing. We’d like to do that in-house. How would we go about it?

-Olivia


Hi Olivia,

This is the perfect tie-in for big data and the key performance indicator (KPI). Senior management doesn’t really have time to pore over tables of numbers to see how things are going. What they want is a nice barometer that summarizes overall performance. So, how might one take data from each business unit and aggregate it into a composite score?

We begin the process by understanding all the measures we have. Once we have assembled all of the potential inputs to our key measure, we need to develop a weighting system to aggregate them into one measure. This is often the challenge when working with internal data. We need some key business metric to use as the dependent variable, and these data are often missing in the database.

For example, I might have sales by product by customer and maybe even total revenue. Companies often assume that the top-revenue clients are the bread and butter of the company. But what if your number-one account uses way more corporate resources than any other account? If you’re one of the lucky service companies, you probably charge hours to specific accounts and can easily determine the total cost of servicing each client. If you sell a tangible product, that may be more challenging. Instead of sales by product or total revenue, your business decision metric should be the total cost of doing business with the client or the net profit for each client. It’s unlikely that you capture this data, so let’s figure out how to compute it.

Gross profit is easy (net sales – cost of goods sold), but what about other costs like sales calls, customer service calls, and product returns? Look at other internal databases, pull information on how many times your sales reps visited in person or called over the phone, and get an average cost for each of these activities. Then you can subtract those costs from the gross profit number. Okay, that was an easy one.
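The per-client arithmetic above can be sketched in a few lines. All client names, figures, and cost-per-activity averages here are invented for illustration:

```python
# Hypothetical per-client profitability: gross profit minus the cost of
# sales calls, service calls, and product returns (all numbers invented).
clients = {
    "Acme":   {"revenue": 500_000, "cogs": 300_000, "sales_calls": 40, "service_calls": 120, "returns": 10},
    "Globex": {"revenue": 200_000, "cogs": 110_000, "sales_calls": 5, "service_calls": 8, "returns": 1},
}
COST_PER_SALES_CALL, COST_PER_SERVICE_CALL, COST_PER_RETURN = 400, 60, 250  # assumed averages

for name, c in clients.items():
    gross = c["revenue"] - c["cogs"]
    servicing = (c["sales_calls"] * COST_PER_SALES_CALL
                 + c["service_calls"] * COST_PER_SERVICE_CALL
                 + c["returns"] * COST_PER_RETURN)
    print(f"{name}: gross {gross:,}, net {gross - servicing:,}")
```

Ranking clients on the net figure rather than raw revenue is exactly the point: a high-revenue account that consumes a lot of servicing can end up less profitable than a smaller, low-touch one.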

Let’s look at the city of Boston case for a more challenging exercise. What types of information is the city using? According to the article you referenced, the city hopes to “corral their data on issues like crime, housing for veterans and Wi-Fi availability and turn them into a single numerical score intended to reflect the city’s overall performance.” So, how do you do that? Consider that some of these things have both income and expense implications. For example, as crime rates go up, the attractiveness of the city drops and it loses residents (income and property tax revenues drop). Adding to the lost revenue, the city has the added cost of providing public safety services. If you add up the net gains/losses from each measure, you would have a possible weighting matrix to aggregate all of the measures into a single score. This allows the mayor to quickly assess changes in how well the city is doing on an ongoing basis. The weights can be used by the resource planners to assess where future investments will offer the greatest payback.
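The city example can be sketched the same way: weight each measure's change from a baseline by its estimated net fiscal impact, then sum into one number. All measures, baselines, and dollar weights below are hypothetical:

```python
# Hypothetical composite city score: each measure's change from baseline,
# weighted by its estimated net fiscal impact per unit, summed into one number.
measures = {               # (current value, baseline, $ impact per unit change)
    "crime_rate":        (4.8, 5.0, -2_000_000),   # fewer crimes -> positive impact
    "veteran_housing":   (0.92, 0.90, 5_000_000),
    "wifi_availability": (0.75, 0.70, 1_000_000),
}

score = sum((value - base) * weight for value, base, weight in measures.values())
print(f"composite score: ${score:,.0f}")
```

Expressing every weight in dollars keeps the composite interpretable: a rising score means the mix of changes is, on net, making the city fiscally better off, and the individual weights tell planners where a unit of improvement buys the most.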

Dr. Jay is fascinated by all things data. Your data, our data, he doesn’t care what the source is. The more data, the happier he is.

Topics: advanced analytics, Boston, big data, Dear Dr. Jay