Sig Testing Social Media Data is a Slippery Slope
During a recent social media webinar, the question was raised: “How do we convince clients that social media is statistically significant?” After an involuntary groan, this question brought two things to mind:
Apparently, there’s much debate in online research forums about whether significance testing should be applied to social media data. Proponents argue that online panels are convenience samples and significance testing is routinely applied to those research results, so why not social media? Admittedly that is true, but with an online panel the researcher can define the sample population and works with a structured data set, which should provide some test/retest reliability of the results. Social media data offers neither, so it’s not a fair comparison.
I’m all for creative analysis and see potential value in applying sig testing to any data set as a way to wade through a lot of numbers and find meaningful patterns. But the analyst should understand that with very large data sets even trivially small differences test as significant, so it might not be a useful exercise for social media. Even if it can be applied, I would use it as a behind-the-scenes tool and not something to report on.
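The point about big data sets can be illustrated with a rough sketch. It assumes a standard pooled two-proportion z-test and uses made-up numbers: the share of positive mentions moving from 20.0% to 20.5% between two months. The function name and the sample sizes are hypothetical, chosen only for illustration.

```python
import math

def two_proportion_p_value(p1, p2, n1, n2):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical shares of positive mentions in two months: 20.0% vs 20.5%.
share_a, share_b = 0.200, 0.205

# Same half-point difference, tested at three (made-up) mention volumes.
p_values = {
    n: two_proportion_p_value(share_a, share_b, n, n)
    for n in (1_000, 100_000, 1_000_000)
}

for n, p in p_values.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"mentions per month = {n:>9,}: p = {p:.4g} ({verdict} at 0.05)")
```

With 1,000 mentions per month the half-point difference is nowhere near significant; with 100,000 or more it is, even though the difference itself is no more meaningful than before.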
Anyone who has worked with social media data understands the challenging, ongoing process of disambiguation (removing irrelevant chatter). There are numerous uncontrollable external factors, including the ever-changing set of sites the chatter is being pulled from. Some are new sites where chatter is occurring, but others are simply sites newly added to the listening tool’s database. Given the nature of social media data, how can statistical comparisons over time be valid? Social media analysis is a messy business. Think of it as a valuable source of qualitative information.
There is value in tracking social media chatter over time to monitor for potential red flags. Keep in mind that there is a lot of noise in social media data and, more often than not, an increase in chatter does not require any action.
Applying sig testing to social media data is a slippery slope. It implies a precision that is not there and puts focus on “significant” changes instead of meaning. Social media analysis is already challenging – why needlessly complicate things?
Cathy is CMB’s social media research maven dedicated to an “eyes wide open” approach to social media research and its practical application and integration with other data sources. Follow her on Twitter at @VirtualMR