2010 has been busy so far (hence the lack of blogging, which we’ll try to remedy), for the most part with a type of analysis we didn’t do any of in 2009: segmentation.
We’ve been very occupied in recent times with Prediction, as the awareness of and demand for Predictive Analytics have grown. Though it is demonstrably powerful, Prediction isn’t the only game in town, analytically speaking.
Methodologically speaking, standard segmentations are not predictive but they can be described as “Data Mining”. Typically they involve some type of clustering to identify groups (segments) of something (usually people/customers but it doesn’t have to be) which have the most in common with each other. That commonality can be based on behaviours, attitudes, demography, etc.
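To make the clustering idea concrete, here is a minimal sketch using k-means, one common clustering technique. It is an illustration only, not a description of the method used on any of the projects mentioned here. The customer data, the two features (visits per month, average spend) and the `kmeans` helper are all made up for the example, and in real work the features would normally be standardised first.

```python
import random

# Hypothetical customers: (visits per month, average spend). Invented data.
customers = [(1, 10), (2, 12), (1, 11), (20, 95), (22, 100), (21, 98)]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    # Index of the centroid closest to this point.
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))

def kmeans(points, k, max_iterations=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k randomly chosen points
    assignments = None
    for _ in range(max_iterations):
        new_assignments = [nearest(p, centroids) for p in points]
        if new_assignments == assignments:
            break                       # no point changed segment: converged
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:                 # keep the old centroid if a segment is empty
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return assignments, centroids

assignments, centroids = kmeans(customers, k=2)
print(assignments)
```

The segment numbers themselves are arbitrary; what matters is which customers end up grouped together. Profiling and naming the groups is the part that needs the business in the room.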
The several we’ve been working on this year have ranged from a relatively quick, survey-based analysis of a few hundred consumers (of a consumer packaged good I can’t mention) to a behavioural segmentation of millions of visitors to a web site (the name of which I also can’t mention).
This type of analysis very much demonstrates one of the key tenets of CRISP-DM in that it takes a joint effort, particularly between the business, research and analysis, to arrive at a successful outcome.
The larger the segmentation, the more iterative this becomes, and typically more constituents from the business need to be engaged in it. It really is a mixture of hard work (as you work through dozens of profiles and several versions of the segmentation) and fun, as recognisable, and sometimes surprising, segments emerge. When it works (and going in you are less sure that it will than with a typical predictive model), the final segments are really clearly defined and can be used as the touchstone for how companies/organisations communicate with their customers, citizens, donors, etc.
To illustrate the point, we have performed a number of behavioural segmentations of visitors/customers to web sites over the last few years. Typically these have been “Blue Chip” sites with millions of visitors per week. Whenever we perform these on-line segmentations these days we work with Neil Mason’s analytical team at Foviance. Despite the preponderance of Web Analytics tools like Omniture Sitecatalyst, Webtrends, etc., which are very useful for many types of analysis, web site owners rarely have that higher-level strategic view of who their visitors/customers actually are.
Now this is where the trouble starts! To enact the segmentation we typically have to suck months of data out of those web analytics tools, usually at the click level, and then reconstitute it at the visitor level so we can start to segment. Luckily we’ve done this several times now, so we have some re-usable components and a target database schema which is “segmentation ready”. If possible we also like to link attitudinal and demographic data directly to the visits, by surveying enough visitors and tying their responses to specific visits. And we like to add data from customer/registration databases (where they exist) into the pot to give us a rich mix with which we can start to uncover these meaningful segments.
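As a rough illustration of that click-to-visitor step, the sketch below rolls a handful of click records up into one profile per visitor. The field names (`visitor_id`, `timestamp`, `section`), the sample rows and the chosen summary measures are all assumptions made for the example; they are not the export format of any particular web analytics tool, nor our actual schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical click-level export: one row per click. Invented data.
clicks = [
    {"visitor_id": "v1", "timestamp": "2010-03-01T13:05:00", "section": "jobs"},
    {"visitor_id": "v1", "timestamp": "2010-03-01T13:09:00", "section": "lifestyle"},
    {"visitor_id": "v1", "timestamp": "2010-03-02T14:30:00", "section": "jobs"},
    {"visitor_id": "v2", "timestamp": "2010-03-01T09:00:00", "section": "sport"},
]

def visitor_profiles(clicks):
    """Aggregate click rows into one behavioural summary per visitor."""
    raw = defaultdict(lambda: {"clicks": 0, "days": set(),
                               "sections": set(), "afternoon": 0})
    for click in clicks:
        when = datetime.fromisoformat(click["timestamp"])
        row = raw[click["visitor_id"]]
        row["clicks"] += 1
        row["days"].add(when.date())
        row["sections"].add(click["section"])
        if 12 <= when.hour < 18:        # "afternoon" cut-off is an assumption
            row["afternoon"] += 1
    return {
        visitor: {
            "clicks": row["clicks"],
            "active_days": len(row["days"]),
            "sections_seen": len(row["sections"]),
            "afternoon_share": row["afternoon"] / row["clicks"],
        }
        for visitor, row in raw.items()
    }

profiles = visitor_profiles(clicks)
for visitor, profile in profiles.items():
    print(visitor, profile)
```

At real volumes this aggregation would be done in the database rather than in memory, but the shape of the output is the point: one row per visitor, with behavioural measures that survey and registration data can then be joined onto.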
The outcomes and applications of the resulting groups are various. For a newspaper publisher we identified a hitherto unknown and valuable segment of career break mothers (and sometimes fathers) who had vivid visiting habits early in the afternoon. The publisher went on to run an off-line acquisition campaign to recruit more of them. For The Royal Mail we jointly found a segment of visitors linked to eBay who were running virtual cottage industries. Segments like these need to be able to use the RM site in a very different way from those who are just there to find a postcode, for example. Hence this requires some Information Architecture work to tune, and in some cases to completely rebuild, the site.
It isn’t all about on-line segmentation of course – though the recency of that channel means that there is a lot to be done there. We’re currently working with a large UK retailer, a research agency and their marketing agency on a multi-category, multi-brand, multi-channel segmentation. When we start a new one it often feels like we’re at the base of a mountain peering upwards. But we usually get to the top. And the view is usually worth the climb.