Recommendation engines like those employed by Facebook, eBay, and Amazon increase customer and user conversions. They do it by employing algorithms that produce recommendations personalized to the preferences of each individual user. Google Search is so effective in part because Google has access to loads of personal data on its users. It’s also because Google Search deploys algorithms that incorporate this user data to produce search results optimized to the preferences of the individual making the query.
To most of us, this is yesterday’s news. What’s new is that SMEs and large corporations are beginning to use this same technology to fuel their own market growth and brand visibility. Access to super-charged, data-science-driven growth results is no longer limited to industry giants like Google, Facebook, eBay, and Amazon. People like you and me are increasingly able to implement the same tactics to drive growth for our own businesses or the businesses of our clients.
Recommendation engines are a tricky business. But we stand to gain a lot of market traction if we can employ even a few of the statistical methodologies upon which they’re built. Segmentation analysis is one such methodology. It’s actually a large mechanical component of how recommendation engines work. This type of analysis is also a strategy used by growth hackers, who use it to drive the growth of their brands and the brands of their clients. While we data scientists just call it “segmentation analysis”, other, more marketing-minded people out there call this practice “adaptive marketing”.
What is Adaptive Marketing?
“Adaptive marketing” is a means by which growth hackers are able to segment and target users. The main goal is to offer them a personalized brand experience. This personalization of brand experience fuels growth in every layer of the funnel, from user awareness to revenue. Adaptive marketing is built on segmentation analysis. Users and customers can be clustered according to any metric, but clustering according to user personas, behaviors, content engagement patterns, lifecycle stages, purchase histories, or demographics is particularly useful. Being able to group customers in these ways allows us to personalize and optimize marketing tactics. It also allows us to improve website experience, content strategies, product offerings, user benefits, user activation, user retention, and brand messaging.
Photo Credit: Growth Hacking
Segmentation Analysis and the K-Means Algorithm
Let’s take a closer look at segmentation analysis. There are many methods that can be employed to perform this type of analysis, but today let’s focus on k-means clustering. k-means clustering is an unsupervised, non-hierarchical clustering algorithm that groups n data points into k clusters according to their likeness, where a “data point” is the set of parameters that characterizes one user. In this algorithm, the analyst defines the number of clusters, k, and each observation is assigned to the cluster whose arithmetic mean (its centroid) is nearest.
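To make the assign-then-average idea concrete, here is a minimal sketch of the standard iterative procedure (often called Lloyd’s algorithm) in plain Python. The function name, the random choice of starting centroids, and the tiny user data set are all assumptions made for illustration, not something taken from a particular library or from a real customer base.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points (tuples of numbers) into k groups; a bare-bones sketch."""
    rng = random.Random(seed)
    # Assumption: start from k observations picked at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical users: (sessions per week, average order value in dollars).
users = [(1, 20), (2, 25), (1, 22), (9, 80), (10, 85), (8, 90)]
labels, centroids = kmeans(users, k=2)
print(labels)
print(centroids)
```

Because the two features here sit on very different scales, the dollar amounts dominate the distance calculation. In practice you would probably standardize each feature before clustering.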
This method can be viewed as a variation of the generalized expectation-maximization algorithm: it alternates between assigning observations to clusters and re-estimating each cluster’s mean. The k-means method is not well suited to clusters that differ significantly in size or density, or that have a non-globular shape. The algorithm works best if k is set to a relatively small number. Another difficulty with k-means clustering is that the algorithm itself gives no indication of the optimal number of clusters to use when modeling the data. To get around this, k-means clustering should be repeated several times with different values of k, comparing a quality measure such as the within-cluster sum of squares across runs, until the best k value becomes apparent.
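One way to sketch that trial-and-error search for k is shown below. It reruns a compact k-means for each candidate k and keeps the lowest within-cluster sum of squares (WCSS) found over a few random starts. The helper names, the number of restarts, and the synthetic three-group data set are assumptions chosen for illustration.

```python
import math
import random


def kmeans_wcss(points, k, rng, max_iter=100):
    """Run k-means once and return the within-cluster sum of squares."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Sum of squared distances from each point to its own cluster centroid.
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def wcss_by_k(points, k_values, restarts=10, seed=0):
    """For each k, keep the lowest WCSS found over several random starts."""
    rng = random.Random(seed)
    return {
        k: min(kmeans_wcss(points, k, rng) for _ in range(restarts))
        for k in k_values
    }


# Hypothetical data: three loose groups of users in a two-feature space.
data_rng = random.Random(42)
centers = [(0, 0), (10, 0), (5, 9)]
data = [
    (cx + data_rng.gauss(0, 1), cy + data_rng.gauss(0, 1))
    for cx, cy in centers
    for _ in range(20)
]

scores = wcss_by_k(data, range(1, 7))
for k, score in scores.items():
    print(f"k={k}  WCSS={score:.1f}")
```

A common heuristic, which goes beyond what this post describes, is to pick the k after which WCSS stops dropping sharply (the “elbow”). WCSS tends to keep shrinking as k grows, so the smallest value on its own is not a useful target.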
Operation of the k-means algorithm
k-means clustering is also a particularly helpful method in geospatial data analysis. In spatial analysis terms, k-means clustering can be used to group spatially proximate points and polygons according to a user-defined field in the underlying data set. The algorithm is also often used for image processing and segmentation, and for spatial data mining. k-means can be performed in R (see Quick-R), Python (scikit-learn), ArcGIS (Spatial Statistics Toolset), and CrimeStat III.