Market segmentation is a key part of marketing analytics. Existing parametric models for this task, such as the latent class model, become computationally infeasible on the larger datasets encountered in current practice. A promising approach is thus to use machine learning methods, which handle big data efficiently, to discover parsimonious representations of such data. An appropriate stratified sampling scheme can then compress the original dataset for use by market segmentation algorithms. However, conventional algorithms for discovering such representations generally require detailed demographic information on consumers to work well, whereas in practice only less precise or complex forms of information, such as purchase histories, are available. We thus propose using an autoencoder, a type of feedforward neural network, which can discover informative latent representations using only the latter type of information. We show that our method outperforms existing benchmarks in discovering informative representations of consumers’ shopping patterns. Further, we show how the discovered latent representations can be interpreted when used in conjunction with the latent class model.
market segmentation, consumer heterogeneity, latent class model, deep learning, big data