data exhaust (n.)

1) The data generated as trails or information byproducts resulting from all digital or online activities.

2) The digital fuel enabling a new class of intelligent software products and services powered by artificial intelligence.

3) A blog maintained by Garrett Eastham covering the art and science of monetizing data through new product strategy and technical development.

Ok Google, Tell Me About Yourself

Building Machines with Personality

One in four people name their vehicles. In fact, my King Ranch is named Kinsey and she replaced my old Tiburon named Tina. We have been great friends ever since I brought her home from the car lot a few years ago, and by all means, why not? She’s reliable, fun to be with, and takes good care of me and my family whenever we drive across Texas. That is – however – pretty much the full extent of the anthropomorphic relationship that we have built. She is, after all, still a vehicle.

As the world moves gradually toward a virtually-assisted environment where consumers are comfortable engaging in natural dialog with machines, software developers and product managers everywhere are wondering whether these interactions require a digital personality. Digital assistants like Siri and Alexa have made very conscious decisions since their initial releases to build personality into their technology. In fact, part of these digital brands’ reach comes from consumers’ fascination with the virtual agents’ views on the world. Google, however, has historically refused to build personality into its voice agents – until recently.


Just last year, Google made a key hire in a former Pixar storyboard artist named Emma Coats, whose new role in the Googleplex is specifically focused on developing and driving the personality behind its voice assistant, Ok Google. As she works to bring her storytelling background to one of the world’s most advanced artificial intelligence systems, she is also helping to define Google’s own unique strategic stance on design in this realm – such as her belief that AI should play the role of the sidekick rather than the lead character, or as she puts it, “Slinky Dog, not Buzz or Woody.”

Moving from Search Queries to Conversations

Google’s strategic hire is interesting given the continued criticism the company has taken for years over its neutral stance on digital personality. But why is it so critical now? What has changed? Google’s promise to its shareholders has always been to continually drive the development of human-computer interaction around information retrieval in whatever ways will lead to its continued profit and dominance of the search market overall. So when the search giant makes such a large move, something must be changing in the landscape as a whole.

This is not the first time Google has had to overcome a consumer adoption hurdle. If we remember back to when Google first entered the digital world as a “better search engine,” consumers everywhere were just being exposed to the Internet for the first time. Google’s clean aesthetic stood in stark contrast to the clutter and energy of competitors like Lycos, GoTo, and others. The service even forced us to learn a new “pseudo” language of keywords to drive the best results – unlike competitors such as AskJeeves, which encouraged users (who were new to the entire concept of a “web page,” let alone a search engine to navigate to those pages) to ask questions in natural language, as if they were asking their local librarian or village sage. Fortunately for Google, not only was the technology for parsing and understanding natural language not advanced enough, but consumers’ own expectations of the types of information to look for and inquire about were not that advanced either.

And thus, Google trained us to use keywords and rewards us daily with relevant content at our fingertips. This has led to its dominance and control over the search experience for billions of people every day; however, that control is in jeopardy as the voice-assistant landscape evolves.

The Role of the Doodle

Google has employed a Chief Doodler since as early as 2000, when Page and Brin appointed an intern, Dennis Hwang, after a successfully received Doodle for Bastille Day. Ever since then, the Doodle has brought a degree of “humanity” to what could very well be one of the coldest daily human-computer interactions in our modern society. The Doodle has a massive impact on how the search engine is integrated into our lives – in fact, it can even cost the economy millions of dollars when it is “too much fun.”

The fact that Emma’s role reports officially into the Doodle team says something quite profound about how the search giant is thinking about the strategic nature of personality with regard to its voice technology. The Doodle team has helped make Google a warm, approachable brand in our hearts, and I imagine Emma’s job description and mandate are nothing short of attempting to do the same for the perception of Google’s voice AI. In a world where Google (and many others) are starting to recognize the potential impact of search queries moving to dialogs, this decision reminds us that only the market can ultimately decide how it wants to embrace new technologies.

Can Personality Win the Voice Wars?

A few weeks ago, I wrote about the coming Voice Wars between two giants of digital commerce: Google and Amazon. With Amazon’s impending release of a paid marketplace for voice-based product queries, an open question still stands as to how profits will shift as search power moves out of the Google sandbox and into new devices and modalities. Since then, Google has made some major moves by releasing its own open integrations with existing retailers – following a strategy similar to the one behind its existing dominance over paid ecommerce traffic. This move marks a stark difference in strategy between the two firms that only the market can resolve: do customers value a variety of retail providers, or do they value price and convenience within a closed Amazon ecosystem?


Perhaps Google is making a bet with Emma’s position and role that personality will provide a competitive advantage as consumers make their decisions with their voices and wallets. Personally, I think the real answer will only arise as search technology evolves to support true interactive dialogues. The movement towards bots in the past year has shone a light on just how far we still have to go in creating real conversational value between humans and artificial intelligence. Both Amazon and Google have a rich history and strong technical chops with regard to innovating around the product search and discovery process, so I believe there’s still a great deal of growth ahead of us. Whether personality will play a critical role – or just be a movie sidekick – is still to be seen.

Can We Teach a Machine to Sing?

The Artificial Musician

In recent weeks, a new artificial intelligence startup, Amper Music, has opened a private beta of its AI-based music composition platform. The system claims to be capable of generating musician-quality original music, packaged up and delivered as a service. If the claims are true, this service could fundamentally change the way music is created and delivered for a whole variety of background tasks, from restaurant ambient noise to the quintessential elevator music. In a world where the rising cost of music licensing is eating into the gross margins of music streaming services, the premise of an always-available, never hung-over, and infinitely scalable band seems like a godsend to those in the music industry who are trying to help navigate it safely through its digital transformation. However, as stewards of the human condition, we have to ask ourselves: is this a net positive or negative for mankind?


There are, of course, several substantial benefits that an Artificial Musician could bring to the world that are simply not possible with human counterparts alone. For example, when generating background music for, say, a PowerPoint presentation, an algorithm can identify and understand nuances of the digital context of the presentation – such as the known preferences of the registered attendees pulled from their Spotify profiles – and use them to tailor the final musical result not just to the tone of the content but also to how it might best be perceived. While a human musician could possibly work to understand the tone of the presentation and collaborate with the author to create a compelling work of art, it is simply not possible for one to simultaneously take into consideration the wants and needs of an entire audience of stakeholders while also creating a composition on time and on budget.

Muse vs Machine

Interestingly, Amper is not the only company leveraging artificial intelligence to disrupt the music industry. LANDR is a new music technology platform that provides an AI-assisted studio authoring environment at a fraction of the price of professional-grade services. The core premise of their model is that much of the expertise studio techs bring to the music production process is routine and automatable through machine learning. Their platform claims to provide a cheaper, scalable studio process by removing the expensive manual labor involved in going from raw creativity to a studio-mastered track. In theory, leveraging LANDR’s platform would not only widen the margin creative artists could make from the sale of their music, but also open the world of professional music to a much wider array of independent or part-time musicians – in a way, increasing the overall creative mix of content in the world by removing the middlemen between creativity and profit.


When contrasting LANDR’s and Amper’s different approaches to augmenting music production, one must ask where the human plays a critical role in this unique aspect of our culture. A production-assistance platform such as LANDR takes an implicit stance that the spark of creative value resides in the mind of the musician and we simply need to reduce the friction in getting it to market. Amper’s product, however, suggests a view that musical genius – at least for certain use cases – follows a formulaic, automatable process, and that a machine can be trained to share the same qualities as any other muse out there. For the record, Amper’s leadership does not wish to replace composers; rather, they feel there are two distinct use cases for musical composition – creative or functional value – and AI can absolutely be used to replace one of them.

How Will Consumers Respond?

Fortunately for mankind, these enterprises do not control one critical factor: consumer opinions and preferences. A big question looming over AI designers in the music field is just how good the tunes these systems create can be. Will an AI be capable of weaving together a beautiful melodic story as full of depth and intrigue as any piece Mozart or Bach could whip up?

Probably not.

However, as Amper has pointed out, there are plenty of use cases where music does not have to be the absolute best creatively – think background music for studying, for example. In these cases, the limiting factors to growth and adoption are cost and scale – two areas where automated music has an undoubted competitive advantage.

This, of course, should be taken in light of just how much the music industry’s business model has shifted over the past 20 years. Ever since early peer-to-peer file sharing services like Napster and LimeWire entered the scene in the late 90s, the music industry as a whole has been forced to shift its focus from profiting on the creation and distribution of digital media to how artists are experienced. With album sales declining and music streaming services enabling consumers to “purchase” music with their ears, record labels and artists are being forced to invest heavily in creating unique, engaging performances that cannot be digitally consumed. Combined with millennial shopping habits shifting en masse toward the consumption of experiences over hard goods, this is providing a much needed increase in demand for these types of physical performances – enabling a shift in profit from music sales to music experiences.


The real question, of course, is what value music customers (ordinary humans like you and me) place on the act of creativity that goes into making a hit song. Will a machine ever be capable of drawing the same types of crowds as Taylor Swift or The Weeknd? The companies innovating in this space will ultimately be beholden to their customers, who will decide with their wallets over time. If there truly is enough space for two kinds of musical consumption, then perhaps true human creativity will prevail.

Creativity and the Human Condition

While music is an art form that touches us perhaps more emotionally and more frequently than many others within the liberal arts, there are many more fundamentally pivotal human activities that make up today’s knowledge economy. Writers, sports commentators, comedians, and actors – to name a few – all derive their value from harnessing their creativity. These kinds of professions do not have formal degrees in universities because they have always been thought to derive from elements of human nature that cannot be taught, only embraced. As AI continues to permeate every part of our lives, society will need to take a stand on how we value the creative arts. Is it more important to lower the barrier for humans to drive acts of creativity, or do we believe it is inherently a process that can be modeled and encapsulated in software?


Time will tell.

Designing Product Recommendation Engines for the New Age of Digital Commerce

What is a Recommendation Engine?

In today’s world of rampant digital commerce growth, the topic on every executive’s mind is personalization – or rather, how the ecommerce experiences of tomorrow can provide customers with product guidance tailored specifically to their unique needs and tastes. While this is not a new topic in the world of ecommerce innovation, it has certainly been an evolving one over the years as the techniques and technology – which some claim are responsible for up to 35% of Amazon.com’s total online revenue – have moved from a proprietary form of intellectual property to an entire studied field of machine learning, complete with open source frameworks and datasets.

While product recommendation initiatives are now pervasive throughout nearly every digital commerce touchpoint – online discovery, search personalization, email targeting – it was not always this way. Amazon.com was the first digital retailer to provide a consumer-facing implementation of a product recommendation service, more commonly known to digital shoppers as the “People Also Viewed / Bought / Liked” product carousels that have appeared on Amazon product pages since as early as 1999. Since then, digital commerce has benefited from a scale of personalization that far outstrips the ability of site merchants to curate a tailored set of goods for each and every shopper – a tension we will talk about at length later. Innovation in product recommendation is so critical that it remains an active application of emerging techniques in machine learning (such as deep learning) – Amazon, for example, has recently open-sourced the deep learning framework it uses to perform large-scale latent factor analysis, designed specifically to help overcome the sparsity issues that plague large-scale factor models (a topic we will cover later).

Of course, product recommendation is not relegated to physical goods retailers – any digital entity that wishes to ease the content discovery process for its users can make use of it. The most famous example is perhaps the video-streaming service Netflix, which leverages product recommendation techniques every day to tailor its video viewing experience. Netflix has also made significant R&D investments in the field at large – going as far as to sponsor its own machine learning competition, the Netflix Prize.

Teaching a Machine to Merchandise

As I mentioned, product recommendation is now so ubiquitous that we as consumers often forget it even exists within the digital services we use every day; however, for data scientists and digital commerce executives working to improve their online performance, it is a critical field to understand. Today, much of the field of product recommendation is common knowledge (at least the classical techniques of large-scale matrix factorization) and the software to enable it is open source. By themselves, though, these techniques are useless unless they are properly integrated into the customer experience in an elegant way to drive relevant, tailored digital experiences.

“Customer experience personalization is all about data first. Get the data right and you can shape the overall customer experience by applying data science and machine learning.” (MartechAdvisor)

In fact, because these systems are readily available, digital strategies are evolving to focus on the new critical IP – customer and product data – and on developing robust new ways to collect, mine, and apply machine learning techniques to these data sets. In this tutorial, I hope to provide a thorough understanding of just how modern real-time personalization systems work, with a specific focus on how the uniqueness of customer and product data can create competitive advantage for digital projects. In the following material, we will cover:

  • Digital Strategy: We will use a framework called the Machine Learning Canvas to analyze a generic product recommendation system.
  • Data Strategy: Designing the best product recommendation system for your digital organization is about more than just technology – in fact, the form and completeness of your data can have the largest impact to the overall results.
  • Technical Implementation: We will build a simple real-time personalization engine leveraging open-source frameworks (Spark) and some simple code examples that any developer can play with on their own.

The Framework

I would like to introduce a toolkit called the Machine Learning Canvas – a template for developing new, or documenting existing, predictive systems based on machine learning. It was developed by Louis Dorard, and it provides the necessary mental bookshelves upon which to organize our analysis:


  • Core Premise:

    • [A] Value Propositions: What are we trying to do for the end-user(s) of the predictive system? What objectives are we serving?

    • [B] Decisions: How are predictions used to make decisions that provide the proposed value to the end-user?

    • [C] Making Predictions: When do we make predictions on new inputs? How long do we have to featurize a new input and make a prediction?

  • Required Resources:

    • [D] ML Task: Input, output to predict, type of problem

    • [E] Offline Evaluation: Methods and metrics to evaluate the system before deployment.

    • [F] Data Sources: Which raw data sources can we use (internal and external)?

    • [G] Features: Input representations extracted from raw data sources

  • Model Operations

    • [H] Collecting Data: How do we get new data to learn from (inputs and outputs)?

    • [I] Building Models: When do we create / update models with new training data? How long do we have to featurize training inputs and create a model?

    • [J] Live Evaluation and Monitoring: Methods and metrics to evaluate the system after deployment, and to quantify value creation.

ML Canvas: Recommender Systems

Core Premise: The most basic function of a product recommendation engine is to show end users products “they might want.” This is a broad value proposition that can cover a number of more tailored use cases – such as which headline to pick when crafting a newsletter email to drive a return visit to the site, or which hero image to show to a new site visitor. However, the most common use case, and the one we will frame our discussion around, is the classic product page carousel, where the task at hand is to select the set of products relevant to the current one that a user “might also like.”

In the above “product carousel” scenario, the active decision that a machine learning model must serve is selecting the set of products to return to a product carousel widget that renders on page load. While there are methods for doing recommendations in an “offline” mode (i.e., the system is not constrained by time for generating recommendations), most systems powering this use case require recommendations to be provided in “real time,” as the hundreds of other pieces of the product page HTML and digital assets are being rendered by the web browser. A common problem in designing recommendation engines is determining how to make the best personalized set of results, given little to no insight into the user’s background, in such a limited amount of time (less than 100ms, for example) – we will cover a mental model for thinking about this later.


Required Resources: The data required to train and optimize product recommendation models can come in a variety of unstructured (clickstream) and structured (reviews) formats; we will generally refer to the type of data we want to capture as Product Affinities. Affinities are simply a designation of data that might provide some connection between a given user and a given product. An affinity can be as explicit as a user writing a 4-star review, or as implicit as an anonymous user visiting a page and reading a previous customer’s review for a bit “longer” than the normal page visitor.

We will structure our model features in terms of User-Product affinities, such that as long as the data can be transformed into a numerical relationship between a user and a product, it can be leveraged for personalization efforts. As we will see later, this data will be used specifically for the task of candidate scoring, which can be thought of as a type of classification process where the system attempts to “assign” a number to each product based on how likely the user is to have an affinity for that particular product. In our model development, we will use the product-affinity data to train a machine learning model that attempts to represent the entire population’s affinities for all products based on just a small “sample” of known product affinities.
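As a concrete (and entirely hypothetical) illustration, here is a minimal sketch of that transformation – the event names and weights below are invented, and the point is only that heterogeneous signals reduce to numeric (user, product, affinity) triples:

```python
# Hypothetical mapping from event type to affinity strength; real
# weights would be calibrated against downstream outcomes
EVENT_WEIGHTS = {"page_view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

# Hypothetical clickstream: (user_id, product_id, event_type)
events = [
    ("u1", "p9", "page_view"),
    ("u1", "p9", "add_to_cart"),
    ("u2", "p9", "purchase"),
    ("u2", "p7", "page_view"),
]

# Accumulate one numeric affinity per (user, product) pair
affinities = {}
for user, product, event in events:
    key = (user, product)
    affinities[key] = affinities.get(key, 0.0) + EVENT_WEIGHTS[event]

# These triples are exactly what the factor model will train on
triples = [(u, p, a) for (u, p), a in affinities.items()]
print(triples)   # [('u1', 'p9', 4.0), ('u2', 'p9', 5.0), ('u2', 'p7', 1.0)]
```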


Model Operations: While we will not cover the data collection process in this tutorial, it is perhaps the most critical area to invest in when designing your organization’s own product recommendation system and process. Today’s example will leverage user ratings collected on movies; however, the more common form of data within ecommerce is implicit, unstructured clickstream data (i.e., the data exhaust generated by users trying to find products on your website). Nearly every digital product catalog generates the information necessary to build a well-tailored recommendation service – the data simply needs to be captured (typically via a tracking “beacon” deployed in the background as a shopper loads the page) and leveraged by a data scientist.


Data scientists working on product recommendation for an organization can leverage this dataset to perform offline training and testing (perhaps on a small portion of the data) of a User-Product affinity model – like the one we will work through today. Once initially developed, the model can then be re-trained in a semi-regular batch process (usually at night). While we don’t need to build our models in real time for our product carousel use case, we do at least need any relevant affinity information about the target user to be collected and made available in real time. This can be engineered in a variety of ways, from a real-time preference cache served within the recommendation API service to data stored locally in the web browser on an ecommerce site.

The Merchant Versus the Machine

Historically, there has been a lot of tension between site merchants within ecommerce organizations and digital product recommendation engines, as both view their job in a similar vein: choose a set of relevant products to show a customer. Merchants, in general, practice the coveted art of product merchandising – which I would describe broadly as the act of attempting to understand evolving customer preferences and predicting the set of products that will maximize the financial “bet” (with regard to a particular retailer’s financial allotment for a given category). Merchandising is so core to retailing as an industry that the world’s leading retail organizations can trace their roots back to merchant-driven leadership – such as Sam Walton (Walmart) and Pat Farrah (Home Depot).

Given the role merchandising plays in retailing strategy, systems that claim competency in a “similar” function have often been received with a heavy dose of skepticism from internal teams. While the digital teams that have been able to overcome this bias have made significant gains, it is critical to re-frame the issue not in terms of who is better at what, but rather whose time is better spent where. When you think about the trade-off between merchant time and financial return, having merchants select products to merchandise with other products is incredibly inefficient compared with machine-based recommendations – not to mention that, as digital retail catalogs continue to scale up each year, the number of necessary merchandising decisions per product grows exponentially. Instead, product recommendation engines should be viewed as a vehicle that enables merchandising organizations to apply their valuable time and mental energy to more financially efficient activities, such as selecting a title for an email campaign or updating an open-to-buy order.

Data is the Fuel for Algorithms

Most digital executives might think that competitive advantage within product recommendation comes from innovation in the actual algorithms driving the recommendation process; in today’s world, it is quite the opposite. The truth is that while a great deal of innovation can happen at the algorithm level, these kinds of gains will be minimal at scale relative to off-the-shelf frameworks that are free to integrate (and oftentimes more battle-tested than experimental in-house systems). In today’s landscape, the true differentiator that digital teams can bring to the process of designing and improving product recommendations is the quality of the data that goes into the models being developed.

We will cover a few examples of this later, but take a simple scenario of a shopper who is searching for a new laptop. Having well structured, relevant product and customer data improves recommendation results in two critical steps during the process:

  • Candidate Selection: The first step in any recommendation process is identifying a list of all relevant laptops that might match a user’s particular inquiry. Selecting the “right” set of products to fill the recommendation “hopper” can provide the biggest gains to any personalization initiative – this is because in most recommendation scenarios there is relatively little information about a user’s tastes with which to properly rank products, and the system must therefore rely on raw contextual information to determine a good set of starting products. This is called the “cold start” problem in the world of product recommendation; however, it can be overcome with well-structured meta-information on both products and customers. In the example posed here, a shopper who searches for “large screen laptop for gaming” should immediately filter out laptops with 13’’ screens; however, if the product catalog doesn’t have accurate or even complete screen information, the recommendation system will fall flat on its face – regardless of what unique algorithm is providing the ranking. (A minimal sketch of both steps follows this list.)

  • Candidate Ranking: Once a set of potential products has been selected, a product affinity model (like the one we’re going to build) can provide the next level of personalization, enabling the “final mile” of personalized advice that might otherwise have been delivered by an acute sales associate. In this example, a savvy sales professional might have an internal “sense” that this shopper – who we know to be Design-Centric – would appreciate the overall aesthetics of the Razer Stealth.
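Here is the promised sketch of the two steps with an invented laptop catalog – every field name and score below is hypothetical:

```python
# Hypothetical catalog; every field name and value is invented
catalog = [
    {"id": "ultra-13",  "screen_in": 13.3, "style_score": 0.9},
    {"id": "gamer-17",  "screen_in": 17.3, "style_score": 0.6},
    {"id": "budget-15", "screen_in": 15.6, "style_score": 0.3},
]

# Candidate Selection: structured meta-data filters out products that
# cannot satisfy the query ("large screen" -> at least 15 inches)
candidates = [p for p in catalog if p["screen_in"] >= 15.0]

# Candidate Ranking: order survivors by a per-user affinity score;
# here a design-centric shopper weights aesthetics heavily
def affinity(profile, product):
    return profile["design_weight"] * product["style_score"]

shopper = {"design_weight": 1.0}
ranked = sorted(candidates, key=lambda p: affinity(shopper, p), reverse=True)
print([p["id"] for p in ranked])   # ['gamer-17', 'budget-15']
```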

Predicting Consumer Product Affinity

So, how do we actually teach a machine to predict what a customer might want from the products a shop carries? The correct way to approach this problem is to re-frame the task from predicting the abstract human emotion of desire to one of pragmatic affinity. By using data from scenarios where a user has expressed a known affinity (i.e., the product rating that user might give a laptop), we can construct an abstract model for how all users might perceive that laptop.

The machine learning task, then, becomes one of predicting a user’s possible product rating given their history of ratings on similar products.

Selecting the Right Model

Before we dive into the specific algorithms we will use to train our model, we first start with the general framework: the large, sparse User-Product affinity matrix. Most recommendation engines are based on a concept called Collaborative Filtering, which is simply the idea that like-minded people are likely to show strong affinity for the same products. The key to making this work in practice is how we define “like-minded people” – in most cases this resolves to the recursive reference: “people are like-minded if they like similar products” and “products are alike if they are liked by similar people.” This is similar to the recursive principle of authority used to score relevant documents on the web (PageRank), where “a web page is authoritative if other highly authoritative pages link to it.”

Conceptually, we must imagine a giant matrix where one dimension is all of the possible users of a website and the other dimension is all of the possible products those users could potentially like and/or purchase. As you can imagine, this is a LARGE matrix, and in reality it is very sparse – meaning only a small portion of user-to-product affinities will be known (this is called the “sparsity” problem in the field of collaborative filtering). Our goal, from a machine learning perspective, is to build a model that can predict the full matrix (Q) using what – in reality – is often less than 0.01% of known user-product affinities.
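A toy sketch of that matrix with a handful of invented triples (in production the dimensions would be millions by millions):

```python
import numpy as np
from scipy.sparse import coo_matrix

# Invented (user, product, rating) triples
users    = np.array([0, 0, 1, 2])     # row indices
products = np.array([1, 3, 3, 0])     # column indices
ratings  = np.array([4.0, 5.0, 3.0, 1.0])

Q = coo_matrix((ratings, (users, products)), shape=(3, 4))

known, total = Q.nnz, Q.shape[0] * Q.shape[1]
print(f"{known} known affinities out of {total} cells "
      f"({100 * known / total:.1f}% dense)")   # 4 of 12 -> 33.3% dense
```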

We Need to Approximate All Preferences

The trick to solving our problem is to set up a model that allows us to approximate the entire matrix Q based on a much smaller set of matrices called “latent factors” (sketched below). The reason for this breakdown is a concept critical to recommendation engines: the user-product affinities we see in the world are actually the result of preferences formed over a (relatively) small number of factors that are “hidden” from the world but drive end behavior (much like how we think of the human mind forming concepts as internal mental representations). These “latent” factors can then be divided between users – who have different affinities for those factors – and products, which exhibit those factors in various degrees. The resulting large matrix Q is then simply the dot product of these two latent User and Item factor matrices.
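A shape-level sketch of the decomposition; the factor values here are random placeholders, since in practice they are learned (by ALS, below):

```python
import numpy as np

n_users, n_items, rank = 1000, 5000, 10   # rank = number of latent factors

X = np.random.rand(n_users, rank)   # rows: each user's factor affinities
Y = np.random.rand(n_items, rank)   # rows: each item's factor loadings

Q_hat = X @ Y.T                     # predicted affinities, users x items
print(Q_hat.shape)                  # (1000, 5000)

# A single user-item affinity is just the dot product of two vectors:
print(float(X[42] @ Y[7]))
```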

There are a variety of algorithms that can be used to estimate these latent preference matrices; however, we will use a technique called Alternating Least Squares, as it has a clean implementation in Apache Spark’s standard machine learning libraries and also provides clean User and Item factors that can be used later for derivative machine learning tasks involving customer preference formation and usage. I will not cover the details of the algorithm or the math behind how it converges on a reasonable approximation using a small amount of training data, as many other sources provide that level of detail.

Our Data Source: Movie Ratings

All machine learning approaches require us to start with a defined set of known product affinities. In this example, we will be using the canonical MovieLens 20 million dataset, a collection of explicit user ratings on movies collected by the GroupLens project from the Social Computing Research center at the University of Minnesota. To get the dataset, simply download a copy of the ml-latest zip file and make sure it contains the large CSV files named ratings and movies. The ratings file is simply a flat file containing a user ID, a movie ID, and the rating that was given.
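To make this concrete, here is a minimal loading sketch in PySpark; the file paths are assumptions based on the standard MovieLens download layout, so adjust them to wherever you unzip the archive:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("movielens-recs").getOrCreate()

ratings = (spark.read
           .option("header", "true")
           .option("inferSchema", "true")
           .csv("ml-latest/ratings.csv"))   # userId, movieId, rating, timestamp

movies = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("ml-latest/movies.csv"))     # movieId, title, genres

ratings.printSchema()
print(ratings.count(), "ratings")
```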

First – Determine the Correct Hyperparameters

As with many other machine learning algorithms, Alternating Least Squares (ALS) has several hyperparameters that need to be selected during the training process. One of the most important is the Rank of the resulting latent factor matrices. This is the number of different “factors” we want to represent the large matrix Q by – essentially the number of dimensions a user might be weighing in their mind when determining their personal affinity for a film.

Next, we break the data into a training and test set and leverage the Spark ALS API to train different models with different hyperparameter settings. The method for testing the accuracy of the resulting matrix factorization model is Root Mean Square Error (RMSE) – a measure of how well the resulting latent matrices (when multiplied together) can reconstruct the known product affinities.
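A sketch of that search with Spark ML’s DataFrame-based ALS – the candidate ranks and the 80/20 split are illustrative choices, and `ratings` is the DataFrame from the loading sketch above:

```python
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.recommendation import ALS

# (On a laptop, consider ratings.sample(0.05, seed=42) to iterate faster.)
train, test = ratings.randomSplit([0.8, 0.2], seed=42)

evaluator = RegressionEvaluator(metricName="rmse", labelCol="rating",
                                predictionCol="prediction")

for rank in [5, 10, 20]:
    als = ALS(rank=rank, maxIter=10, regParam=0.1,
              userCol="userId", itemCol="movieId", ratingCol="rating",
              coldStartStrategy="drop")  # ignore users/movies absent from train
    model = als.fit(train)
    rmse = evaluator.evaluate(model.transform(test))
    print(f"rank={rank}  RMSE={rmse:.4f}")
```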

Then – Train on the Full Dataset

Once we have the hyperparameter settings that reduce the RMSE the most, we use them to train on the entire dataset. In this case, I’ve selected a rank of 10, a lambda of 0.1, and 10 iterations of approximation. We then save the model to disk, both for more efficient retrieval and for later re-use.
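A sketch of that final run (the save path is a placeholder):

```python
from pyspark.ml.recommendation import ALS

als = ALS(rank=10, maxIter=10, regParam=0.1,   # regParam is Spark's "lambda"
          userCol="userId", itemCol="movieId", ratingCol="rating",
          coldStartStrategy="drop")
model = als.fit(ratings)                       # the full dataset this time

model.save("models/movielens-als")             # reload with ALSModel.load(...)
```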

Now – Let’s Use the Model

With the model trained, we can now use the resulting latent item factors to find similar films and make recommendations – regardless of what we do or do not know about a user. Remember that for our use case of a product carousel on a product page in an ecommerce site we want to recommend other products a user might like. A good starting point for this recommendation would be to use the product a shopper is currently looking at to make the recommendation.

Let’s say that a shopper has been looking for some good Nicolas Cage films and stumbles upon the legendary Cage masterpiece, Con Air. While I’m sure those who have seen Cage’s southern rendition of an ex-con caught up in the wrong place at the wrong time would never desire to see anything else, let’s say – for the sake of mental exercise – that we wanted to find other films similar to Con Air.

Since we have determined a latent item matrix Y, where each movie is represented in a factor space of rank R that captures how movies relate to each other in terms of known product affinities, we can use these pre-computed vectors to find similar films in the factor space using simple Cosine Similarity.

Finding Similar Films to a Given Target

Leveraging the Spark SQL API and merging our recommendations with the movie list (available in the MovieLens download), we can find the top 20 films most similar to Con Air in terms of what other people have preferred; a sketch of that lookup follows. Looking at the results, we see a lot of nice blockbuster action films like Bad Boys – and even another famous Nicolas Cage film, The Rock. In fact, Nicolas Cage appears in quite a few of the top recommended films – not surprising, since our model is based on what people who like Con Air have also liked. The data here suggests that Nicolas Cage fans are quite loyal to the action star – and as a fellow fan, I must say that is quite true!
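This sketch pulls the learned item factors to the driver for the similarity math (fine at MovieLens scale) and then uses a Spark SQL join for the titles; the “Con Air” title filter is illustrative and assumes the title matches exactly once:

```python
import numpy as np
from pyspark.sql import functions as F

# Pull the learned item factors to the driver; tens of thousands of
# movies x rank 10 fits in memory easily
factors = {row["id"]: np.array(row["features"])
           for row in model.itemFactors.collect()}

target_id = movies.filter(F.col("title").contains("Con Air")).first()["movieId"]
target = factors[target_id]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = [(int(mid), cosine(vec, target))
          for mid, vec in factors.items() if mid != target_id]

# Back into a DataFrame so a Spark SQL join can attach the titles
sims = spark.createDataFrame(scores, ["movieId", "similarity"])
(sims.join(movies, "movieId")
     .orderBy(F.desc("similarity"))
     .select("title", "similarity")
     .show(20, truncate=False))
```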

This is actually a very common starting point for most recommendation engines and, in our earlier framing, encompasses the “Candidate Selection” phase of machine-based product recommendation.

Online vs Offline Computational Requirements

While we could stop here and get some extra bang for our buck by releasing the model into the wild (which might not be a bad idea for agile development teams), this model would show the same top recommended movies to every single visitor – somewhat defeating the purpose of a “personalization” engine. The next step in our development process is to determine a way to tailor the recommendations to what we know about a particular user – enabling us to provide a truly personalized experience.

This personalization process takes place in the “Candidate Ranking” phase of our system’s workflow, and there are a couple of questions we have to ask ourselves with regard to its design. The primary question is how quickly the recommendations need to be provided. There are two approaches to making predictions – in a batch offline process, or in real time upon request. If we have the luxury of creating recommendations offline (such as when we want to select personalized products to put into a customer email whose send time we control), then we can simply take the user’s collected affinities (from reviews / clickstream) and re-build the latent factor model. This is the most straightforward approach and will incorporate more nuance in the recommendations than real-time models.

However, if we need to personalize in real time (or if this is the first session in which we have seen the user on our site), then we need a more creative way to approximate a specific user’s preferences P(u).

Modeling Customer Preferences from Behavior

Let’s say, for example’s sake, that we want to personalize the movies recommended when a user finally lands on the Con Air product page after a normal shopping session. Perhaps they search initially for “Nicolas Cage” and then spend varying amounts of time on different Cage films, such as Gone in 60 Seconds (a classic!), The Rock, or even Lord of War. While the user did not provide explicit ratings for each film – as the users from whom we built our model did – we can still use this implicit browsing behavior to approximate what their ratings might have been (or at least to weight the recommended results toward their implicit tastes).

In this case, a common technique in the realm of digital commerce is to use the length of time a shopper spends on a particular product page as an indicator of relative affinity. This is known in the industry as “dwell” time and is likely already being used by state-of-the-art personalization engines on most sites you shop on today.
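A minimal sketch of one way to turn dwell seconds into pseudo-ratings – the session data and the scale-to-five normalization are invented for illustration, and real systems would calibrate this against outcomes:

```python
# Hypothetical session: movie -> seconds spent on its product page
dwell = {"Gone in 60 Seconds": 180, "The Rock": 95, "Lord of War": 240}

max_dwell = max(dwell.values())

# Scale dwell onto the same 0-5 range as the explicit star ratings,
# so the implicit weights are commensurate with the training data
p_u = {movie: round(5.0 * sec / max_dwell, 2) for movie, sec in dwell.items()}
print(p_u)  # {'Gone in 60 Seconds': 3.75, 'The Rock': 1.98, 'Lord of War': 5.0}
```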

Some Quick Math

While capturing and quantifying dwell time can help us approximate the known product affinities P(u), in order to get the unknown product affinities P’(u) we need to do a little linear algebra. Below, we approximate what the visitor’s User factor vector X(u) might be by substituting into our original approximation equation.
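The figure that accompanied this step is not reproduced here, but the algebra it described is standard; a reconstruction consistent with the surrounding text:

```latex
% Fold-in algebra: P(u) is the visitor's row of known affinities,
% Y the learned latent item-factor matrix, X(u) the unknown user factors
\begin{align*}
P(u) &\approx X(u)\,Y^{\top}
  && \text{one row of the original approximation } Q \approx X Y^{\top} \\
P(u)\,Y &\approx X(u)\,Y^{\top}Y
  && \text{multiply both sides on the right by } Y \\
X(u) &\approx P(u)\,Y\,\bigl(Y^{\top}Y\bigr)^{-1}
  && \text{assuming } Y^{\top}Y \text{ is invertible}
\end{align*}
```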

Computing Full Preferences (In Real-Time)

Once we have our approximation of the User factor vector X(u), we can use it in conjunction with various weightings of the movies from the user’s browsing session to personalize the movie recommendations once they land on the Con Air product page. Because this calculation is a simple vector dot product, it can be used to personalize results in real time without having to re-build or adjust any machine-learned model.
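A self-contained numpy sketch of the whole real-time step – the item factors here are random placeholders standing in for the exported ALS factors, and the session weights are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, rank = 1000, 10
Y = rng.random((n_items, rank))      # stand-in for the exported item factors

# Dwell-derived affinities P(u) for this visitor: item index -> weight
session = {17: 2.0, 42: 5.0, 99: 3.8}

# Fold-in: solve X(u) from P(u) ~= X(u) Y^T using only the rows of Y
# for the items actually seen. With fewer observations than factors,
# Y^T Y is singular, so least squares (the pseudo-inverse) replaces
# the explicit inverse from the derivation above.
items = list(session)
P_u = np.array([session[i] for i in items])
Y_s = Y[items]
X_u, *_ = np.linalg.lstsq(Y_s, P_u, rcond=None)

# Full predicted preferences P'(u): one dot product per item -
# cheap enough for a <100ms page-load budget
scores = Y @ X_u
scores[items] = -np.inf              # don't re-recommend what was just browsed
top5 = np.argsort(scores)[::-1][:5]
print(top5, scores[top5])
```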

Let’s See the Results

To test this out in our toy model, I’ve simulated two very different browsing scenarios that attempt to emphasize different kinds of Cage fans.

  • [A] “Lord of War was my JAM!” – This shopper shows an affinity for fast-paced action films that highlight themes related to weapons and the military. In this case, they spent a lot of time with Lord of War (a movie where Cage plays a high-profile gun trafficker) and The Rock (a film where Cage and Connery work together to take down a madman general). The engine recommends Assassins, a Stallone classic that is (not surprisingly) very similar to Con Air in that the lead character wants to leave his life of crime but gets brought back in due to uncontrollable circumstances. The second film, Rapid Fire, is less well known but also seems to feature a reluctant character who is forced into action by uncontrollable circumstances – both films provide the desired association with weapons and the military.

  • [B] “National Treasure was the best Cage ever…” – While I may not fully agree with the tagline, this browsing session is meant to showcase a user who appreciates Cage’s ability to provide humor and play non-warfare-oriented roles. As a result, Assassins is still recommended, but the second film – Blue Streak – is a much more light-hearted cop comedy with Martin Lawrence, where again the protagonist is forced into a crime situation undesirably but which – in theory – should cater more to the user’s implicit preference for humor over action.

How Can We Improve the Results?

Now that we’ve built a working real-time personalization prototype, the next step would be to design a way to A/B test it on real customers. That is beyond the scope of this article; however, assuming a digital team was able to successfully deploy and test an initial model, the next logical step any practicing data scientist will focus on is how to improve the results above some measured baseline. With regard to the approach we’ve outlined here, there are two important directions that can be taken:

  • [A] Leverage Structured Meta-Data to Collapse / Expand Matrix Q: The quality of the product recommendations results from how strong a learning signal is present in the training data that feeds the User-Product affinity matrix Q. Because this matrix, at scale, can be quite sparse, a powerful technique for improving results is to leverage structured meta-data to reshape the training data prior to matrix factorization. For example, if you have well-structured product attribute information on your products (i.e., every product has normalized, accurate height, color, style, occasion, etc.), you can collapse the size of the matrix Q along a single dimension by consolidating categories of goods into attribute clusters. This removes a great deal of sparsity from the overall matrix and provides richer, more accurate product recommendations. This approach is generally called “Neighborhood Methods” in the literature.
  • [B] Increase the Training Dimension Along Time: One of the key insights into product recommendations is that consumers’ tastes change over time – in fact, it was this insight that led the winning team of the Netflix Prize to introduce a model called TimeSVD++, which took into account the sequence of people’s ratings to allow greater insight into the ratings that actually matter.

Voice Wars: The Battle for Your (Audible) Intent

A Dispute Emerges

Amazon is on the cusp of launching the world’s first paid search platform for voice interactions – and digital commerce leaders across the world should be anxiously watching. While customer adoption of digital voice assistants – such as the Amazon Echo/Dot or Google Home – has been relatively slow, this past holiday season put more than 8 million Alexas into homes across the US. This means everyday consumers are interacting with this new technology – asking questions, seeking advice, and generating rich intent signals that are ripe for digital advertisers to monetize. At 8 million daily users, the amount of voice interaction is roughly equivalent to an ecommerce site the size of Best Buy or Target.

Of course, not everyone is talking to their Alexa with the intent to purchase a new television set for the home, and in fact, there’s still a lot of work ahead to improve the experience around voice search; however, that hasn’t stopped Amazon’s innovators from exploring new types of ordering strategies, such as exclusive deals only available to Alexa owners. But the real question on everyone’s mind (at the digital leadership level) is: how will this impact my business – in particular, my organization’s ability to acquire paid traffic from high-intent, high-value search activity?

The Changing Battlefield

If we look at the paid search industry as a whole, it has really been the history of one company’s complete and utter rise to dominance and control over the consumer’s quest for knowledge: Google. For context, Google’s paid search offering (i.e., the ads that show up next to the links you are looking for) brings in roughly $45 billion annually and wields an incredible 55% of total worldwide paid search revenue – a number that would likely be higher if it were not for emerging Chinese leaders like Baidu. Google has risen to this position in part through its superior search experience but also – more importantly – through the introduction and scaling of the paid search bidding mechanism. Without the work of Google and several other search innovators (GoTo, Overture, etc.), managing and optimizing web-scale search traffic would be just an interesting computer science problem rather than a cash cow.


Google’s paid search offerings allow it to monetize its traffic at scale and, in return, pour more R&D dollars back into improving the overall experience for its core users. None of this would be possible if consumers chose to conduct their searches elsewhere – and it is in part the openness of the web that keeps the Google giant innovating on its core search experience (moving to mobile in the past few years, as well as investing in new digital products that may or may not succeed). Amazon’s move into paid voice search threatens this monopolistic hold on the search advertising industry, not only because Amazon is well positioned to create the next killer monetization format but also because it is likely to control the landscape for voice traffic altogether. How Amazon ultimately discovers how to monetize this traffic will shape the long-term evolution of voice-based search, as it will directly enable (or not enable) the pouring of R&D dollars back into the core voice experience.

A Digital War is Coming

As the world’s largest digital retailer, Amazon has a vested interest in understanding how to leverage any and all consumer intent signals to convince shoppers to purchase goods and services from one of its many digital channels. The rise of digital voice assistants presents a new landscape in which customers can express their intent – and as such, a new opportunity for organizations that own that intent to monetize it.


Amazon is well-versed in maximizing digital marketing dollars to acquire and convert online shoppers. In 2015 alone, Amazon spent $2.8 billion on digital marketing – making it the largest digital retailer by total spend (followed closely by HSN at number 2 and Wayfair at number 3). Out of this gargantuan allocation, almost $200 million was paid to Google alone for paid search marketing – making Amazon one of the search giant’s largest individual customers by total ad spend. This relationship, however, is becoming rocky at best.


The Seattle giant has been steadily evolving its own brand around product search. In 2003, it created a dedicated product search R&D division named A9 that focused (and still does) specifically on developing and maintaining technologies designed to optimize consumer inquiries around finding the products that are right for them. To date, this group has contributed to a massive shift in consumer behavior around just how products are found and discovered online. Today, a whopping 55% of consumers say they start their online product search on Amazon directly – not on Google, where market dynamics let any digital retailer bid on the traffic.


Not only is Amazon gradually pulling paid search power away from Google through changing customer behavior, it has also backed up its efforts to monetize this traffic in ways beyond direct purchase activity through its own online paid search business. Amazon launched ClickRiver in 2006 as a division focused on monetizing internal product search traffic with the same paid search dollars that digital retailers currently allocate to Google. To date, that business has grown quite steadily – pulling in over $750 million in 2013 and likely surpassing $1 billion soon. To say that Amazon is tactically positioned to upset not just Google’s business but the entire arbitrage system the rest of the world of digital retailers relies on is an understatement.

Who Will Win?

Unfortunately, it’s still too early to tell how this opportunity will play out, because voice search is still in its infancy. Neither Amazon nor Google has released any stats on the breakdown of the types of queries coming through their respective voice platforms – and realistically speaking, product teams are likely still learning how to classify different types of queries at all, as consumers learn how best to leverage the technology in their homes today. In the meantime, however, several core questions will have to be answered:

  • How will people use voice search differently from mobile or desktop?

  • How will targeting work?

  • What will the ad unit be?

Should I Be Concerned?

As a digital commerce leader, it is your job to plan for the unexpected and, in turn, create and execute strategies that will maximize your organization’s market influence. Given that Amazon has not even publicly announced the details of its paid search platform, it is too early to plan any specific changes to your own paid search spend; the companies that leverage it first are likely going to use experimental ad dollars. If your organization can afford to do this and you have the product management resources to manage the experimentation, it might be a good idea; however, the vast majority of digital retailers are likely better off letting early adopters pay for the hard lessons and product evolution that will make this channel viable – or not.


In the meantime, we can all look for core stats on the viability of the voice search business altogether – such as the total number of daily users or queries. Equally important is the percentage of total queries going to either Amazon or Google, because who wins that battle will dictate whether other digital retailers can even participate in this new ecosystem.
