The ultimate guide to Big Data for travel brands in 2020
What is Big Data for travel?
We are witnessing a shift that will completely transform travel. The term given to this shift is Big Data and it will change everything, from the way we travel to the way we interact with travel suppliers to our experience at airports.
No matter what part of travel you are involved with and no matter what job you work in, Big Data will transform it. However, there is a problem with Big Data. The problem is not Big Data itself, rather the hype.
The hype around Big Data and the term itself may eventually disappear but the phenomenon is here to stay – and it’s just starting. Tim Harford summed this up in a recent Financial Times article by saying “Big Data has arrived, but big insights have not.”
What we call Big Data today will become a ‘new normal’ and all businesses and government will have large volumes of data to improve what they do and how they do it. And most travel businesses create a large volume of data and have access to a lot of customer information, but they don’t really know how to leverage it to make good strategic decisions. Without this foundation, adding big data into the mix often adds little value.
More importantly, big data lacks actionable information with which travel businesses can make effective decisions that benefit their customers and their bottom line. Big data can reveal much about what’s going on, when it happens and where it happens. But travel businesses haven’t really arrived at the day when big data can reliably tell us why customers behave in a certain way.
But we are getting ahead of ourselves. Let’s start out with the basics: what exactly does Big Data mean?
What is Big Data?
The term Big Data speaks to the fact that we can now collect and analyse data in ways and in volumes that were simply impossible only a decade ago. The field has grown rapidly since then with many businesses, especially in the Internet sector, creating completely new business models and hugely profitable enterprises based on the principles of Big Data. Analyst firms now also actively track the market and there are formal definitions that separate Big Data solutions from traditional databases and applications.
The most popular of these was provided by Gartner in 2012, which defines Big Data to be “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing”.
These “3 V’s” of Big Data are very useful to help identify what a Big Data system looks like, but they don’t reveal the pace of change and progress we are experiencing. There was a time only 10 years ago, when a Terabyte was seen as the biggest achievable unit of storage. Today, we can all buy a laptop or PC for a few thousand dollars with a Terabyte of storage, or better still purchase a Terabyte of storage from a Cloud provider like Dropbox for $99 a year. So Big is not what it used to be…!
Now, you can store an awful lot of data in a Terabyte of storage capacity and if you can scale that up to 480 Terabytes, you could store a digital catalogue of all the world’s books in all languages. That’s almost half a Petabyte. Scale that up to 5000 Petabytes, or 5 Exabytes, and you can store all words ever spoken by every person that ever lived. And we’re still a long way from a Zettabyte, which would require a staggering 250 billion DVDs to store.
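The unit arithmetic in this paragraph can be checked with a few lines of Python. This is only a sketch: it assumes decimal (SI) units, where each step up is a factor of 1,000, and the variable names are ours. The 480 Terabyte, 5 Exabyte and 250 billion DVD figures are the ones quoted above.

```python
# Decimal (SI) storage units expressed in bytes (an assumption; binary
# units such as tebibytes would give slightly different numbers).
TB = 10**12
PB = 10**15
EB = 10**18
ZB = 10**21

books_catalogue = 480 * TB        # all the world's books, as quoted in the text
all_spoken_words = 5000 * PB      # all words ever spoken, as quoted in the text
dvds_per_zettabyte = 250 * 10**9  # DVD count quoted in the text

print(books_catalogue / PB)       # size of the book catalogue in Petabytes
print(all_spoken_words // EB)     # size of all spoken words in Exabytes

# Capacity per disc implied by the quoted DVD count, in Gigabytes.
implied_dvd_gb = ZB // dvds_per_zettabyte // 10**9
print(implied_dvd_gb)
```

On these assumptions the quoted DVD count implies roughly 4 GB per disc, a round-number approximation of the 4.7 GB nominal capacity of a single-layer DVD.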
So if we have 4.4 Zettabytes of data globally today, where is it all coming from?
Connections: Generating more data on everything
Up to 2010 the primary source of data growth in the world of computing was PCs. Today, that is different. Everything we do in our increasingly digitised world leaves a “data trail”. This means the amount of data available globally is growing dramatically and rapidly. So much so, that we have created more data in the past couple of years than in the entire previous history of mankind.
Most of the data is coming from the billions of connections we are creating with each other since the advent of the World Wide Web. This includes the messages and emails we send each other every second via Email, WhatsApp, Facebook Messenger, WeChat, Twitter etc. but also from the one trillion digital photos and videos we take and send to each other each year. What we search for on search engines like Google and Bing is also very important, as is what we actually purchase on e-commerce web-sites like Amazon. All of these things together represent a sort of collective consciousness that captures our preferences, desires and behaviours, both as individuals and as groups or populations of people.
Looking forward, this is only going to grow exponentially, as the Internet of Things (IoT) really takes off. Think of all the data from the sensors we are now surrounded by already. The latest smartphones have sensors to tell where we are (GPS), how fast we are moving (accelerometer), what the weather is like around us (barometer), what part of the screen we are touching (touch sensor) and much more. We now have smart TVs, smart watches, smart meters, smart kettles, cars and even smart light bulbs. Of course, we also have interconnected laptops, mobile phones, tablets, Fitbits, semi-autonomous (parking) and networked cars (Bluetooth), home appliances and energy management systems. A more recent development is ubiquitous imaging at scale, e.g. from large numbers of low-earth orbiting satellites operated by new entrants like Terra Bella and Planet Labs, which challenges even the most scalable data storage infrastructure.
Cloud: Big infrastructure for Big Data
With all of these connections generating such large volumes of data, Internet companies like Yahoo, Amazon and Google needed to find more scalable, low cost and “elastic” solutions to store this data. The solution was to purchase large quantities of commodity servers and storage hardware and hook them together into a single cluster in a data centre. To make this work, software such as Hadoop, an open source implementation of the MapReduce programming model, was written and released through the Apache Software Foundation. This trend continued, with many more contributions to this open source initiative, including Spark which brought this work into the realm of real-time processing and machine learning.
All of the large tech companies saw this trend and started to build administration software around this cluster, while also offering access to their clusters as a Cloud service. This essentially moved these clusters into the realm of enterprise computing and it enabled not just Big Data, but other important shifts in the IT sector such as Software as a Service. The key to the success of the Cloud is its scalability, but also its elasticity (the ability to ‘spool up’ computer resources quickly, with no fixed costs). This Cloud capability makes it very attractive for every business that does not need to treat running the hardware behind its software as a core competence. And this includes all travel businesses.
Three Types of Analytics
Big Science for Big Data
Professor Gary King from Harvard famously said that “Big Data is not about the Data”. In other words, the data itself has no immediate value. It is like crude oil that needs to be refined before it can be put to good use. This refinement of “crude” data is achieved with analytics.
This is the building block for the growth of Big Data: we are seeing amazing advancements in the way we can analyse the data. We now have algorithms that can understand spoken words, translate them into written text and analyse this text for content, meaning and sentiment. Algorithms can now look at photos, identify who is in them and then search the Internet for other pictures of that person. More advanced algorithms emerge every day to help us understand the world of travel and predict future trends. This type of algorithm is known as “cognitive” as it emulates many of the learning algorithms the natural world uses to build biological neural networks to solve pattern recognition.
Open Source and Big Data
One of the interesting features of analytics in the last few years is that instead of secretly guarding the incredibly sophisticated algorithms and software used to implement these machine learning algorithms, organisations and individuals creating the algorithms are generally making them available as open source for everybody to use. Google is an active participant in this trend and recently made its TensorFlow library for Deep Learning available as open source. There are some good commercial reasons why they do this, but regardless, it does have the effect of accelerating progress in the area quite dramatically.
Putting it all together
The current scenario we find ourselves in is a fascinating one. The connections that the Internet enabled are creating vast quantities of high-quality data. The Cloud is providing all of the computing resources we will ever need. And cognitive algorithms can analyse data in ways that we thought would never be possible by a machine.
The convergence of three important, inter-dependent technology enablers: connections, cloud and cognitive, has the potential to usher in a new industrial revolution. What steam power did to physical labour 150 years ago, this convergence will do to cognitive labour … and our working lives and societies will change forever.
But what does all this mean for travel?
The vast majority of successful commercial applications of big data today are in the field of e-commerce and online advertising. Amazon and Google are probably the best examples of this in practice, because a very large proportion of the data collected by the apps and devices we use today reflect our preferences and life-style choices as human beings … and a very good way to monetise this data is through e-commerce and digital advertising.
Three Types of Analytics
Not all Big Data analysis involves predictive algorithms and cognitive computing. In fact, most computing analysis doesn’t. There are essentially 3 primary types of analytics applied to big data today:
- Descriptive Analytics: This is the most widespread today and typically used to segment customers. It is based on past behaviours and usually only simple metrics such as averages, totals etc. are used to provide insights.
- Predictive Analytics: Fewer organisations use this approach, mainly because it is more difficult, or because large histories of good quality data are not available. With predictive analytics, the goal is usually to target individual customers based on predictions of their future behaviour.
- Prescriptive Analytics: Only a small number of very sophisticated organisations have deployed these solutions. In this scenario, very large quantities of high-quality data are required, along with powerful computing resources and cognitive / machine learning expertise. The power of prescriptive analytics is that the ideal action for each customer can be automatically taken. In this way, prescriptive analytics can either replace or augment a task that previously required a human user to manage. A great use case here is the AI assistant: imagine an assistant like Alexa or Siri pro-actively engaging with a customer to manage a lost baggage situation, using predictions of customer behaviour to determine the next best action.
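To make the first of these categories concrete, here is a minimal sketch of descriptive analytics in Python: totals and averages of past spend per customer segment. The booking records, the segment names and the `summarise_by_segment` function are invented for illustration and do not come from any real system.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical past bookings; in practice these would come from a data warehouse.
bookings = [
    {"customer": "A", "segment": "family", "spend": 1200.0},
    {"customer": "B", "segment": "family", "spend": 800.0},
    {"customer": "C", "segment": "business", "spend": 300.0},
    {"customer": "D", "segment": "business", "spend": 500.0},
    {"customer": "E", "segment": "business", "spend": 400.0},
]

def summarise_by_segment(rows):
    """Return booking count, total spend and average spend per segment."""
    spend_by_segment = defaultdict(list)
    for row in rows:
        spend_by_segment[row["segment"]].append(row["spend"])
    return {
        segment: {
            "bookings": len(spends),
            "total": sum(spends),
            "average": mean(spends),
        }
        for segment, spends in spend_by_segment.items()
    }

summary = summarise_by_segment(bookings)
for segment, stats in summary.items():
    print(segment, stats)
```

Note that nothing here predicts anything: the output only describes what has already happened, which is exactly the limitation that predictive and prescriptive analytics aim to address.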
Descriptive Analytics enjoys widespread use today and is well understood. What is more interesting is the recent trend we are seeing to use Predictive Analytics to measure the propensity consumers have to buy a basket of travel products.
This propensity model is the key to unlocking the value of Big Data in travel. Propensity predictions can support very sophisticated digital marketing campaigns that target a “segment of one” with highly customised offers and super-personalised travel experiences, to optimise conversion rates, online media spend and revenue per visit to your e-commerce site.
Unlocking the value of Big Data in Travel
In the first part of this guide, we defined what Big Data means for travel retailers. In the second part, we discussed the three types of Analytics:
- Descriptive Analytics: Analytics based on past behaviours and usually only simple metrics such as averages, totals etc. are used to provide insights.
- Predictive Analytics: Analytics with the primary goal of targeting individual customers based on predictions of their future behaviour.
- Prescriptive Analytics: Analytics using large quantities of data to identify the ideal action for each customer, so that it can be automatically taken.
As we wrap up, we will look at what predictive propensity models look like and how you can use them.
Propensity modelling is the key to unlocking the value of Big Data in travel.
This uses sophisticated machine learning algorithms to predict what a customer is likely to do next by exploiting patterns in human behaviour. Propensity predictions support very sophisticated digital marketing campaigns that target a “segment of one” with highly customised offers and super-personalised travel experiences, to optimise conversion rates, online media spend and revenue per visit to your e-commerce site.
Example: Travel Insurance
A traveller could have a high propensity to purchase travel insurance if they are naturally risk averse and purchased insurance in the past. In many cases, a customer may not even have consciously registered that they have a propensity to purchase something, but the machine learning algorithm can predict it.
Propensity is not a new idea, but it is only with the availability of large amounts of high quality data, cloud infrastructure and machine learning that the models are starting to generate predictive signals with reliable accuracy. In particular, some of the more sophisticated retail banks and insurance companies have used this approach with great effect over the past few years. A combination of demographics and personal financial circumstances can play important roles in determining the propensity to purchase a personal loan, mortgage, investment or insurance product.
These same principles apply to travel: propensity in travel is revealed by a combination of a consumer’s demographics, their personality, their transactional history and attitudes. The best marketing and product teams in travel retailing have realised that they must embed these insights and combine these with a Big Data perspective to compete effectively in the digital world.
What do predictive propensity models look like?
Let’s start with the data sources. There are 4 main sources or dimensions to this type of propensity model, all of which relate to “offline” activity – in contrast to the online activity of a consumer that can be tracked by Google Analytics or similar platforms with tag management functionality. These dimensions are:
- Demographic information that tells us “who” this person is based on gender, age etc.;
- Transactional information that tells us “what” a person has purchased in the past as well as their estimated purchase capacity;
- Psychographic information that tells us something about “why” a person purchases things in terms of attitude and opinions. Facebook “likes” are a good example of this;
- Personality information that also tells us something about “why” a person purchases things in terms of their personality biases. This data is usually collected via surveys and can be difficult to acquire.
The data that we use across this universe varies depending on what propensity we want to predict. Generally, transactional data serves as the foundation to everything else in a propensity model. Transactional data is “hard” evidential data that tells us what buying patterns and preferences a consumer has had over time so serves to validate everything else.
Another way of looking at this, or another ‘slice’ through this data, is 1st party data versus 3rd party data:
- 1st Party Data: Information generated from your e-commerce website, social platform and mobile web or apps about your customers. It typically consists of your customer’s personal information (name, email, addresses, phone number), demographic information (gender, age) and limited behavioural data (site interaction, purchase history, interests). It is typically stored in a CRM or web analytics system and, as the owner of this data, you have all of it for free, but it can be hard to combine it all.
- 3rd Party Data: Information generated from internet interactions and other websites. This data is used to give you deeper insight about your audience, such as individual demographics (income level, marital status) and household attributes (number of children). It can be used to build consumer segments for more targeted advertising. It’s collected and licensed by third-party providers that have no direct relationship with your customers, and you have to purchase this data to access it.
Until you combine all of this data, you cannot truly claim a customer 360 view in travel.
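As a rough illustration of what combining these sources involves, the Python sketch below joins 1st party and 3rd party records on a shared customer identifier. The identifiers, field names and the `build_customer_view` function are hypothetical, and the sketch assumes the hard part (matching the same person across sources) has already been done.

```python
# Hypothetical 1st party records, e.g. exported from a CRM.
first_party = {
    "c001": {"email": "john@example.com", "age": 42, "past_bookings": 7},
    "c002": {"email": "mary@example.com", "age": 35, "past_bookings": 1},
}

# Hypothetical 3rd party records licensed from a data provider.
third_party = {
    "c001": {"income_band": "high", "children": 2, "age": 41},
}

def build_customer_view(customer_id, first_party, third_party):
    """Merge both sources into one record; 1st party values win on conflict."""
    view = {"customer_id": customer_id}
    view.update(first_party.get(customer_id, {}))
    for key, value in third_party.get(customer_id, {}).items():
        view.setdefault(key, value)  # only fill fields not already present
    return view

john = build_customer_view("c001", first_party, third_party)
mary = build_customer_view("c002", first_party, third_party)
print(john)
print(mary)
```

Letting 1st party values take precedence is a design choice made for this sketch, on the reasoning that data you collected directly is more likely to be current than licensed data.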
Key Sources of Data for a Propensity Model.
Demographic, psychographic, transaction and personality data are the sources used to ‘train’ a propensity model for travel. This training process is an iterative process that uses machine learning to “parameterise” a model. These models vary depending on the exact type of propensity model being built, but typically they use logistic regression or k-means clustering for the simpler models, all the way up to support vector machines and neural networks for the more sophisticated models. Yes, advanced Big Data in travel uses some very high-end statistical models!
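As an illustration of the simpler end of that range, the sketch below trains a logistic regression propensity model for the travel insurance example, using plain Python and batch gradient descent. Everything in it is an assumption made for illustration: the two features, the eight toy training rows, the learning rate and the function names. A real model would use a tested machine learning library and far more data.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def propensity(x, weights, bias):
    """Predicted probability of purchase for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_logistic(features, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = propensity(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

# Toy features: [bought insurance before, travels with children] (both 0 or 1).
X = [[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [1, 0], [0, 0]]
# Toy labels: 1 if the customer bought travel insurance on the next trip.
y = [1, 1, 1, 0, 0, 0, 1, 0]

weights, bias = train_logistic(X, y)
print(propensity([1, 1], weights, bias))  # past buyer travelling with children
print(propensity([0, 0], weights, bias))  # no purchase history, no children
```

The “parameterising” mentioned above is what the training loop does: it repeatedly nudges the weights and bias until the predicted probabilities line up with observed purchases.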
So what do we do with the propensity models?
There are in fact many ways that propensity models can be deployed and monetised by an e-commerce business. Any data that tells you which segments, which individuals even, are most likely to purchase your products creates a scenario where highly targeted, precision marketing can become a reality. And with this, conversion rates and revenue will improve and marketing costs will fall.
One of the more interesting areas we are working on at OpenJaw is online advertising optimisation. Here, propensity model predictions are supplied to online advertising platforms e.g. Facebook Dynamic Ads, and with this the right content can be delivered to the right person at the right time. In this scenario, we are filling gaps in Facebook’s data arsenal to optimise this advertising channel for the travel retailer. As social channels become more important for e-commerce, sales and service in travel, and as chatbots move in to converse with consumers, it is easy to see how knowing more about your customers and their buying preferences will become critical to any successful travel retailer.
Deconstructing Big Data for Travel
Big Data is THE shift that will completely transform travel. It will change everything, from the way we travel to the way we interact with travel suppliers to our experience at airports. No matter what part of travel you are involved with and no matter what job you work in, Big Data will transform it. Big Data is here to stay – and it’s just starting.
Most travel businesses create a large volume of data and have access to a lot of customer information, but they don’t really know how to leverage it to make good strategic decisions. Without this foundation, adding big data into the mix often adds little value because it lacks the actionable information needed to make effective decisions.
We are seeing amazing advancements in the way we can analyse data: there are now algorithms that can understand and translate spoken words, and analyse them for content, meaning and sentiment. There are algorithms that can look at photos, identify who is in them and then search the Internet for other pictures of that person. There are even advanced cognitive algorithms that emulate the learning algorithms of the natural world.
Today, travel businesses haven’t really arrived at the day when Big Data can reliably tell them why customers behave in a certain way. But as travellers have grown more tech savvy, their expectations of travel retail have increased. Customer expectations are soaring, and travel retailers who don’t meet the new shopping experience standard are being forced to play catch-up. At OpenJaw, we believe that part of the ‘Big Data’ journey is to know and understand propensity modelling, as this is the key to unlocking the value of Big Data in travel.
With customers changing quickly and expecting retailers to know their needs and habits and provide them with personalised offers and experiences, the question for travel retailers isn’t whether they need to change — it’s ‘Where to start?’ While the concept of a Big Data programme of work may feel overwhelming at first, using the above insights can make it far more manageable and understandable for everybody.
This is the final part of our “Big Data Guide for Travel Brands”, so we are focusing on an important emerging area – indeed, the ‘final frontier’ for any travel retailer on a Big Data journey.
The field of prescriptive analytics takes things one step further than predictive analytics – using a combination of the customer insights provided by both descriptive analytics and predictive analytics, prescriptive analytics aims to identify the next best action for each customer to convert these insights into concrete, profitable business outcomes.
You might be thinking: ‘but we do this already’. Sure, aspects of next best action could be implemented in a CRM system or even an e-commerce platform. However, the reality is that very few organisations in any industry do next best action at scale and in a coherent, structured way. Most prescriptive analytics initiatives are either tactical in nature and focused on a single part of the customer journey, or only leverage simple business rules. These initiatives ignore the incredible insights that can be gained through more sophisticated methods like propensity modelling.
Next best action is not a new concept, nor is it only applicable to retailing or marketing. In fact, its origins lie in the military, where it is used as a framework to describe thinking quickly with distributed, local decision making versus planned campaigns and objectives (see Observe, Orient, Decide and Act (OODA) Loop). In the world of business, next best action is essentially a customer centric paradigm that identifies the next best action that should be taken for a particular customer at a specific point in time. This could be a marketing action such as an offer or promotion, or a service action to address a complaint.
Ultimately, the goal of next best action in retailing is to balance a retail customer’s product requirements and preferences against the brand’s business objectives to achieve an optimal result for both parties. Next best action is different to next best offer, an approach that prevails today. The origins of next best offer are product centric in nature and quite narrow – focused on simple up-sells or cross-sells.
In contrast, next best action takes a broader customer centric perspective to determine the appropriateness of any selling action. For example:
- Is there a complaint or servicing issue to address first?
- Is the customer loyal and open to up-sells, or at risk of churn, so in need of retention incentives?
- What propensity scores do we have on the customer that identify optimal products to up-sell or cross-sell?
- What have they searched for recently on the web-site?
- What do their online social media influence scores look like?
All of these factors should be considered before any offer or communication with the customer occurs.
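One simple way to picture this is as a prioritised set of checks, sketched below in Python. This is a rule-based illustration only, not a description of any OpenJaw product: the `CustomerState` fields, the thresholds and the action names are all hypothetical, and the social media influence score is left out for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class CustomerState:
    """Hypothetical snapshot of what we know about one customer right now."""
    has_open_complaint: bool = False
    churn_risk: float = 0.0                                 # 0.0 (loyal) to 1.0 (leaving)
    propensity_scores: dict = field(default_factory=dict)   # product -> score
    recent_searches: list = field(default_factory=list)     # newest search last

def next_best_action(state, churn_threshold=0.6, propensity_threshold=0.5):
    """Pick one action, checking service issues before any selling action."""
    if state.has_open_complaint:
        return "resolve_complaint"
    if state.churn_risk >= churn_threshold:
        return "retention_incentive"
    if state.propensity_scores:
        product, score = max(state.propensity_scores.items(), key=lambda kv: kv[1])
        if score >= propensity_threshold:
            return f"offer:{product}"
    if state.recent_searches:
        return f"follow_up:{state.recent_searches[-1]}"
    return "no_action"

unhappy = CustomerState(has_open_complaint=True, propensity_scores={"insurance": 0.9})
keen = CustomerState(propensity_scores={"insurance": 0.4, "seat_upgrade": 0.8})
print(next_best_action(unhappy))
print(next_best_action(keen))
```

The ordering carries the point of the section: a customer with an open complaint gets a service action even when their propensity to buy is high, which a purely product-centric next best offer would miss.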
In practice, it is very difficult to implement next best action strategies as described above. A sophisticated technology infrastructure is required to implement it. So, let’s look at a methodology and Big Data tool-set to deliver this functionality. Using the Customer Journey map as a framework, we can illustrate how a library of next best actions can be created to implement the vision of prescriptive analytics for travel retailing.
A customer journey map for travel retailing
Every travel retailer will be familiar with the concept of the customer journey. Many will have developed versions of a customer journey map that identifies the steps a customer will typically follow from the very start to the very end of a trip.
At OpenJaw, we have developed a unique customer journey map that takes into account the myriad of different ways the modern traveller thinks and acts through the full customer journey. This map follows each step of a typical customer journey – all the way from ‘Inspiration & Dreaming’, through to ‘Booking’ and finally the return ‘Home’.
So what is the significance of this customer journey map for prescriptive analytics? The answer to this lies in the fact that there is always a next best action for each customer as they move through their journey. And this next best action has to balance the requirements and preferences of the customer with the retailer’s commercial objectives. The key to identifying this next best action in the world of travel retailing is to match supply (inventory) with demand (customer desire or intent).
Matching supply with demand in travel retailing
Let’s start with the demand side. The previous articles from our Big Data paper series provide a good overview of descriptive and predictive analytics in travel retailing. If we want to measure demand in a precise way i.e. at the level of individual customers that we want to sell travel products to, then the best place to start is with the customer profile information delivered by descriptive analytics, combined with the propensity to purchase information delivered by predictive analytics.
This information measures demand as it captures both the natural long-term propensity a customer has to purchase a particular travel product, as well as their short-term ‘intent’, as captured by their online searches of flight, hotel or package combinations.
The supply side is the other side of this; here, a finite inventory of hotel rooms, airplane seats, rental cars etc. is actively managed with an offer management process that uses price as the primary lever to maximise revenue. Although this established approach is widespread, it is not complete as a retailing strategy. This is because supply (inventory) is not actively or systematically matched to demand (customer desire or intent).
The principle of this approach is that revenue will be easier to maximise if the additional lever of customer demand is included as part of the offer management process. Conversion rates will increase as the offer management process is enhanced to deliver tailored offers to the right people, with the right message, at the right time.
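A minimal sketch of this idea in Python: for one customer, weight each product's margin by that customer's propensity to buy it, and skip anything that is out of stock. The product names, margins, scores and the `best_offer` function are hypothetical, and using propensity multiplied by margin as the ranking rule is our simplification rather than a prescribed method.

```python
def best_offer(propensities, inventory):
    """Return the in-stock product with the highest expected margin, or None."""
    candidates = []
    for product, item in inventory.items():
        if item["units_left"] <= 0:
            continue  # supply side: nothing left to sell
        expected_margin = propensities.get(product, 0.0) * item["margin"]
        candidates.append((expected_margin, product))
    if not candidates:
        return None
    return max(candidates)[1]

# Hypothetical supply: remaining units and margin per sale.
inventory = {
    "city_hotel": {"units_left": 3, "margin": 40.0},
    "family_resort": {"units_left": 0, "margin": 120.0},
    "car_hire": {"units_left": 10, "margin": 15.0},
}

# Hypothetical demand: one customer's propensity to purchase each product.
propensities = {"city_hotel": 0.3, "family_resort": 0.8, "car_hire": 0.6}

print(best_offer(propensities, inventory))
```

Here the customer's strongest propensity is for the family resort, but it is sold out, so the offer falls to the city hotel, whose expected margin is higher than that of car hire.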
A framework for next best action in travel retailing
So, in practical terms, how should a travel retailer attempt to match supply with demand as described above? What process or framework can be used?
Let’s take a look at a simple framework that uses the customer insights from descriptive and predictive analytics to determine the next best action at each step of the customer journey. This is best explained with a worked example.
Let’s assume we are a travel retailer and one of our customers is ‘John Smith’ who lives in Dublin, Ireland. John Smith has a history of transactions with us over the past 5 years. This history tells us many things, including that John regularly travels with his wife and 2 kids to European cities and holiday destinations. This transactional information enables the data science algorithms used by the OpenJaw Big Data platform to generate descriptive analytics e.g. using clustering and RFM (recency, frequency, monetary) methodologies, John is classified as a loyal customer.
The OpenJaw Big Data platform also supports predictive analytics. Using a series of propensity models, it identifies that John has a propensity to purchase rooms in 5 star family-friendly hotels, as well as travel brands similar to Disney and European holiday destinations.
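The RFM step in this example can be sketched in a few lines of Python. The transactions, the reference date, the thresholds and the segment labels below are all invented for illustration, and the sketch makes no claim about how the OpenJaw platform actually computes its classifications.

```python
from datetime import date

def rfm(transactions, today):
    """Return (days since last purchase, number of purchases, total spend)."""
    last_purchase = max(t["date"] for t in transactions)
    recency_days = (today - last_purchase).days
    frequency = len(transactions)
    monetary = sum(t["amount"] for t in transactions)
    return recency_days, frequency, monetary

def classify(recency_days, frequency, monetary):
    """Assign a segment label using hypothetical thresholds."""
    if recency_days <= 365 and frequency >= 5:
        return "loyal"
    if recency_days > 730:
        return "lapsed"
    return "occasional"

# Hypothetical five-year booking history for John Smith.
john_transactions = [
    {"date": date(2015, 7, 10), "amount": 1800},
    {"date": date(2016, 8, 2), "amount": 2200},
    {"date": date(2017, 3, 15), "amount": 950},
    {"date": date(2018, 7, 20), "amount": 2400},
    {"date": date(2019, 4, 5), "amount": 1300},
    {"date": date(2019, 11, 1), "amount": 2050},
]

john_rfm = rfm(john_transactions, today=date(2020, 1, 1))
print(john_rfm)
print(classify(*john_rfm))
```

In this sketch the monetary value is computed but not used in the label; a fuller version might score each of the three dimensions separately and combine them, or feed them into a clustering algorithm.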
With this customer insight, a meaningful set of next best actions can be readily identified for each step of the customer journey, as depicted in the diagram above. For example:
- Deliver personalised content for a Facebook Dynamic Ad during the inspiration & dreaming phase;
- Deliver a follow-up email with more detailed information to support research and planning;
- Execute a personalised offer once shopping commences;
- Execute an appropriate up-sell once booking commences.
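A library of next best actions keyed by journey stage could be represented as simply as the Python sketch below. The stage names, action names, profile fields and the `action_for_stage` function are hypothetical labels chosen to mirror the list above.

```python
# Hypothetical library: one next best action per customer journey stage.
NEXT_BEST_ACTIONS = {
    "inspiration": "personalised_dynamic_ad",
    "research": "follow_up_email",
    "shopping": "personalised_offer",
    "booking": "up_sell",
}

def action_for_stage(stage, profile):
    """Look up the action for a stage and attach the customer's insights."""
    action = NEXT_BEST_ACTIONS.get(stage)
    if action is None:
        return None  # no action defined for this stage
    return {
        "action": action,
        "customer": profile["name"],
        "themes": profile["propensities"],
    }

# Hypothetical profile built from descriptive and predictive analytics.
john = {
    "name": "John Smith",
    "segment": "loyal",
    "propensities": ["5 star family-friendly hotels", "European holiday destinations"],
}

print(action_for_stage("inspiration", john))
print(action_for_stage("booking", john))
```

The lookup itself is trivial; the value comes from the profile attached to each action, which is what lets the ad, email or offer be personalised.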
What does a technology solution for this use case look like?
A technology solution to implement this framework can take many forms. However, if we focus on the first four steps of the customer journey map, then we can map out a core of the technical solution (in simplified form):
Here, we have consumers accessing a travel retailing web-site to purchase a travel product or combination of travel products. Behind the scenes, an e-commerce platform is working to process inventory information from suppliers, implementing business rules for pricing and offer management.
If no Big Data capability exists for the travel retailer, then offers are generated on the basis of supply of inventory only i.e. in a product-centric way as described above, using price as the primary lever to maximise revenue.
However, if customer insights are available from a Big Data platform as depicted in the diagram, then the offer management process can be enhanced in a customer-centric way to market inventory to customers that are most likely to purchase it, as well as personalising the offer to maximise the probability of conversion.
Naturally, there are many other uses for the customer insight data that is generated by the Big Data platform. In particular, there will be internal users in marketing, revenue management and product management teams that can use this information to enhance their marketing campaigns that sit outside of the e-commerce platform, or to provide insights for product strategy.
Note too that the principles and methods of prescriptive analytics described here can be implemented in an additive way, i.e. built on top of the existing revenue management and pricing algorithms used by travel retailers today, without significant reengineering of these processes or tools.
More information on the technical underpinnings of this diagram and the products available from OpenJaw to implement it (namely t-Data and t-Retail) can be found at www.openjawtech.com or by contacting the author.
Our goal here was to highlight how prescriptive analytics is in many respects the ‘final frontier’ for any travel retailer on a Big Data journey. It is undoubtedly a challenging area that requires a sophisticated technology infrastructure to implement, but given its potential to complete a travel retailing strategy by matching the supply of inventory with ongoing customer demand in a systematic way, it is an important field that must be taken seriously.
A second goal was to tie the pieces together from each of our four Big Data articles to illustrate how there is a natural roadmap for every travel retailer on a Big Data journey. This roadmap can be broken into 4 parts:
- It starts with building a scalable Big Data warehouse that can supply high quality integrated data for analytics;
- It then evolves to implementing descriptive analytics to make that data accessible and useful across the organisation;
- Once this foundational work is complete the challenging work of building a predictive analytics capability can commence;
- Finally, the complex domain of prescriptive analytics can be tackled.
Once the ‘final frontier’ of prescriptive analytics is reached and the principles of next best action are implemented in an e-commerce platform, the potential to truly transform and optimise a travel retailing business is reached as supply is systematically and continuously matched to customer demand.