Big Data in Travel

    Dr. John Carney
    Chief Data Scientist, OpenJaw

Dr. John Carney, Chief Data Scientist, OpenJaw Technologies at the OpenJaw Travel Summit 2017.

We are witnessing a shift that will completely transform travel. The term given to this shift is Big Data and it will change everything, from the way we travel to the way we interact with travel suppliers to our experience at airports. No matter what part of travel you are involved with and no matter what job you work in, Big Data will transform it. Dr. Carney gets beyond the hype around Big Data and explains how it will become the ‘new normal’ for travel.


The hype around Big Data and the term itself may eventually disappear but the phenomenon is here to stay – and it’s just starting. What we call Big Data today will become a ‘new normal’ and all businesses and governments will have large volumes of data to improve what they do and how they do it.

To start, let’s just look at what Big Data means.

The term Big Data speaks to the fact that we can now collect and analyse data in ways and in volumes that were simply impossible only a decade ago. The field has grown rapidly since then, with many businesses, especially in the Internet sector, creating completely new business models and hugely profitable enterprises based on the principles of Big Data.

Analyst firms now also actively track the market and there are formal definitions that separate Big Data solutions from traditional databases and applications. The most popular of these was provided by Gartner in 2012, which defines Big Data to be “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing”. A big data system is really only a big data system if it can process huge volumes of data, relative to other systems, such as traditional applications, that exist at the same time. The data in the system must have a lot of variety. It can’t just be one thing, you know, a list of transactions. It must have variety. And the data must move through the system with velocity, at speed, close to real time.

So why now? What’s so important about now? What’s so special about now? Why is everybody in the tech industry talking about big data, the cloud, and AI, and all of those technologies? Well, what we see today is a convergence. With the internet, we now have billions of connections and every single connection generates a huge amount of data. We have photos posted to Facebook, we have messages on WeChat, we have sensor data in smart homes. The list goes on and on.

The cloud enables us to store and process this data in a very low-cost, scalable, and elastic way. Cognitive computing enables us to leverage machine learning technologies to model this data, to either automate some cognitive process, or even surpass it in some cases.

I really believe that this will usher in a new industrial revolution. When you think about it, what steam power did to physical labor, these technologies, the 3 Cs of connections, the cloud, and cognitive computing, will do to cognitive labor. Let me illustrate that point with an example. Baidu’s announcement in late 2016 really surprised everybody in the tech industry. People had forgotten the pace at which technology can develop in this exponential way.

Baidu’s Deep Speech 2 algorithm can now transcribe spoken Mandarin to written Mandarin better than humans can. No human can beat it anymore. So I really think this is actually quite a critical milestone in our evolution as human beings. We can’t beat the machines anymore. We just can’t do it. I think this is a pattern we will see repeat itself in many areas, including travel retailing.

OpenJaw is participating in this revolution. We are building T-Data, which will power what we call the Digital Optimization and Transformation of your business. But how will we do this? What’s our secret sauce? What’s the magic in T-Data? The magic is a thing called propensity, or propensity modeling. So using machine learning algorithms, we will generate propensity scores for every single traveler on the t-Retail platform. These scores measure the likelihood a traveler will purchase a basket of travel products.

But let’s just talk about that word a bit. What does propensity actually mean? So let’s think a bit about today. After this keynote, I’ll have a nice lunch and I will drink a glass of wine with that nice lunch, just to help me unwind after my keynote. I always do that. Tomorrow, you will travel to the airport and if you have kids, I bet you will buy some gifts for your kids in the airport. You always do that. You have a natural propensity to do these things. So, essentially, propensity models exploit natural patterns in human behavior for economic advantage.

Propensity is not a new idea, but really only with the convergence of connections, the cloud, and machine learning has it recently become successful in business. My background is in financial services, where propensity is used with great success. So much so that 20% of sales in many retail banks today can be directly attributed to propensity models. This is a trend we’ve seen over the last three years or so in banking, and in retail banking in particular.

So how do we solve it? How does propensity modeling actually work? Well, it starts with the data, obviously, and we need lots of data. We need demographic data to tell us who the customers are that we’re targeting. We need psychographic data to tell us why the customer buys things, and this is usually based on lifestyle preferences, for example. There are proxies for this on the internet, things like Facebook likes, activity on Pinterest, and so on.

We need transactional data, and this is really important because it tells us what the customer has bought in the past. It’s also very useful to know something about your customers’ and your prospects’ personalities. This tells us something about why they buy things, based on particular character traits. You know, they’re extroverts, they’re neurotic, they’re outgoing, and so on. This is very useful for tailoring a marketing message. So all of this data is pushed through a propensity model which uses machine learning.
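The idea of pushing these four kinds of data through a model can be sketched in a few lines. This is only an illustration, not how T-Data is built: it assumes a plain logistic regression trained by gradient descent, and the feature names, the toy rows and the labels are all invented for the example.

```python
import math

# Hypothetical features, one per data type named in the talk, each scaled to 0..1:
# demographic, psychographic, transactional, personality.
FEATURES = ["age_scaled", "likes_beach_content", "past_hotel_bookings_scaled", "extroversion"]

# Invented toy travelers; the label says whether each one bought a beach hotel (1) or not (0).
ROWS = [
    [0.3, 1, 0.8, 0.9],
    [0.5, 1, 0.6, 0.7],
    [0.4, 0, 0.1, 0.2],
    [0.7, 0, 0.0, 0.4],
    [0.2, 1, 0.9, 0.5],
    [0.6, 0, 0.2, 0.3],
]
LABELS = [1, 1, 0, 0, 1, 0]


def sigmoid(z):
    """Squash any real number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))


def linear(weights, bias, x):
    """Weighted sum of the features plus the bias term."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_propensity_model(rows, labels, epochs=2000, learning_rate=0.1):
    """Fit a logistic regression by batch gradient descent; returns (weights, bias)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(linear(weights, bias, x)) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def propensity_score(model, x):
    """Estimated probability (0..1) that a traveler with features x buys the product."""
    weights, bias = model
    return sigmoid(linear(weights, bias, x))


model = train_propensity_model(ROWS, LABELS)
likely_buyer = propensity_score(model, [0.3, 1, 0.7, 0.8])
unlikely_buyer = propensity_score(model, [0.6, 0, 0.1, 0.3])
print(f"likely buyer: {likely_buyer:.2f}, unlikely buyer: {unlikely_buyer:.2f}")
```

A real system would use far more travelers, far more features and a stronger model, but the shape is the same: data about the person goes in, a probability of purchase comes out.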

This is the data in more detail. As it happens, the most important data here is yours, and it sits on the t-Retail platform. This is the hard transactional data, the PNRs, the inventory, the fares, the ancillaries, the loyalty information, and so on, that essentially validates everything else, including online behavior. This data must be big data, spanning an industry or a large part of an industry, not just a single organization. You cannot do this well on your own, in a single organization. You just don’t have enough data. But, together, when you think about it, we do have quite a unique advantage.

So how will this work in practice? Well let’s take an example, let’s take a hypothetical traveler. Let’s call him John. So T-Data will generate a continuous stream of propensity scores for all travelers on t-Retail. Now we know that John likes to take long weekends to Europe every year. He has done so for the last five years and we can see that in the transactional data. Our partners at Facebook know that he responds to inspirational advertising on his Facebook feed.

So what does propensity do? Well, propensity starts to fill some gaps. Propensity tells us that there’s an 80% probability that John will buy a 4 star hotel near a beach for his long weekend, at this point in time. It will tell us that there’s a high probability he will rent a mid-sized car on his trip, and it will tell us that there’s a very low probability, this time, he will buy travel insurance. There are a lot of complex mechanics behind the calculation of these probabilities, and they’re not based on one particular set of data, they’re based on that combination of data points. For example, and this is an extreme example, but it’s an interesting one, he may rent a mid-sized car, despite the fact he has no kids, because he has a big ego, or he has an interest in cars, or something like that.
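To make John's scores concrete, here is a small sketch of how a set of propensity scores could be turned into a shortlist of offers. Only the 80% hotel figure comes from the example above; the numbers for the car and the insurance are invented stand-ins for "high" and "very low", and the function name and the 0.5 threshold are assumptions made for illustration.

```python
def select_offers(scores, threshold=0.5):
    """Return the products whose propensity meets the threshold, highest score first."""
    chosen = [(product, score) for product, score in scores.items() if score >= threshold]
    chosen.sort(key=lambda item: item[1], reverse=True)
    return [product for product, _ in chosen]


john_scores = {
    "4-star beach hotel": 0.80,    # figure quoted in the talk
    "mid-sized car rental": 0.70,  # hypothetical stand-in for "high probability"
    "travel insurance": 0.05,      # hypothetical stand-in for "very low probability"
}

offers = select_offers(john_scores)
print(offers)
```

With these numbers John would be shown the hotel first and the car second, and the insurance offer would be held back this time.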

So how do we monetize this sort of data? How do we make it profitable for us? Well, here’s an example in the area of online advertising optimization with Facebook dynamic ads. So we ship all of our data to Facebook, our digital advertising partners, and deliver this highly personalized digital advertising to John, on the right platform, at the right time. With this approach, you can see that we are moving towards a segment of one in digital marketing. It’s starting to become really very sophisticated in how we tailor and target a marketing message. With this, we can optimize our media spend and also significantly improve conversion rates. Bryan Porter will say a little bit more about this in his talk tomorrow, in the context of the funnel and the sort of impact it can have.
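One simple way to picture optimizing media spend with propensity scores is to fund only the ads that are expected to pay for themselves. The sketch below is an assumption-laden illustration, not a description of how Facebook dynamic ads or T-Data work: it treats the propensity as the chance of a sale once the ad is shown, and the travelers other than John, the margins, the ad costs and the budget are all made up.

```python
def expected_profit(propensity, margin, ad_cost):
    """Expected margin from showing one ad, minus what the ad costs."""
    return propensity * margin - ad_cost


def plan_ads(candidates, budget):
    """Greedy plan: fund ads with positive expected profit, best first, while budget lasts."""
    ranked = sorted(
        (c for c in candidates if expected_profit(c[2], c[3], c[4]) > 0),
        key=lambda c: expected_profit(c[2], c[3], c[4]),
        reverse=True,
    )
    funded, spent = [], 0.0
    for traveler, product, propensity, margin, ad_cost in ranked:
        if spent + ad_cost <= budget:
            funded.append((traveler, product))
            spent += ad_cost
    return funded


# (traveler, product, propensity, margin, ad_cost) -- all values are hypothetical.
candidates = [
    ("John", "4-star beach hotel", 0.80, 40.0, 2.0),
    ("John", "travel insurance", 0.05, 15.0, 2.0),
    ("Mary", "mid-sized car rental", 0.60, 20.0, 2.0),
    ("Ann", "city tour", 0.30, 10.0, 2.0),
]

funded_ads = plan_ads(candidates, budget=4.0)
print(funded_ads)
```

Here the insurance ad for John is never funded because it is expected to lose money, and the budget runs out before the lower-value city tour ad is reached.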

In closing, then, as a data scientist, what do I think the future of travel looks like? I am somewhat biased as a data scientist, but I truly believe that the future of travel is the future of data. We cannot really separate the two anymore. We have this convergence of connections, the cloud, and cognitive technologies, which is genuinely transformative. I’ve no doubt that this amazing progress will extend into the travel retailing sector. The work we’ve started with T-Data is a great example, I think, of how this will happen.

So with T-Data, the future of travel will be super-personalized, it will be cognitively assisted, and optimized for the traveler, and optimized for you as travel retailers.