Monday, July 09, 2012

Three Ways To Mint Money With Big Data

Big Data is now deep into the hype phase of the innovation cycle.  All the classic signs are there: you can eat buffet dinners all 52 weeks a year at Big Data conferences, Big Data tag lines are now common in emails from industry analysts, and even investment bankers are tossing around the phrase.  Any experienced businessperson has seen this movie before with earlier technologies ranging from the World Wide Web to CRM to Enterprise Data Warehouses.  As with these other innovations, however, there is real substance at the root of the hype.  And – like CRM, the Web, and data warehouses – Big Data is very likely to be a big part of running almost any large corporation in the future.

Most early movers among the users of these prior technologies lost a lot of money, but a small number created enormous shareholder value.  By definition, all of the early movers were willing to take risks.  But three characteristics distinguished the winners from the losers.
  • First was an unwillingness to be snowed by conventional wisdom, technical jargon or the fairy tales of universal knowledge that abound when everything is still mostly talk and potential.
  • Second was a strong bias to act quickly at low cost, learn what works from experience, and then reinforce strengths.  The ultimate goal was always to exploit the opportunity to pour cash into successful innovations before the competition, but these companies recognized that trial-and-error learning usually uncovers opportunities faster than master plans.
  • Third was a ruthless focus on profits in excess of capital costs within the foreseeable future as the success criterion for proposed investments of time or money.

This article will attempt to consider the Big Data opportunity from the point of view of the P&L-owning executive.  It will keep experience of these prior technologies front of mind and will focus on one question: How can I use Big Data analytics to increase shareholder value?

Consumer-focused businesses have the latent power to exploit what is now called Big Data.  Their transaction data has been stored and maintained to create a multi-year history that grows year by year.  Integrated with this are continuously updated data streams from thousands of weather monitoring devices across continents, social media data from various leading services, detailed individual and micro-geographic demographics, comprehensive business census data, and numerous other datasets.  These transaction and other datasets are growing rapidly in terms of percentage coverage of all consumer transactions, variety of data sources, data granularity, and geographic coverage.

The strategic intent has been to triangulate between business strategy, algorithmic math, and database structures to develop software tools that can change decisions to measurably increase shareholder value.  The size and complexity of this data, in combination with the focus on the creation of shareholder value through competitively superior decision-making, has required a unique process that has led to some conclusions contrary to emerging received wisdom, starting with an unconventional definition of Big Data.

What follows outlines three major analytical approaches for unlocking the latent opportunities of Big Data.  Each can be exploited now, and at least some major corporations are doing so currently.  No source of competitive advantage lasts forever, but some last longer than others.  This review proceeds from those opportunities that we believe to be the most transitory to those that we believe to be the most sustainable basis for long-term success.

1. Do It the Old-Fashioned Way: Exploit Faster Clock Speeds First

Mary Smith walks into a grocery store, shops for twenty minutes, checks out and leaves.  What stored data describes this visit that can be used to improve future decisions?

In most real-life large retailers, the data would be Mary Smith's customer ID number (from the loyalty or credit card program), the store ID number, the time of the transaction, the list of items she purchased, and the price paid (including discount codes) for each.  The retailer might also collect and maintain address, phone number, email and other contact information for Mary Smith, and might also purchase information that describes her credit score and other financial data from third-party service bureaus.  All of the transaction records for all customers for the last several years are collated to create the transaction data warehouse.  This core database is usually on the order of 1 to 100 terabytes for a Fortune 500 company.

Typically, various data "cubes" that can be queried by normal Business Intelligence (BI) tools are hived off by taking abstractions (e.g., transactions summarized to units, sales and margin by product by store by day) or subsets (e.g., all transactions in one product department for the last 12 months) of the complete database.  Major BI tools answer descriptive questions such as "What is the most common product to be bought with diapers?" or "On what day of the week do we sell the most beer in Pittsburgh?"  Cubes are created because of processing speed constraints.
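As a rough illustration of the "abstraction" kind of cube described above, here is a minimal Python sketch that summarizes line-item transactions to units, sales and margin by product by store by day.  The record layout, the sample values and the name build_cube are assumptions made for this example, not any retailer's actual schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical line items:
# (customer_id, store_id, timestamp, product, units, price_paid, unit_cost)
transactions = [
    ("C001", "S12", datetime(2012, 7, 2, 17, 5), "diapers", 1, 11.99, 9.00),
    ("C001", "S12", datetime(2012, 7, 2, 17, 5), "beer", 2, 8.50, 6.00),
    ("C002", "S12", datetime(2012, 7, 2, 18, 40), "beer", 1, 8.50, 6.00),
    ("C003", "S07", datetime(2012, 7, 3, 9, 15), "diapers", 2, 11.99, 9.00),
]


def build_cube(rows):
    """Summarize line items to units, sales and margin by (product, store, day)."""
    cube = defaultdict(lambda: {"units": 0, "sales": 0.0, "margin": 0.0})
    for _customer, store, ts, product, units, price, cost in rows:
        cell = cube[(product, store, ts.date())]
        cell["units"] += units
        cell["sales"] += units * price
        cell["margin"] += units * (price - cost)
    return dict(cube)


cube = build_cube(transactions)
for key, cell in sorted(cube.items()):
    print(key, cell)
```

Note what the summary throws away: the customer ID and the basket are gone, so a question like "what is bought with diapers?" needs a different cube or the full database.  That trade-off is the price of the processing speed the cube buys.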

Ten to fifteen years ago a database measured in terabytes was very Big Data.  It required capabilities like specialized database software, query tools, and massive IT support to maintain a whole system around this.  Companies could achieve material competitive advantage by being cleverer about how to structure the data model, design queries, and so forth.  Walmart is probably the most famous example of this.

Ten to fifteen years from now, such a transaction database will be 'small'.  In absolute terms, it will very likely be less than a factor of 10 larger than its current size, while processing and storage productivity will very likely increase by a factor of hundreds to thousands over the same period.  Consumer-sized devices and databases will be able to handle it.  The only cube required will be the database itself.  Faster algorithms, more efficient data models and the like won't matter much, because available processing power will render them insignificant (though better businesspeople will always ask smarter questions and use the answers more effectively than others).
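The arithmetic behind that claim can be sketched in a few lines of Python.  The growth factors are the rough bounds given in the paragraph above; the function name relative_burden is made up for this illustration.

```python
def relative_burden(data_growth: float, productivity_growth: float) -> float:
    """Effort to handle the future database relative to today's (today = 1.0)."""
    return data_growth / productivity_growth


# Data grows by at most a factor of 10; processing and storage productivity
# grows by a factor of hundreds to thousands over the same period.
pessimistic = relative_burden(10, 100)
optimistic = relative_burden(10, 1000)

print(f"pessimistic case: {pessimistic:.0%} of today's burden")
print(f"optimistic case: {optimistic:.0%} of today's burden")
```

Even at the pessimistic end, the same database would take roughly a tenth of today's effort, and the real figure would be lower still because the factor of 10 is an upper bound on data growth.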

However, this transition won't happen all at once, and a lot of money can be made over the next decade by those who manage it better than others.  The raw data storage itself has already become quite cheap.  In simplified terms, as each click forward in Moore's Law happens, the next most processing-intensive analytical method that used to require a lot of IT investment will then become executable with much cheaper IT tools, and therefore analytical processes that were previously not feasible will then become feasible with high-cost infrastructure.  Eventually, there will be no logically-definable queries that cannot be run on such a database with cheap IT tools.

In practice, we are already at the point where everything but the most complex queries for the very largest of these databases can be executed on general purpose IT tools.  This means that this opportunity is mostly about applying low-cost methods first.  In such cases, legacy systems and expertise are little help, and are often a hindrance.

2. Integrate and Use New Data Types

Numerous digital data sources are becoming available because of a combination of increasing data storage, data processing and data transmission productivity (effective available bandwidth appears to double approximately every 21 months).  Even though these data sources are 'small' when seen in isolation, they are properly considered part of Big Data because the infrastructure required to use them in practice is only now emerging and depends on increasing processing and transmission productivity.

Consider weather data as a practical example.  Currently several thousand weather stations across the U.S., and proportionate numbers in other advanced countries, collect weather data and make it available on Web sites or through electronic transmission.  This can now be automatically scraped, transferred into a corporate data warehouse, and integrated with other data to provide useful information.  When the illustrative Mary Smith made her grocery store visit, the weather at that time could then be appended to her shopping record to provide useful context data on her shopping trip.  Similar data is collected and available on everything from demographics to shopping habits to traffic flow.
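To make the weather example concrete, here is a minimal Python sketch of appending weather context to a shopping record.  It assumes the scraped observations have already been reduced to one reading per store per hour; the field names, the sample values and the function append_weather are hypothetical.

```python
from datetime import datetime

# Hypothetical hourly observations, keyed by (store_id, hour of observation).
# In practice each store would first be mapped to a nearby weather station.
weather_by_store_hour = {
    ("S12", datetime(2012, 7, 2, 17)): {"temp_f": 91, "condition": "clear"},
    ("S12", datetime(2012, 7, 2, 18)): {"temp_f": 88, "condition": "thunderstorm"},
}


def append_weather(transaction, weather_lookup):
    """Return a copy of the transaction with the matching observation attached."""
    # Truncate the transaction time to the hour to match the lookup key.
    hour = transaction["timestamp"].replace(minute=0, second=0, microsecond=0)
    enriched = dict(transaction)
    # The value is None when no observation exists for that store and hour.
    enriched["weather"] = weather_lookup.get((transaction["store_id"], hour))
    return enriched


visit = {
    "customer_id": "C001",
    "store_id": "S12",
    "timestamp": datetime(2012, 7, 2, 17, 5),
    "items": ["diapers", "beer"],
}
enriched_visit = append_weather(visit, weather_by_store_hour)
print(enriched_visit["weather"])
```

The weather data is tiny next to the transaction history.  What takes work is the plumbing that captures it continuously and keys it to the store and time of each visit.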

A truism in predictive modeling is that "better data beats better algorithms."  However, 'better' usually doesn't mean 'more data points of the same type,' but rather 'integrating a new type of relevant data into an analytical cube.'  This is one reason that a cloud services architecture is so essential to a Big Data analytics system: it makes the ongoing capture of such data economically feasible.  Systems which rely only on internally generated company data will be analytically outperformed by systems which can use this ever-expanding plethora of data.

The volume of data in one of these alternative data streams is a function of granularity (number of measurements) and intensity (bits per measurement).  Return to Mary Smith shopping at a grocery store.  Instead of just the transaction data plus weather as context, we might also capture and integrate all relevant social media postings, which would become a much larger database.  Next, we might have RFID chips embedded in shopping carts, shelves and her loyalty card that would identify the path she took through the store, what items she inspected but did not buy and so forth.  In the extreme, we might have full-motion video that shows her every motion from entry to exit.
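The granularity-times-intensity relationship can be sketched in Python as follows.  Every figure below (bits per reading, ping rate, frame size) is a made-up round number chosen only to show the orders of magnitude involved, not a measurement of any real system.

```python
def stream_volume_bytes(measurements: int, bits_per_measurement: int) -> float:
    """Volume = granularity (measurement count) x intensity (bits each), in bytes."""
    return measurements * bits_per_measurement / 8


# Illustrative, assumed figures for one 20-minute store visit.
visit_seconds = 20 * 60
streams = {
    "weather (1 reading, 256 bits)": stream_volume_bytes(1, 256),
    "RFID path (1 ping per second, 128 bits)": stream_volume_bytes(visit_seconds, 128),
    "video (30 frames per second, 1,000,000 bits)": stream_volume_bytes(
        visit_seconds * 30, 1_000_000
    ),
}

for name, size in streams.items():
    print(f"{name}: {size:,.0f} bytes")
```

Under these assumptions the video stream is several orders of magnitude larger than the RFID path, which in turn dwarfs the single weather reading for the same visit.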

So, how can we apply data analytics to this to create shareholder value?  A lot of attention is being paid to so-called noSQL methods that try to avoid the computation overhead of current relational databases.  Progress will surely be made here.  Just as with enterprise transaction databases over the past 10 – 15 years, there will be a constant competition between extracting abstractions from these Big databases that are more analytically tractable, building noSQL and similar technologies that allow direct analysis of the large databases themselves, and Moore's Law, which will continue to convert a given data size from Big to 'small.'  And just as with enterprise data warehouses, we will see the development of something analogous to cubes (or more broadly, pre-abstractions of data) as a large part of how these high-volume data streams are used.

What is certain about exploiting the large shareholder value opportunities available from integrating new data sources – low-volume as well as high-volume – is that the ability to flexibly and cheaply extract data from cloud services and use them is essential.  The trade-offs involved in using abstracted data in relational databases versus non-abstracted data in non-relational databases, relying on pre-abstraction versus self-abstraction of high-volume data streams, and so on are not obvious, will change as technology evolves, and will depend on the company and the problem.

What is clear is that today – right now – large consumer companies can begin taking advantage of many of these data streams by capturing them at an abstracted level, incorporating them in data schemas, and using them to improve decisions.  Some are already doing this.  Just as with the experience of successful innovators in earlier technology waves, smart marketers shouldn't wait to do this.  When and how to move beyond these to the inevitable management of ever-larger datasets with ever-improving technology is best determined by trial and error and reinforcement of demonstrated successes.  This highlights the last and most sustainable source of competitive advantage offered by Big Data.

3. Use Test & Learn to Improve Faster

As business has moved into a Big Data environment, the most advantaged paradigm for the analysis and improvement of business programs has migrated from model-building based on historical data to Test & Learn – in plain English, trying out new ideas in small subsets of the business, and making predictions based on these tests.  Over the past year, many industry analysts have started to recognize that Test & Learn is central to making Big Data create value.
  • McKinsey's 2011 overall report on Big Data calls out five ways for Big Data to create value.  The second is: "Enabling experimentation to discover needs, expose variability, and improve performance."
  • In 2012, a Forrester Research blog discussion of Big Data claimed that "real-world experiments will become the new development paradigm … it's clear that real-world experiments are infusing data management and advanced analytics development best practices more broadly."
  • The number one lesson called out at the 2012 MIT conference on how to exploit Big Data: "Faster insights with cheap experiments."

Big Data – and more broadly, radical reductions in the unit costs of storing, processing and transmitting data – drive this transition.  First, it has become practical to use IT to automate and semi-automate many aspects of the testing process, which lowers the cost of testing enormously.  Second, widespread sensor and other data streams allow more granular measurement of effects, and superior targeting of actions based on these tests.  Third, the explosion in data means that models built to exploit these much larger datasets are increasingly difficult to evaluate and calibrate due to their complexity, and experiments become essential to do this.
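For readers who want to see the mechanics, here is a minimal Python sketch of one simple way to run and read such a test: randomly assign a small subset of stores to the new program, then compare their change in sales with the change in the remaining stores (a difference-in-differences estimate).  This is an illustrative technique chosen for the example, not a description of any particular vendor's platform; the store IDs, sales figures and function names are all hypothetical.

```python
import random
from statistics import mean


def assign_test_stores(store_ids, n_test, seed=0):
    """Randomly pick n_test stores for the test; the rest serve as controls."""
    rng = random.Random(seed)
    test = set(rng.sample(sorted(store_ids), n_test))
    control = set(store_ids) - test
    return test, control


def estimate_lift(test_pre, test_post, control_pre, control_post):
    """Change in mean sales in test stores minus the change in control stores."""
    test_change = mean(test_post) - mean(test_pre)
    control_change = mean(control_post) - mean(control_pre)
    return test_change - control_change


stores = [f"S{i:02d}" for i in range(1, 21)]
test_stores, control_stores = assign_test_stores(stores, n_test=5)

# Hypothetical weekly sales per store, before and after the program starts.
lift = estimate_lift(
    test_pre=[100.0, 120.0, 110.0],
    test_post=[115.0, 135.0, 125.0],
    control_pre=[105.0, 95.0, 100.0],
    control_post=[110.0, 100.0, 105.0],
)
print(f"estimated lift per store per week: {lift}")
```

Subtracting the control stores' change removes whatever moved sales everywhere at once, such as weather or seasonality, which is the part that models built only on historical data struggle to separate from the program's own effect.  A real test would also need enough stores to distinguish the lift from noise.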

In my experience over the past decade, a Test & Learn capability for a major marketer requires a specialized analytical platform supported by a Big Data infrastructure, but also has several process and organizational components.

First, the sine qua non is executive commitment.  The person or small group with ultimate operational responsibility for shareholder value creation, typically the CEO or President, must genuinely desire reliable analytical knowledge of the business.  This implies several crucial recognitions: that intuition and experience are not sufficient to make all decisions in a way that maximizes shareholder value, that non-experimental methods are not up to the task of determining causality with sufficient reliability to guide many actions, and that experiments can be applied in practice to enough business issues to justify the costs of the capability.

Second, a distinct organizational entity, normally quite small, must be created to design experiments and then provide their canonical interpretation.  It must have analytical depth, a professional culture built around experimental learning, and an appropriate scope of interest that cuts across the various departments that will have programs subject to testing.  It should have no incentives other than scorekeeping; therefore, it should never develop program ideas, nor ever be a decision-making body.  The balance that must be struck, however, is that it should remain connected closely enough to the operational business that it does not become academic.

Third, a repeatable process must be put in place to institutionalize experimentation as a part of how the business makes decisions. This lowers the cost per test, ensures that learning is retained, and maximizes the chances of the experimental regime outlasting individual sponsors and team leaders. The orientation should not be to big, one-time "moon shot" tests, but instead toward many fast, cheap tests in rapid succession whenever this is feasible.


Technology Integration Lab
Sceda Systems
www.scedasystems.com
