Think:Act Magazine "Performance: Faster, Higher, Stronger"
Try, try again
Embracing a new culture of exploration, trial and error with a view to business benefits
by Geoff Poulton
Photos by Pelle Cass
Taking a cue from science and exploring experimentation could open up exciting new prospects and even profits. But how do you organize a company to embrace a new culture of trial and error?
From Archimedes jumping up to shout "eureka" to Fleming's discovery of penicillin and Mendel uncovering genetics – the history of science is full of experiments that have demonstrated our most fundamental laws and vital innovations. Without these experiments, some may have gone undiscovered or simply been dismissed as nothing more than speculative theory.
And yet, too often, hunches, theories and opinions are the driving forces behind important business decisions about products, employees, resources or customers. This overreliance on intuition over evidence gained from rigorous experiments is holding us back; it limits performance and restricts innovation. As acclaimed management thinker Gary Hamel says: "The way you create the future is not to predict it but to find it. And you find it through experimentation." It's a notion echoed by Stefan Thomke, professor of business administration at Harvard Business School and author of Experimentation Works: The Surprising Power of Business Experiments. Instead of relying on experience or beliefs, he says, it's time for business leaders to start thinking and acting like scientists.
For centuries, we've built and organized scientific and technological knowledge through testable explanations and predictions, which has powered innovation. This approach, known as the scientific method, dates back to the 16th and 17th centuries and the likes of Francis Bacon, Galileo and Isaac Newton. The method has six steps: It starts with a question, leading to research that helps form a hypothesis, which is tested with an experiment, followed by an analysis of the results and then a conclusion.
Adopting these principles and implementing large-scale experimentation can "revolutionize" the way all companies operate and how managers make decisions, says Thomke. Most are yet to fully realize this, but a growing number are embracing its efficacy. Online travel platform Booking.com, for instance, runs more than 1,000 experiments simultaneously and tens of thousands each year. The same goes for Facebook, Microsoft and Netflix as well as non-tech firms like P&G, Gap and Nike. Experiments are a crucial way of exposing judgmental errors, something to which we are all susceptible, whether due to cognitive biases or a lack of information. "Acting like a scientist is difficult for leaders because it can challenge their humility," Thomke says. Instead of personal insight, it focuses on an objective, evidence-based process to frame and address decisions.
The harsh reality is that most experiments will fail – something plenty of organizations struggle to accept, says Ron Kohavi, who has run large-scale experimentation at Microsoft, Amazon and Airbnb, and is the author of Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing. "When I came to Microsoft and pitched the idea of A/B tests, I told them that at Amazon, more than 50% of ideas tested failed to move the metrics they were designed to improve. The response was: 'We have better program managers,'" Kohavi says. "Several years later, we knew that two-thirds of ideas at Microsoft evaluated in A/B tests failed to move the metrics they were designed to improve. At Bing, a mature domain, that rate was about 85%."
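To make "failed to move the metrics" concrete, here is a minimal sketch of one common way to read an A/B test: a two-proportion z-test on conversion counts. It is an illustration only, not the method used at Microsoft, Amazon or any other company named here. The function names, the sample numbers and the 0.05 threshold are all assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors in the control group (A).
    conv_b / n_b: conversions and visitors in the variant group (B).
    Assumes the pooled rate is strictly between 0 and 1.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def moved_the_metric(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """True if the observed difference is statistically significant.

    alpha=0.05 is a conventional, hypothetical threshold for this sketch.
    """
    _, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    return p_value < alpha


# Hypothetical data: 10,000 visitors per group.
small_change = moved_the_metric(1000, 10000, 1010, 10000)  # 10.0% vs 10.1%
large_change = moved_the_metric(1000, 10000, 1150, 10000)  # 10.0% vs 11.5%
```

With these made-up numbers, the small change would be counted among the ideas that failed to move the metric, while the larger one would pass the threshold. A real experimentation platform would likely add further safeguards, which is in keeping with Kohavi's point that most ideas do not survive testing.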
If that's the case, why bother experimenting at all? Kohavi tells the story of a Microsoft employee working on its Bing search engine who had an idea about changing the way it displayed ad headlines. Considered a low priority, the idea lay dormant for months before finally being tested. After a few hours, results showed the change had boosted revenue by an impressive 12% without impacting user experience – a potential annual gain of more than $100 million in the United States alone.
Even if they 'fail,' experiments often still prove valuable. Experimentation is an iterative process – failure will often lead to insight, which will enable the experimenter to modify the hypothesis, adjust the experiment and try again. To do this effectively, an organization needs a certain type of culture as well as organizational and technical infrastructure. "Companies need to make experimentation an integral part of everyday life," says Thomke. That means creating an environment to nurture curiosity, prioritize data over opinion and democratize accessibility to testing.
This article features the work of artist-photographer Pelle Cass from his Crowded Fields collection. Cass says of his work, "My complicated, chaotic compositions in Crowded Fields reflect my own feelings of turmoil and confusion. The world is a confusing place, and most of the time, I don't know which way is up. So perhaps this feeling informs my compositional goals."
One company that does this better than almost any other is Booking.com. Behind the many thousands of experiments and billions of simultaneous landing page permutations is a corporate culture that embraces transparency and accepts failure. Around three-quarters of Booking's 1,800 technology and product staff regularly use the company's internal experimentation platform, where ideas are discussed, results analyzed and hypotheses scrutinized.
As Lukas Vermeer, the former director of experimentation at Booking, said in 2019: "We don't just experiment because we like running experiments, but because experimentation is a great way to make sure that when we think we're fixing something, we're actually fixing it. Change is constant, we have to keep updating our products to make them better, but we also have to make sure those changes really work." Which they typically do. After a Covid-19-induced downturn, Booking returned to posting record revenues in 2022.
According to Thomke, there are three main models for any organization seeking to embrace experimentation, each with its own pros and cons. A centralized setup can focus on long-term projects, but may encounter conflicting priorities among business units and feel disconnected from a company's daily business. In a decentralized model, experimentation experts can immerse themselves in a particular business area, although this may impact their own development and limit peer feedback. A center of excellence model combines the two, but any organization following this approach should be wary of confusion about where responsibilities lie between the central unit and product teams.
The center of excellence model is a fundamental part of Netflix's approach to experimentation. The streaming giant uses A/B experiments to improve performance in a wide range of areas, including payment processes, advertising, streaming infrastructure and user recommendations and messaging. Even the artwork for each film and TV show is rigorously tested, sometimes resulting in 20-30% more viewing for a particular title. It's incremental gains like these, made by running large numbers of experiments, that are at the heart of improved performance for most organizations. "People tend to glorify big, disruptive ideas, but in reality, most progress comes from implementing hundreds or thousands of minor improvements," says Thomke.
While it may be easier for online firms to conduct large volumes of experiments, both Kohavi and Thomke say almost every organization should be looking to implement experimentation into its daily business. "Whether you're a bank, a grocery store or a health care provider, it's becoming easier and more affordable to run tests on many different areas," Kohavi says. Supermarkets like Walmart in the US or Sainsbury's in the UK regularly experiment with things like store layout, marketing, offers, or even the use of new technology such as till-free stores. One experiment at Gap stores found that bringing more stability to employee schedules improved both sales and productivity.
Does all of this mean we should throw human opinion, experience and judgment out the window when it comes to decision-making? Not exactly. As Ron Kohavi points out, technology may make it easier to experiment, but software can easily spit out wrong results, "something I've seen over and over and over again. We always need to question results and double-check things." And, ultimately, it's people who make decisions – even at an experiment-led company like Netflix. For every significant decision, a single 'informed captain' makes a judgment call based on input from colleagues and, of course, any relevant data – test results are expected to play a part in the process, wherever possible.
Stefan Thomke calls experiments the "engine of innovation," which is becoming increasingly powerful thanks to technology. Taking advantage of this requires a different kind of organization, one that takes a more scientific approach to decision-making. As technology-driven companies embrace the power of experimentation more openly, the gap between good experimenters and those who aren't is growing, he says. "And we need to figure out how to get organizations who don't practice it to do it more."