In my job as data science manager for a medium-sized water utility in Australia, I have developed a strategy to increase the amount of value we extract from data. Many businesses that seek the promised benefits of Big Data don’t achieve them because they don’t start with the basics. The most important piece of data science strategy advice is to spend a lot of time getting to know your data and improving its quality. Good data science needs to comply with these four basic principles:
- Utility: The analysis needs to be able to improve reality, otherwise we end up with ‘analysis paralysis’. Although we speak of data science, it is really data engineering because we are not seeking the truth; we seek to improve reality.
- Soundness: The analysis needs to be scientifically valid so that managers can make reliable decisions. Anything else leads to data pseudoscience.
- Aesthetics: Visualisations need to be pleasing to the eye, not for beautification but to ensure that users draw the correct conclusions.
- Reproducibility: Analysts need to be able to repeat the work of other people to ensure quality control. This is where the science comes into data analytics.
I have recently published a paper about data science strategy for water utilities to share some of my thoughts on this topic.
Data Science Strategy for Water Utilities
The words “Big Data” have become synonymous with promises of boundless benefits. Big Data algorithms are credited with almost mystical capabilities to improve the experience of customers, optimise treatment processes and profoundly change urban water management overall.
There are some famous examples of successful companies such as Facebook, Amazon and Google, where Big Data forms part of the fabric of the enterprise. But for most organisations, including water utilities, success in this area has been limited.
The Big Data moniker is burdened with undelivered promises. As Big Data surfs the technology hype curve, the discipline of Data Science emerges as a practical way to extract more value from data. Data Science is a multidisciplinary field that combines mathematics, computing and subject matter expertise to develop actionable insights. Value from data comes from sound, useful and aesthetic information. Data products first need to be reliable and useful to add value, and aesthetics ensures that the information is communicated in a comprehensible way.
Data analytics comes naturally to the engineering and science-focused organisations that water utilities are. Given our reliance on data and technology, the benefits promised by Big Data should be within reach for water utilities.
The Data Science Continuum illustrates the value chain for business analytics. Before any value can be created, we first need to assure data quality because the law of Rubbish-in-Rubbish-Out is immutable. The Data Science Continuum is strictly hierarchical. Organisations cannot evolve to a level without first mastering the previous ones.
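As a small illustration of the Rubbish-in-Rubbish-Out point, a first data quality check can be as simple as counting missing and implausible values before any analysis starts. The sketch below is only an example under stated assumptions: the function name `quality_report`, the turbidity readings and the plausible range of 0 to 10 NTU are all hypothetical and are not taken from the paper.

```python
def quality_report(readings, lower, upper):
    """Count missing and out-of-range values in a list of readings.

    A reading of None is treated as missing; any other reading outside
    the closed interval [lower, upper] is treated as implausible.
    """
    missing = sum(1 for r in readings if r is None)
    out_of_range = sum(
        1 for r in readings if r is not None and not (lower <= r <= upper)
    )
    return {
        "total": len(readings),
        "missing": missing,
        "out_of_range": out_of_range,
        "valid": len(readings) - missing - out_of_range,
    }


# Hypothetical turbidity readings in NTU; None marks a missing value.
turbidity = [0.3, 0.5, None, -1.0, 0.4, 120.0]

# The plausible range is an assumption made for this example only.
report = quality_report(turbidity, lower=0.0, upper=10.0)
print(report)
```

Only the readings counted as valid would move on to the next level of analysis; the rest need to be investigated at the source rather than silently dropped.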
The full version of the paper is behind the paywall of the Australian Water Association website.