
Introduction to R for Utilities

Peter Prevos
Data science is the activity of creating value from data. While writing code to develop data products is an essential skill, data science needs to be strategic to deliver business value. This chapter develops a framework for a strategic approach to data science by creating useful, sound and aesthetic data products. Usefulness relates to a data product's ability to achieve business outcomes. A sound data product is developed using valid and reliable models, is reproducible and, when it uses data about people, is ethical. Lastly, data products need to be aesthetic, not to beautify them but to ensure that the outcomes of analyses are easily understood so that the benefits can be realised.
The book Data Science for Water Utilities introduces R to water professionals through realistic case studies. The first chapter introduces the principles of data science and how they relate to water management.
Data Science for Water Utilities
Data Science for Water Utilities, published by CRC Press, is an applied, practical guide that shows water professionals how to solve urban water management problems with data science, using the R language for statistical computing.
What is Data Science?
Data science involves analysing data systematically to create value for businesses. The field combines mathematics, computer science, and domain knowledge. Some professionals in the industry believe that people with expertise in all three areas are rare, and they are often referred to as "data science unicorns."
As part of my Data Science Unicorn Breeding Program, I have written a book aimed at water professionals who already possess mathematical skills and domain knowledge. I want to create the elusive data science unicorns by teaching them computer science.

See my article about strategic data science, which develops a systematic approach to creating value from data.
Principles of Strategic Data Science
Principles of Strategic Data Science helps you join the dots between mathematics, programming, and business analysis. With a unique approach that bridges the gap between mathematics and computer science, this book takes you through the entire data science pipeline.
Learning how to code
This book uses the R language for statistical computing, which is specifically designed for statistical analysis. Data science often uses other languages as well, such as Python, SQL, and Julia, so data scientists need to be multilingual.
These seven steps will help you in your journey of becoming fluent in any programming language:
- Understand the basics
- Code by hand
- Create simple programs
- Practice
- Ask for help
- Build projects
- Help others
This book introduces the basics of coding with R. Coding by hand is essential to learning the language thoroughly. Although generative AI, such as GPT or GitHub Copilot, can write code, it will not teach you how to write code. Generative AI can help you find solutions, but make sure you reverse-engineer the output so that you understand the code.
R for Water Utilities
The following twelve chapters in this book introduce the R language by developing case studies:
- Water quality data: Chapters 3–6 and 12
- Customer survey: Chapters 7–10
- Digital metering: Chapters 11–12
- Concrete curing: Chapter 13
Additional Resources
The water industry produces a lot of resources on how to implement smart water networks.
- The Smart Water Networks Forum (SWAN) is the leading global voice for the smart water sector. It has detailed resources on how to implement a smart water network.
- Free Digital Water Book: A Strategic Digital Transformation for the Water Industry.
Other Chapters
Next Chapter: Basics of the R Language
Feel free to contact me if you have any comments, suggestions or questions about this book.