This chapter of the Data Science for Water Utilities book introduces the basic principles of the R language with a case study about flow in an open channel.

Basics of the R Language

Peter Prevos


The second chapter of Data Science for Water Utilities introduces the basic principles of the R language and RStudio to analyse and visualise data. Writing code is not only about using the correct syntax and the appropriate functions; this chapter also covers how to style your code so that it is easy to understand. The chapter concludes with a case study that measures water flow in an open channel using the Kindsvater-Carter formula. The learning objectives for this chapter are:

  • Install R and RStudio and identify the different parts of the RStudio screen.
  • Understand the principles of writing code to analyse data.
  • Apply R code to solve a simple water problem.

    Data Science for Water Utilities published by CRC Press is an applied, practical guide that shows water professionals how to use data science to solve urban water management problems using the R language for statistical computing.

The data and code used in this chapter are available on GitHub:

Installing R and RStudio

The best way to use R is through an Integrated Development Environment (IDE). This type of software helps you to write and manage code. An IDE typically comprises a source code editor, automation tools, and functionality to simplify crafting and running code. Of course, you can use R without the IDE, but it will be less user-friendly.

Several IDEs are available to help you write R code. RStudio by Posit is the most popular option.

This software is also an open-source project, with free and paid versions for companies that want to use advanced features and support services. RStudio can also work with other languages, such as SQL and Python. Follow these steps to install the required software:

  1. Go to the R Project website: cran.r-project.org
  2. Download the base version for your operating system and install the software
  3. Go to the download page on the RStudio website: posit.co
  4. Download the installer for the free desktop version and install the software
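
Once both installers have finished, a quick check in the RStudio console confirms that R responds. This is a minimal sketch that uses only objects that ship with base R:

```r
# Show which version of R this session is running
print(R.version.string)

# A simple calculation confirms that the console evaluates code
1 + 1
```

If the console prints a version string and the result of the sum, the installation works.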

Alternatively, you can sign up for a free and fully featured account to access RStudio’s cloud version (posit.cloud). This service gives you full access to R and RStudio in your browser without installing software. The free version provides enough hours of computing time to work through this book. You’ll have to pay for a subscription or install the desktop version if you need more time.

The RStudio IDE interface.

Principles of Writing Code

Writing code has a steep learning curve and can be both frustrating and rewarding at the same time. Here are some principles that will help you write good code that is easy to understand:

  1. Use descriptive variable names. Don’t, for example, use d, but pipe_diameter or another descriptive name. R reads a hyphen as a minus sign, so separate words with an underscore.
  2. Add plenty of whitespace in your code.
  3. Use comments starting with a hash symbol: # (comments are not evaluated)
  4. Follow a coding style guide, for example, the Tidyverse Style Guide.
The R Learning Curve can be steep and full of surprises.
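
The principles listed above can be sketched in a few lines of R. The variable names and the diameter of 0.3 m are illustrative assumptions, not values from the book:

```r
# Cross-sectional area of a circular pipe
# Names use underscores because R treats a hyphen as subtraction
pipe_diameter <- 0.3  # internal diameter in metres (illustrative value)

# Whitespace around operators makes the formula easier to read
pipe_area <- pi * (pipe_diameter / 2)^2

pipe_area  # area in square metres
```

Compare this with `a<-pi*(d/2)^2`, which gives the same result but tells the reader nothing about what is being calculated.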

R is Meme Proof

Arithmetic in the R language follows the BODMAS rules (Brackets, Orders, Division, Multiplication, Addition, and Subtraction). R is thus meme-proof and can solve the silly mathematical ‘challenges’ spread on social media.

R is meme-proof and can solve arithmetic challenges.
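
A few expressions show the order in which R evaluates arithmetic. The variable names are only there for illustration:

```r
# Brackets are evaluated first
brackets_first <- (2 + 3) * 4        # 20

# Multiplication comes before addition
multiplication_first <- 2 + 3 * 4    # 14

# Orders (powers) come before multiplication
powers_first <- 2 * 3^2              # 18

# Division and multiplication have equal rank and run left to right,
# which settles a typical social media 'challenge'
meme_answer <- 6 / 2 * (1 + 2)       # 9
```

When in doubt, add brackets: they cost nothing and make the intended order explicit.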

Channel Flow Case Study

Determining the flow in an open channel is usually achieved by measuring the water depth through a section with a known shape. Forcing the water through a given shape over a sharp weir creates a boundary condition that allows us to calculate the flow. A mathematical relationship determines the volume of water that passes through the channel. This case study uses a rectangular weir, for which a simplified version of the Kindsvater-Carter formula gives the flow:

$$Q = \frac{2}{3} C_d \sqrt{2g} \, b h^{\frac{3}{2}}$$

  • $Q$: Flow rate ($m^3/s$)
  • $C_d$: Discharge coefficient (dimensionless)
  • $g$: Gravitational acceleration (9.81 $m/s^2$)
  • $b$: Measured width of the notch ($m$)
  • $h$: Upstream head above crest level ($m$)
Rectangular weir.
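
The formula translates almost directly into R. The sketch below wraps it in a function. The function name `weir_flow`, the discharge coefficient of 0.6, the notch width of 0.6 m and the head of 0.05 m are illustrative assumptions, not values given in this summary:

```r
# Simplified Kindsvater-Carter formula for a rectangular weir
# h:  upstream head above crest level (m)
# b:  width of the notch (m)
# cd: discharge coefficient (dimensionless)
# g:  gravitational acceleration (m/s^2)
weir_flow <- function(h, b, cd, g = 9.81) {
  (2 / 3) * cd * sqrt(2 * g) * b * h^(3 / 2)
}

# Illustrative call with assumed values for cd and b
flow <- weir_flow(h = 0.05, b = 0.6, cd = 0.6)
flow  # flow rate in m^3/s
```

All lengths must be in metres, so a head measured in millimetres needs dividing by 1000 before it goes into the function.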

Questions:

  • What is the flow in $m^3/s$ when $h = 100$ mm?
  • What is the average flow when the level above the weir is 150, 136 and 75 mm?
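
One possible way to approach both questions is sketched below. It reads the head in the first question as 100 mm, in line with the second question, and converts all heads to metres. The discharge coefficient and notch width are not stated here, so the values used are assumptions; replace them with those of the actual weir:

```r
# Assumed values for illustration only
cd <- 0.6    # discharge coefficient (assumption)
b  <- 0.6    # notch width in metres (assumption)
g  <- 9.81   # gravitational acceleration in m/s^2

# Question 1: a single head of 100 mm, converted to metres
h_single <- 100 / 1000
q_single <- (2 / 3) * cd * sqrt(2 * g) * b * h_single^(3 / 2)
q_single

# Question 2: R applies the formula to every element of a vector
h_levels <- c(150, 136, 75) / 1000
q_levels <- (2 / 3) * cd * sqrt(2 * g) * b * h_levels^(3 / 2)
mean(q_levels)
```

Because the flow is not a linear function of the head, the mean of the three flows differs from the flow at the mean head, so the average is taken after applying the formula.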

Cheat Sheet

To help you remember the various functions discussed in the first five chapters of the book, a cheat sheet is available.

Data Science for Water Utilities Cheat Sheet.

Basics of the R Language Screencast

Chapter two of Data Science for Water Utilities explains the principles of writing code to analyse data in more detail. This screencast runs through the code in this chapter.

Basics of the R Language.

Additional Resources

Project Euler provides an excellent way to practise solving basic mathematical problems in the R language.

Other Chapters

Previous Chapter: Introduction to Data Science for Water Utilities

Next Chapter: Loading and Exploring Water Quality Data

Feel free to contact me if you have any comments, suggestions or questions about this book.
