The Devil is in the Data

The Devil is in the Data is a blog about practical and fun data science in the R language by Dr Peter Prevos. This website goes beyond the hype, given that 99.9% of problems can be solved without machine learning. Good data science is:

  • Useful: Increases value.
  • Sound: Valid, reliable and reproducible.
  • Aesthetic: Easy to understand.

Proudly associated with R Bloggers

Top 40 R Programming Blogs to follow in 2020

Top 20 Programming Blogs


Detecting Outliers and Anomalies in Time Series Data
This chapter of Data Science for Utilities discusses detecting outliers and anomalies in time series data, including leak detection.
Introduction to Machine Learning
This chapter of Data Science for Utilities is an introduction to machine learning using multiple linear regression and decision trees.
Introduction to R for Utilities
This chapter of Data Science for Utilities introduces the basic principles of using R for water utilities and offers advice on how to learn to code.
Loading and Exploring Water Quality Data from Spreadsheets
This chapter of Data Science for Water Utilities teaches how to load and explore water quality data in CSV files and spreadsheets.
Managing and Cleaning Dirty Data
This chapter of Data Science for Utilities describes how to manage dirty data and clean it with a script that uses Tidyverse functions.
Sharing the Results of Data Analysis with R Markdown
This chapter of Data Science for Utilities discusses sharing the results of data analysis using R Markdown to create a PowerPoint presentation.
Visualising Data with ggplot2
This chapter of Data Science for Utilities teaches visualising data with ggplot2 using a water quality case study.
Working with Dates and Times in R
This chapter of Data Science for Utilities teaches how to work with dates and times in R and analyses digital metering data.