The Devil is in the Data

The Devil is in the Data is a blog by Dr Peter Prevos about practical and fun data science in the R language. This website goes beyond the hype, given that 99.9% of problems can be solved without machine learning. Good data science is:

  • Useful: Increases value.
  • Sound: Valid, reliable and reproducible.
  • Aesthetic: Easy to understand.

Proudly associated with R Bloggers

Top 40 R Programming Blogs to follow in 2020

Top 20 Programming Blogs

A Brief Guide to Providing Insights as a Service
The Insights as a Service model helps data scientists create value with data using the principles of marketing services.
Project Euler 35: Circular Primes below One Million
Project Euler 35: How many circular primes are there below one million? Solutions in the R language for statistical computing.
Mapping the Ancient World: A Digital Odyssey through Ptolemy's Geography
Creating a database of the locations in Ptolemy's Geography and plotting them on a modern map with ggplot2.
Analysing the Customer Experience
This chapter of Data Science for Water Utilities describes the principles of analysing the customer experience using the R language.
Basic Linear Regression
This chapter of Data Science for Water Utilities discusses how to undertake basic linear regression as a foundation for statistical modelling.
Basics of the R Language
This chapter of the Data Science for Water Utilities book introduces the basic principles of the R language with a case study about flow in open channels.
Clustering Customers to Define Segments
This Data Science for Water Utilities chapter implements cluster analysis to segment customers using hierarchical clustering and k-means.
Descriptive Statistics in Water Quality
This chapter of Data Science for Water Utilities teaches how to generate various types of descriptive statistics and perform grouped analysis with R.