Exploratory Data Analysis

Core themes

“Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise.” — John Tukey

  • Generate questions about your data.

  • Search for answers by visualizing, transforming, and modelling your data.

  • Use what you learn to refine your questions and/or generate new questions.

Exploratory data analysis (EDA) is the critical process of making initial investigations of a dataset in order to discover patterns, spot anomalies, test hypotheses, and check assumptions, using summary statistics and graphical representations.
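
As a minimal sketch of what "summary statistics" and "spotting anomalies" can look like in practice, the Python snippet below summarizes a small sample and flags values that fall outside the commonly used 1.5 × IQR fences. The sample data, the function names (`summarize`, `flag_outliers`), and the choice of the IQR rule are illustrative assumptions, not something prescribed by the material above.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "stdev": statistics.stdev(values),
    }


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the quartiles."""
    stats = summarize(values)
    iqr = stats["q3"] - stats["q1"]
    low = stats["q1"] - k * iqr
    high = stats["q3"] + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sample: nine plausible values and one suspicious one.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 20, 95]

print(summarize(sample))
print(flag_outliers(sample))  # → [95]
```

A flagged value is a prompt for a question (data-entry error? a genuinely unusual case?), not an automatic reason to delete it.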

Exploratory data analysis is structured curiosity

  • Variation and co-variation: What patterns emerge?

    • What varies? Why?
  • What’s surprising? Why?

  • What might the data miss?

    • What might be wrong? Why?
  • Why does it matter who collects and interprets the data?
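
The first question above, variation and co-variation, can be sketched numerically: variance describes how much a single variable varies, and the Pearson correlation describes how strongly two variables vary together in a linear way. The snippet below is an illustration under assumed, made-up data (hypothetical hours studied and exam scores); the `pearson` helper is written out by hand here rather than taken from any particular library.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = statistics.mean(xs), statistics.mean(ys)
    # Sum of products of deviations (co-variation) and sums of squares.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for this example only.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

print("variance of hours:", statistics.variance(hours))
print("variance of scores:", statistics.variance(scores))
print("correlation:", round(pearson(hours, scores), 3))
```

A single correlation number only captures linear co-variation, which is one reason the questions above pair it with plots and with asking what the data might miss.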

EDA is…

  • Iterative, not linear

  • About asking questions, and using data to identify new questions

  • About learning the data, not producing results


Learn more: What is EDA?