Chapter 1 Introduction

The object of statistical methods is the reduction of data. (Ronald A. Fisher 1922, 309)

Statistics brings together theoretical and practical tools to analyze quantitative information, measure uncertainty, and support decision making. It is a component of the scientific method and can be divided according to the scheme in Figure 1.1. This course covers the fundamentals of exploratory data analysis, probability, sampling, inference from the classical and Bayesian perspectives, and linear models.

Figure 1.1: A possible division of Statistics

Exercise 1.1 See The Cartoon Guide to Statistics (Gonick and Smith 1993). Tip from Adilson Medronha on 2024-03-15.

CRISP-DM

The workflow of a data project is typically described in a few basic steps.

  1. Understanding the problem.
  2. Understanding and preparing data.
  3. Modeling and evaluation.
  4. Implementation and communication.

CRISP-DM (Cross-Industry Standard Process for Data Mining) is an open standard that formalizes these steps (IBM 2011).
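
To make the four steps listed above concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not part of CRISP-DM itself: the function names `prepare`, `model` and `communicate` are hypothetical, the data values are made up, and the "model" is simply a sample mean with its standard error.

```python
import math
import statistics


def prepare(raw):
    """Step 2: keep only usable numeric observations (drops None and NaN)."""
    return [float(x) for x in raw if x is not None and not math.isnan(float(x))]


def model(values):
    """Step 3: estimate the mean and evaluate it with its standard error."""
    mean = statistics.mean(values)
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    return {"n": len(values), "mean": mean, "std_error": std_error}


def communicate(question, result):
    """Step 4: turn the numeric result into a one-line report."""
    return (f"{question} n={result['n']}, "
            f"mean={result['mean']:.2f}, SE={result['std_error']:.2f}")


# Step 1: state the problem as a question the data can answer.
question = "What is the average measurement?"

# Made-up raw data with two unusable entries (hypothetical values).
raw_data = [4.0, 5.0, None, 6.0, float("nan"), 5.0]

clean = prepare(raw_data)
result = model(clean)
report = communicate(question, result)
print(report)
```

In a real project each step is far richer, and the steps are usually revisited several times rather than run once in sequence.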

References

Fisher, Ronald A. 1922. “On the Mathematical Foundations of Theoretical Statistics.” Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 222 (594–604): 309–68. https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.1922.0009.
Gonick, Larry, and Woollcott Smith. 1993. The Cartoon Guide to Statistics. New York: Harper Resource. https://archive.org/details/TheCartoonGuideToStatistics/page/n3/mode/2up.
IBM. 2011. IBM SPSS Modeler CRISP-DM Guide. https://inseaddataanalytics.github.io/INSEADAnalytics/CRISP_DM.pdf.