# Chapter 2 Descriptive Statistics

Exploratory data analysis, or descriptive statistics, deals with organizing and describing data. It brings together a range of tools that help in understanding observed values. It is used, for example, to assess where observations are located, how they are spread, and how variables are associated with one another.

This chapter presents concepts and methods of data exploration, a fundamental step before more advanced statistical analysis. For further discussion we recommend Tukey (1977), a milestone in exploratory data analysis.

After reading this chapter, the reader should be able to interpret the following example, adapted from Waring et al. (2022) at the suggestion of João Brito. More details are available (in Portuguese) on Wiki R.

# Load packages
library(skimr)
library(tidyverse)

data(starwars)

# An alternative to summary()
skimr::skim(starwars) # for HTML and docx output
| Data summary | Values |
|---|---|
| Name | starwars |
| Number of rows | 87 |
| Number of columns | 14 |
| Column type frequency: | |
| character | 8 |
| list | 3 |
| numeric | 3 |
| Group variables | None |

Variable type: character

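| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| name | 0 | 1.00 | 3 | 21 | 0 | 87 | 0 |
| hair_color | 5 | 0.94 | 4 | 13 | 0 | 12 | 0 |
| skin_color | 0 | 1.00 | 3 | 19 | 0 | 31 | 0 |
| eye_color | 0 | 1.00 | 3 | 13 | 0 | 15 | 0 |
| sex | 4 | 0.95 | 4 | 14 | 0 | 4 | 0 |
| gender | 4 | 0.95 | 8 | 9 | 0 | 2 | 0 |
| homeworld | 10 | 0.89 | 4 | 14 | 0 | 48 | 0 |
| species | 4 | 0.95 | 3 | 14 | 0 | 37 | 0 |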

Variable type: list

| skim_variable | n_missing | complete_rate | n_unique | min_length | max_length |
|---|---|---|---|---|---|
| films | 0 | 1 | 24 | 1 | 7 |
| vehicles | 0 | 1 | 11 | 0 | 2 |
| starships | 0 | 1 | 17 | 0 | 5 |

Variable type: numeric

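| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| height | 6 | 0.93 | 174.36 | 34.77 | 66 | 167.0 | 180 | 191.0 | 264 | ▁▁▇▅▁ |
| mass | 28 | 0.68 | 97.31 | 169.46 | 15 | 55.6 | 79 | 84.5 | 1358 | ▇▁▁▁▁ |
| birth_year | 44 | 0.49 | 87.57 | 154.69 | 8 | 35.0 | 52 | 72.0 | 896 | ▇▁▁▁▁ |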
# skimr::skim_without_charts(starwars) # for PDF output (omits the inline histograms)
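
To help read the numeric part of the summary, the sketch below recomputes the same columns with base R functions. The helper `numeric_summary()` is a hypothetical name introduced here for illustration and is not part of skimr. It assumes that the dplyr package (which provides `starwars`) is installed and that the default quantile method of `quantile()` is acceptable. Exact figures may depend on the installed version of dplyr.

```r
# Hypothetical helper (not part of skimr): recomputes, with base R only,
# the numeric columns that appear in the summary above.
numeric_summary <- function(x) {
  # Quantiles at 0%, 25%, 50%, 75% and 100%, ignoring missing values
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                na.rm = TRUE, names = FALSE)
  data.frame(
    n_missing     = sum(is.na(x)),      # number of NA values
    complete_rate = mean(!is.na(x)),    # proportion of non-missing values
    mean          = mean(x, na.rm = TRUE),
    sd            = sd(x, na.rm = TRUE),
    p0 = q[1], p25 = q[2], p50 = q[3], p75 = q[4], p100 = q[5]
  )
}

# Apply it to the mass column of the starwars data (requires dplyr)
mass_summary <- numeric_summary(dplyr::starwars$mass)
print(mass_summary)

# Is the mean above the median (p50)?
mass_summary$mean > mass_summary$p50
```

In the table above, the mean of `mass` (97.31) is well above its median (79), and the maximum (1358) is far from the third quartile (84.5). This pattern suggests that a few very large values pull the mean upward, which is why the median is often the more informative measure of location for such a variable.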

### References

Tukey, John W. 1977. Exploratory Data Analysis. Addison-Wesley Publishing Company.
Waring, Elin, Michael Quinn, Amelia McNamara, Eduardo Arino de la Rubia, Hao Zhu, and Shannon Ellis. 2022. Skimr: Compact and Flexible Summaries of Data. https://CRAN.R-project.org/package=skimr.