9.3 Formulas
Building on Wilkinson and Rogers (1973), Chambers and Hastie (1993) present the formula notation available in the R language.
i | Expression | Meaning |
---|---|---|
1 | \(T \sim F\) | \(T\) is modeled by \(F\) |
2 | \(F_a + F_b\) | Includes \(F_a\) and \(F_b\) |
3 | \(F_a - F_b\) | Includes everything in \(F_a\) except what is in \(F_b\) |
4 | \(F_a * F_b\) | \(F_a + F_b + F_a:F_b\) |
5 | \(F_a / F_b\) | \(F_a + F_b\) %in% \(F_a\) |
6 | \(F_a : F_b\) or \(F_b\) %in% \(F_a\) | The factor indexed jointly by \(F_a\) and \(F_b\) |
7 | \(F^m\) | All terms of \(F\) crossed up to order \(m\) |
8 | \(T \sim \; .\) | \(T\) is modeled by all other variables in the data (everything except \(T\)) |
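One way to check how R expands the operators in the table is to inspect the term labels that `stats::terms()` derives from a formula. The sketch below is only an illustration: the variable names `y`, `a`, `b`, `c` are placeholders, and `expand_terms()` is a hypothetical helper, not an R function. No data is required, because `terms()` works on the formula symbolically.

```r
# Hypothetical helper (not part of R): list the terms a formula expands to.
expand_terms <- function(f) attr(terms(f), "term.labels")

crossed <- expand_terms(y ~ a * b)          # row 4: a + b + a:b
nested  <- expand_terms(y ~ a / b)          # row 5: a plus b within a
removed <- expand_terms(y ~ a + b - b)      # row 3: b is added, then removed
order2  <- expand_terms(y ~ (a + b + c)^2)  # row 7: crossing up to order 2

print(crossed)
print(nested)
print(removed)
print(order2)
```

Row 8 (`T ~ .`) is not covered here because the dot can only be expanded when a data frame is supplied.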
Example 9.1 Formula \(T \sim F\) with the cars dataset.
##
## Call:
## lm(formula = dist ~ speed, data = cars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -29.069 -9.525 -2.272 9.215 43.201
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.5791 6.7584 -2.601 0.0123 *
## speed 3.9324 0.4155 9.464 1.49e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
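The `Call:` line in the output above suggests that it comes from code along the following lines. The object name `fit_cars` is an assumption made here for illustration.

```r
# Model stopping distance (dist) as a linear function of speed,
# using the built-in `cars` data frame.
fit_cars <- lm(dist ~ speed, data = cars)

# Print the coefficient table, residual summary and fit statistics.
summary(fit_cars)
```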
Example 9.2 Formula \(F_a + F_b\) with the airquality dataset.
##
## Call:
## lm(formula = Temp ~ Ozone + Solar.R + Wind + as.factor(Month),
## data = airquality)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.9220 -3.0386 0.0148 3.0856 12.0292
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 64.247098 2.722613 23.598 < 2e-16 ***
## Ozone 0.121180 0.022020 5.503 2.74e-07 ***
## Solar.R 0.011901 0.006044 1.969 0.0517 .
## Wind -0.250226 0.183341 -1.365 0.1753
## as.factor(Month)6 11.261885 2.069698 5.441 3.59e-07 ***
## as.factor(Month)7 12.031054 1.613653 7.456 2.90e-11 ***
## as.factor(Month)8 12.335145 1.680223 7.341 5.08e-11 ***
## as.factor(Month)9 9.358031 1.473927 6.349 5.93e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.268 on 103 degrees of freedom
## (42 observations deleted due to missingness)
## Multiple R-squared: 0.7139, Adjusted R-squared: 0.6944
## F-statistic: 36.71 on 7 and 103 DF, p-value: < 2.2e-16
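Judging by the `Call:` line, the output above can be reproduced roughly as follows. The object name `fit_add` is an assumption. `Month` is wrapped in `as.factor()` so that each month gets its own coefficient instead of a single numeric slope.

```r
# Additive model: each `+` adds one term to the right-hand side.
fit_add <- lm(Temp ~ Ozone + Solar.R + Wind + as.factor(Month),
              data = airquality)

# Rows with missing values are dropped by lm() by default,
# which is why the summary reports deleted observations.
summary(fit_add)
```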
Example 9.3 Formulas \(F_a \pm F_b\) and \(T \sim \; .\) with the airquality dataset.
##
## Call:
## lm(formula = Temp ~ . - Month + as.factor(Month) - Day, data = airquality)
##
## Residuals:
## Min 1Q Median 3Q Max
## -17.9220 -3.0386 0.0148 3.0856 12.0292
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 64.247098 2.722613 23.598 < 2e-16 ***
## Ozone 0.121180 0.022020 5.503 2.74e-07 ***
## Solar.R 0.011901 0.006044 1.969 0.0517 .
## Wind -0.250226 0.183341 -1.365 0.1753
## as.factor(Month)6 11.261885 2.069698 5.441 3.59e-07 ***
## as.factor(Month)7 12.031054 1.613653 7.456 2.90e-11 ***
## as.factor(Month)8 12.335145 1.680223 7.341 5.08e-11 ***
## as.factor(Month)9 9.358031 1.473927 6.349 5.93e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.268 on 103 degrees of freedom
## (42 observations deleted due to missingness)
## Multiple R-squared: 0.7139, Adjusted R-squared: 0.6944
## F-statistic: 36.71 on 7 and 103 DF, p-value: < 2.2e-16
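The coefficients above are identical to those of Example 9.2, because the dot expands to every column except `Temp`, and the `-` operator then removes `Month` and `Day`. A minimal sketch of that comparison, with assumed object names `fit_add` and `fit_dot`:

```r
# Explicit additive formula (same call as in Example 9.2).
fit_add <- lm(Temp ~ Ozone + Solar.R + Wind + as.factor(Month),
              data = airquality)

# Dot formula: start from all columns, drop Month and Day,
# and add Month back as a factor.
fit_dot <- lm(Temp ~ . - Month + as.factor(Month) - Day,
              data = airquality)

# TRUE when both formulas produce the same coefficient vector.
same_fit <- isTRUE(all.equal(coef(fit_add), coef(fit_dot)))
print(same_fit)
```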
Example 9.4 Formula \(F_a * F_b\) with the airquality dataset.
##
## Call:
## lm(formula = Temp ~ Ozone * as.factor(Month), data = airquality)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.8133 -3.1431 0.3708 2.8843 11.4275
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 62.88422 1.43129 43.935 < 2e-16 ***
## Ozone 0.16288 0.04454 3.657 0.000399 ***
## as.factor(Month)6 6.86610 3.57469 1.921 0.057450 .
## as.factor(Month)7 15.00550 2.53227 5.926 3.92e-08 ***
## as.factor(Month)8 15.05456 2.28654 6.584 1.80e-09 ***
## as.factor(Month)9 4.83879 2.09236 2.313 0.022676 *
## Ozone:as.factor(Month)6 0.12484 0.10593 1.179 0.241209
## Ozone:as.factor(Month)7 -0.06147 0.05443 -1.129 0.261308
## Ozone:as.factor(Month)8 -0.06244 0.05105 -1.223 0.224012
## Ozone:as.factor(Month)9 0.12882 0.05903 2.182 0.031309 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.949 on 106 degrees of freedom
## (37 observations deleted due to missingness)
## Multiple R-squared: 0.749, Adjusted R-squared: 0.7277
## F-statistic: 35.15 on 9 and 106 DF, p-value: < 2.2e-16
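The output above lists main effects for `Ozone` and `as.factor(Month)` plus their interaction, which is what row 4 of the table describes. The sketch below fits the compact and the written-out form to check that they agree; the object names `fit_cross` and `fit_long` are assumptions.

```r
# Compact form: `*` expands to main effects plus interaction.
fit_cross <- lm(Temp ~ Ozone * as.factor(Month), data = airquality)

# Written-out form of the same model.
fit_long <- lm(Temp ~ Ozone + as.factor(Month) + Ozone:as.factor(Month),
               data = airquality)

summary(fit_cross)

# TRUE when both parameterizations give the same coefficients.
same_cross <- isTRUE(all.equal(coef(fit_cross), coef(fit_long)))
print(same_cross)
```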
Example 9.5 Formula \(F_a/F_b\) with the airquality dataset.
##
## Call:
## lm(formula = Temp ~ Ozone/as.factor(Month), data = airquality)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.6323 -2.9636 0.9709 4.0807 12.2975
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 69.890925 0.988360 70.714 < 2e-16 ***
## Ozone 0.002643 0.043547 0.061 0.951704
## Ozone:as.factor(Month)6 0.281518 0.070349 4.002 0.000114 ***
## Ozone:as.factor(Month)7 0.204859 0.042385 4.833 4.38e-06 ***
## Ozone:as.factor(Month)8 0.192246 0.042267 4.548 1.40e-05 ***
## Ozone:as.factor(Month)9 0.245123 0.047102 5.204 9.13e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.099 on 110 degrees of freedom
## (37 observations deleted due to missingness)
## Multiple R-squared: 0.6046, Adjusted R-squared: 0.5866
## F-statistic: 33.64 on 5 and 110 DF, p-value: < 2.2e-16
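Based on the `Call:` line, the output above can be reproduced roughly as follows; `fit_nested` is an assumed object name. The term labels show that `/` keeps `Ozone` and adds the `Ozone`-by-month term, with no separate main effect for month.

```r
# Nesting operator: Ozone plus Month within Ozone.
fit_nested <- lm(Temp ~ Ozone / as.factor(Month), data = airquality)

# Terms that the formula expands to.
nested_terms <- attr(terms(fit_nested), "term.labels")
print(nested_terms)

summary(fit_nested)
```

In this parameterization the `Ozone` coefficient is the slope for the baseline month (May), and each `Ozone:as.factor(Month)` coefficient is a difference from that slope.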
Example 9.6 Formula \(F_a:F_b\) with the airquality dataset.
##
## Call:
## lm(formula = Temp ~ Ozone:as.factor(Month), data = airquality)
##
## Residuals:
## Min 1Q Median 3Q Max
## -21.6323 -2.9636 0.9709 4.0807 12.2975
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 69.890925 0.988360 70.714 < 2e-16 ***
## Ozone:as.factor(Month)5 0.002643 0.043547 0.061 0.952
## Ozone:as.factor(Month)6 0.284161 0.064693 4.392 2.59e-05 ***
## Ozone:as.factor(Month)7 0.207503 0.022200 9.347 1.23e-15 ***
## Ozone:as.factor(Month)8 0.194889 0.020360 9.572 3.74e-16 ***
## Ozone:as.factor(Month)9 0.247766 0.035040 7.071 1.50e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.099 on 110 degrees of freedom
## (37 observations deleted due to missingness)
## Multiple R-squared: 0.6046, Adjusted R-squared: 0.5866
## F-statistic: 33.64 on 5 and 110 DF, p-value: < 2.2e-16
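The residuals and fit statistics above match those of Example 9.5: both formulas describe the same fitted model, and only the coding of the slopes differs. Here each coefficient is the `Ozone` slope for one month, whereas `/` reports the May slope plus differences. A minimal check of that reading, with assumed object names:

```r
# Same model, two parameterizations.
fit_nested <- lm(Temp ~ Ozone / as.factor(Month), data = airquality)
fit_inter  <- lm(Temp ~ Ozone:as.factor(Month), data = airquality)

# Fitted values agree, so the two fits are the same model.
same_fitted <- isTRUE(all.equal(fitted(fit_nested), fitted(fit_inter)))
print(same_fitted)

# June slope rebuilt from the nested fit: baseline slope + June difference.
slope_june <- coef(fit_nested)[["Ozone"]] +
  coef(fit_nested)[["Ozone:as.factor(Month)6"]]
print(slope_june)
```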
Example 9.7 Formula \(F^m\) with the airquality dataset.
##
## Call:
## lm(formula = Temp ~ Ozone^3, data = airquality)
##
## Residuals:
## Min 1Q Median 3Q Max
## -22.147 -4.858 1.828 4.342 12.328
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 69.41072 1.02971 67.41 <2e-16 ***
## Ozone 0.20081 0.01928 10.42 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.819 on 114 degrees of freedom
## (37 observations deleted due to missingness)
## Multiple R-squared: 0.4877, Adjusted R-squared: 0.4832
## F-statistic: 108.5 on 1 and 114 DF, p-value: < 2.2e-16
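Note that the output above contains only a linear `Ozone` term. Inside a formula, `^` means crossing up to a given order, not arithmetic power, so `Ozone^3` on a single variable reduces to `Ozone`. The sketch below compares it with the plain linear fit and, as a contrast, uses `I()` to request an actual cubic term; the object names are assumptions.

```r
# `^` as a formula operator: crossing a single term with itself adds nothing.
fit_pow <- lm(Temp ~ Ozone^3, data = airquality)
fit_lin <- lm(Temp ~ Ozone, data = airquality)

# TRUE: both formulas lead to the same coefficients.
same_as_linear <- isTRUE(all.equal(coef(fit_pow), coef(fit_lin)))
print(same_as_linear)

# I() protects the expression, so `^` is evaluated arithmetically.
fit_cube <- lm(Temp ~ I(Ozone^3), data = airquality)
print(names(coef(fit_cube)))
```

With several variables, for instance `(a + b + c)^2`, the operator produces all main effects and all two-way interactions.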
Exercise 9.1 Read the following documentation pages.
stats::formula
base::tilde
base::I
stats::offset
References
Chambers, John M., and Trevor J. Hastie. 1993. Statistical Models in S. London: Chapman & Hall.
Wilkinson, G. N., and C. E. Rogers. 1973. “Symbolic Description of Factorial Models for Analysis of Variance.” Journal of the Royal Statistical Society Series C: Applied Statistics 22 (3): 392–99. https://www.jstor.org/stable/2346786.