## 6.1 Point estimation

In point estimation a statistic, calculated from an estimator, is used as a (point) estimate of a certain parameter, according to Definitions 6.1 and 6.2. In other words, a single sample value (a point) is used to estimate $$\theta$$; the estimate is symbolized by $$\hat{\theta}$$ and read as "theta hat". From the perspective of Decision Theory, an estimator is called a decision rule (Berger 1985).

Definition 6.1 An estimator $$\hat{\theta}(\boldsymbol{X}) \equiv \hat{\theta}$$ is a function of the sample $$\boldsymbol{X}$$ that aims to infer about a parameter $$\theta$$.

Definition 6.2 An estimate is a particular value obtained by applying sample data to an estimator.

Example 6.1 The sample mean $$\bar{x}$$ given by Eq. (2.9) is a point estimator for the universal mean $$\mu$$ (Eq. (2.8)).

### 6.1.1 Unbiased estimators

Definition 6.3 An estimator is said to be unbiased according to a sampling plan $$\lambda$$ if

$\begin{equation} E_\lambda \left[ \hat{\theta} \right] = \theta. \tag{6.1} \end{equation}$

#### Sample mean $$\bar{x}$$

The sample mean of Eq. (2.9) is an unbiased estimator of the universal mean $$\mu$$ under the SRS sampling plan, with or without replacement. This occurs because expectation is linear, so dependence between the observations does not affect the result.
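The claim that dependence between observations does not matter can be checked numerically by enumerating every ordered sample drawn without replacement. The three-element universe below is a hypothetical toy example, not taken from the text:

```r
# Hypothetical universe of N = 3 values (illustration only)
X <- c(1, 2, 6)
mu <- mean(X)  # universal mean: 3

# All ordered samples of size n = 2 WITHOUT replacement (observations dependent)
g <- expand.grid(i = 1:3, j = 1:3)
g <- g[g$i != g$j, ]
xbar <- (X[g$i] + X[g$j]) / 2

# The average of the sample means equals mu despite the dependence
mean(xbar)
## [1] 3
```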

Example 6.2 Let $$X_1, X_2, \ldots, X_n$$ be independent and identically distributed (iid) random variables with $$E(X_i)=\mu$$ under an SRSwi-type sampling plan, where for simplicity the equivalence $$E_{SRSwi} \equiv E$$ will be considered.

$\begin{eqnarray} E\left[\bar{X}\right] &=& E\left[\frac{1}{n} \sum_{i=1}^{n} X_i \right] \\ &=& \frac{1}{n} E\left[\sum_{i=1}^{n} X_i \right] \\ &=& \frac{1}{n} \sum_{i=1}^{n} E\left[X_i \right] \\ &=& \frac{1}{n} \sum_{i=1}^{n} \mu \\ &=& \frac{1}{n} n\mu \\ E\left[\bar{X}\right] &=& \mu. \tag{6.2} \end{eqnarray}$

Example 6.3 The universal mean of the variable age in Example 4.4 is given by $\mu = \frac{24+32+49}{3} = \frac{105}{3} = 35.$ From Example 4.19 it can be seen that the average (expectation) of the sample means under the SRSwi plan is equal to $$\mu$$, i.e., $E\left[h(\boldsymbol{X})\right] = E\left[\bar{X}\right] = \frac{24.0+28.0+36.5+28.0+32.0+40.5+36.5+40.5+49.0}{9}=\frac{315}{9}=35.$

```r
X <- c(24, 32, 49)
mean(X)
## [1] 35
```

From Example 4.22 we have the vector `mxc <- c(24.0, 28.0, 36.5, 28.0, 32.0, 40.5, 36.5, 40.5, 49.0)`.

```r
mxc <- c(24.0, 28.0, 36.5, 28.0, 32.0, 40.5, 36.5, 40.5, 49.0)
mean(mxc)
## [1] 35
```
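Assuming Example 4.22 enumerated all $$3^2 = 9$$ ordered size-2 samples with replacement, the vector of sample means can also be reconstructed rather than typed by hand (a sketch; the `expand.grid` ordering happens to match the values listed above):

```r
X <- c(24, 32, 49)

# All 3^2 = 9 ordered samples of size 2 with replacement
g <- expand.grid(x1 = X, x2 = X)
mxc <- as.numeric(rowMeans(g))
mxc
## [1] 24.0 28.0 36.5 28.0 32.0 40.5 36.5 40.5 49.0
mean(mxc)
## [1] 35
```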

Exercise 6.1 Check in the SRSwo sampling plan of Example 4.20 that $$E\left[\bar{X}\right] = \mu$$.

#### Sample proportion $$p$$

The sample proportion is an unbiased estimator of the universal proportion $$\pi$$ (Eq. (4.1)) under the SRS sampling plan, with or without replacement. This estimator can be defined by \begin{align*} p = \frac{\sum_{i=1}^n x_i}{n}, \tag{6.3} \end{align*} where each $$x_i \in \{0,1\}$$ indicates the absence or presence of the characteristic of interest.

Example 6.4 (Point estimate of the proportion) Suppose you want to calculate the point estimate for the ‘proportion of PUCRS smokers’, denoted by $$\pi$$. The characteristic of interest, or success, is that the interviewee is a ‘smoker’, for which $$x=1$$ is associated; in this way, failure is the interviewee being a ‘non-smoker’, for which $$x=0$$ is associated. In a sample of $$n = 125$$ university attendees, $$\sum_{i=1}^n x_i = 25$$ smokers were observed. The point estimate of $$\pi$$ is given by $\hat{\pi} = \dfrac{25}{125} = 0.2 = 20\%.$
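Since each observation is coded 0 or 1, the point estimate in Eq. (6.3) is just the sample mean of the indicator values. A minimal sketch of Example 6.4 in R:

```r
# 25 smokers (x = 1) and 100 non-smokers (x = 0), so n = 125
x <- c(rep(1, 25), rep(0, 100))

# Eq. (6.3): sum of indicators over n, equivalently mean(x)
p <- sum(x) / length(x)
p
## [1] 0.2
```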

#### Sample variance $$s^2$$

The sample variance is an unbiased estimator of the universal variance $$\sigma^2$$ under the SRSwi sampling plan, i.e., with replacement.

Example 6.5 Let $$X_1, X_2, \ldots, X_n$$ be independent random variables with $$E(X_i)=\mu$$ and $$V(X_i)=\sigma^2$$, so that $$E(X_i^2)=\sigma^2+\mu^2$$ and $$E(\bar{X}^2)=\frac{\sigma^2}{n}+\mu^2$$, and consider a sampling plan of the SRSwi type, where for simplicity the equivalence $$E_{SRSwi} \equiv E$$ will be considered. The expression for $$E(\bar{X}^2)$$ follows from the definition of variance: $$E(\bar{X}^2)=V(\bar{X})+\left[E(\bar{X})\right]^2=\frac{\sigma^2}{n}+\mu^2$$.

$\begin{eqnarray} E\left[S^2\right] &=& E\left[\frac{1}{n-1} \sum_{i=1}^{n} (X_{i}-\bar{X})^2 \right] \\ &=& \frac{1}{n-1} E\left[\sum_{i=1}^{n} X_{i}^2 - 2 \bar{X} \sum_{i=1}^{n} X_{i} + n\bar{X}^2 \right] \\ &=& \frac{1}{n-1} E\left[\sum_{i=1}^{n} X_{i}^2 - n\bar{X}^2 \right] \\ &=& \frac{1}{n-1} \left[\sum_{i=1}^{n} E\left[X_{i}^2\right] - nE\left[\bar{X}^2\right] \right] \\ &=& \frac{1}{n-1} \left[n\sigma^2 + n\mu^2 - \sigma^2 - n\mu^2\right] \\ &=& \frac{(n-1)\sigma^2}{n-1} \\ E\left[S^2\right] &=& \sigma^2, \tag{6.4} \end{eqnarray}$ where the third line uses $$\sum_{i=1}^{n} X_{i} = n\bar{X}$$, so that $$-2\bar{X}\sum_{i=1}^{n} X_{i} + n\bar{X}^2 = -n\bar{X}^2$$.
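The result can also be verified exactly for the age universe of Example 6.3 by enumerating all nine ordered size-2 samples with replacement and averaging their sample variances (a sketch; here $$\sigma^2$$ is the universal variance with denominator $$N = 3$$):

```r
X <- c(24, 32, 49)
sigma2 <- mean((X - mean(X))^2)  # universal variance, 326/3

# All 9 ordered size-2 samples with replacement and their s^2 (denominator n-1)
g <- expand.grid(x1 = X, x2 = X)
s2 <- apply(g, 1, var)

# The expectation of S^2 over all samples equals sigma2
mean(s2)
## [1] 108.6667
```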

Exercise 6.2 Check in the SRSwi sampling plan of Example 4.19 whether $$E_{SRSwi}\left[S^2\right] = \sigma^2$$.

Exercise 6.3 Check in the SRSwo sampling plan of Example 4.20 whether $$E_{SRSwo}\left[S^2\right] = \sigma^2$$.

#### Median

David and Ghosh (1985) show that the median according to Eq. (2.17) is the most bias-resistant estimator in the class of L-statistics with non-negative coefficients that sum to one, for a class of distributions that includes the normal, double exponential and logistic.

### 6.1.2 Maximum likelihood estimators

A maximum likelihood estimator estimates $$\theta$$ by $$\hat{\theta}$$, the value that maximizes the likelihood function according to Definition 5.3. The maximum likelihood method was first used by Johann Heinrich Lambert and Daniel Bernoulli in the mid-1760s, and was detailed by Ronald Fisher in the early 1920s.

Example 6.6 Adapted from Casella and Berger (2002). Let $$X_1, \ldots, X_n$$ be a (conditionally) iid sequence $$\mathcal{Ber}(\theta) \equiv \mathcal{B}(1,\theta)$$. The likelihood function is $\begin{eqnarray} L(\theta|x) &=& \prod_{i=1}^n {1 \choose x_i} \theta^{x_i} (1-\theta)^{1-x_i} \nonumber \\ &=& \theta^{s} (1-\theta)^{n - s}, \end{eqnarray}$ where $$s=\sum_{i=1}^{n} x_i$$. Taking the natural logarithm of $$L(\theta|x)$$, the properties of logarithms give $\begin{eqnarray} l(\theta|x) &=& \log(\theta^{s} (1-\theta)^{n - s}) \nonumber \\ &=& s \log(\theta) + (n-s)\log(1-\theta). \end{eqnarray}$ Using principles of Calculus it is possible to differentiate $$l(\theta|x)$$ with respect to $$\theta$$ and equate the derivative to zero, from which we obtain the maximum likelihood estimate $\begin{eqnarray} \frac{s}{\hat{\theta}} - \frac{n-s}{1-\hat{\theta}} = 0 \;\; \therefore \;\; \hat{\theta} = \frac{s}{n}. \end{eqnarray}$
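The closed form $$\hat{\theta} = s/n$$ can be checked numerically by maximizing the log-likelihood $$l(\theta|x)$$ directly. The sketch below borrows the counts of Example 6.4 ($$s = 25$$, $$n = 125$$) and uses base R's `optimize()`:

```r
n <- 125
s <- 25  # counts borrowed from Example 6.4

# Bernoulli log-likelihood: l(theta | x) = s log(theta) + (n - s) log(1 - theta)
loglik <- function(theta) s * log(theta) + (n - s) * log(1 - theta)

# Numerical maximization over (0, 1)
opt <- optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)
opt$maximum  # approximately s/n = 0.2, up to the optimizer's tolerance
```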

Exercise 6.4 Consider the information in Example 6.6.
a. Show, from the definition, that $$L(\theta | x) = \theta^{s} (1-\theta)^{n-s}$$, $$s=\sum_{i=1}^{n} x_i$$.
b. Show, applying the principles of Calculus, that $$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} x_i$$.


### References

Barnett, Vic. 1999. Comparative Statistical Inference. John Wiley & Sons. https://onlinelibrary.wiley.com/doi/book/10.1002/9780470316955.
Berger, James O. 1985. Statistical Decision Theory and Bayesian Analysis. 2nd ed. Springer Science & Business Media. https://www.springer.com/gp/book/9780387960982.
Casella, George, and Roger L Berger. 2002. Statistical Inference. Duxbury - Thompson Learning.
David, H. A., and J. K. Ghosh. 1985. “The Effect of an Outlier on L-Estimators of Location in Symmetric Distributions.” Biometrika 72 (1): 216–18. https://www.jstor.org/stable/2336355.