## Bootstrap in Finite Population Sampling

For large samples, the empirical distribution function converges uniformly, with probability one, to the underlying true distribution function, as proved by Loeve (1977). Efron (1982) draws on this result to argue that simulation-based histograms, calculated from SRSWRs (simple random samples with replacement) taken from the observed sample values, lie close to the unknowable underlying probability distribution. Such SRSWRs, drawn from an observed SRSWR that itself came from a theoretical probability distribution, are called ‘bootstrap’ samples. If the bootstrap samples are sufficiently numerous, useful inferences can be drawn by studying them.
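The resampling step described above can be sketched in Python as follows. This is a minimal illustration, not the procedure of any cited author: the function name `bootstrap_replicates`, the observed values, the number of replicates, and the choice of the sample mean as the statistic are all assumptions made for the example.

```python
import numpy as np

def bootstrap_replicates(sample, statistic, n_boot=2000, seed=0):
    """Draw n_boot SRSWR resamples of the observed sample and
    return the statistic evaluated on each resample."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Each row of idx is one bootstrap sample: n indices drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    return np.array([statistic(sample[row]) for row in idx])

# Hypothetical observed sample values (illustrative only).
observed = np.array([12.0, 15.5, 9.8, 14.1, 11.3, 16.7, 10.2, 13.9])
reps = bootstrap_replicates(observed, np.mean)

# Spread of the replicates: a bootstrap standard error of the mean.
boot_se = reps.std(ddof=1)
# Percentile interval read off the histogram of replicates (95% here).
ci_low, ci_high = np.percentile(reps, [2.5, 97.5])
```

A histogram of `reps` plays the role of the simulation-based histogram mentioned in the text; the more replicates are drawn, the smoother it becomes.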

Bootstrap samples may also be generated and put to suitable use when sampling finite populations, although no strong theoretical justification for them is known to have been established and many results remain heuristic.

Suppose $\theta_{j}=\sum_{i=1}^{N} \xi_{j i}$, $j=1, \cdots, K$, are finite population totals of $K$ real variables $\xi_{j}$ $(j=1, \cdots, K)$ with values $\xi_{j i}$ for the units $i$ of a finite population $U=(1, \cdots, i, \cdots, N)$. Let $\underline{\theta}=\left(\theta_{1}, \cdots, \theta_{j}, \cdots, \theta_{K}\right)$ and let $g(\underline{\theta})$ be a nonlinear but well-behaved function of $\underline{\theta}$. Our interest is to derive a suitable point estimator for $g(\underline{\theta})$, together with an estimator of its Mean Square Error (MSE), and to construct a Confidence Interval (CI) for $g(\underline{\theta})$ with a suitably high confidence coefficient $100(1-\alpha) \%$, with $\alpha$ in $(0,1)$. As an alternative to (i) the linearization technique, (ii) IPNS, or (iii) the jack-knife, the ‘sub-sampling replication’ procedure called ‘bootstrap’ sampling is often found handy. The nonlinear functions $g(\underline{\theta})$ that we find interesting to illustrate are:
(1) The finite population correlation coefficient between $y$ and $x$, taken as
$$R_{N}=\frac{N \sum_{1}^{N} y_{i} x_{i}-\left(\sum_{1}^{N} y_{i}\right)\left(\sum_{1}^{N} x_{i}\right)}{\sqrt{N \sum_{1}^{N} y_{i}^{2}-\left(\sum_{1}^{N} y_{i}\right)^{2}} \sqrt{N \sum_{1}^{N} x_{i}^{2}-\left(\sum_{1}^{N} x_{i}\right)^{2}}}$$
with $\theta_{1}=N, \theta_{2}=\sum_{1}^{N} y_{i} x_{i}, \theta_{3}=\sum_{1}^{N} y_{i}, \theta_{4}=\sum_{1}^{N} x_{i}, \theta_{5}=\sum_{1}^{N} y_{i}^{2}$ and $\theta_{6}=\sum_{1}^{N} x_{i}^{2}$

(2) The regression coefficient of $y$ on $x$, taken as
$$B_{N}=\frac{N \sum_{1}^{N} y_{i} x_{i}-\left(\sum_{1}^{N} y_{i}\right)\left(\sum_{1}^{N} x_{i}\right)}{N \sum_{1}^{N} x_{i}^{2}-\left(\sum_{1}^{N} x_{i}\right)^{2}}$$
with $\theta_{1}=N, \theta_{2}=\sum_{1}^{N} y_{i} x_{i}, \theta_{3}=\sum_{1}^{N} y_{i}, \theta_{4}=\sum_{1}^{N} x_{i}$ and $\theta_{5}=\sum_{1}^{N} x_{i}^{2}$.

(3) A third one, with $\theta_{1}=Y, \theta_{2}=X, \theta_{3}=\sum_{1}^{N} y_{i} x_{i} Q_{i} \pi_{i}$ and $\theta_{4}=\sum_{1}^{N} x_{i}^{2} Q_{i} \pi_{i}$, for which
$$g(\underline{\theta})=\theta_{1}+\frac{\theta_{3}}{\theta_{4}}\left(\theta_{2}-X\right)=Y .$$
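To make the idea of a parameter as a function of totals concrete, the sketch below evaluates $R_{N}$ and $B_{N}$ from the totals alone. It is only an illustration under stated assumptions: the helper names (`totals`, `corr_from_totals`, `slope_from_totals`) and the small population are hypothetical, and a single six-element tuple of totals is shared by both functions for convenience, so its ordering follows example (1) and not the five-element list of example (2).

```python
import numpy as np

def totals(y, x):
    """Return (N, sum y*x, sum y, sum x, sum y^2, sum x^2) for a finite population."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return (y.size, np.sum(y * x), np.sum(y), np.sum(x), np.sum(y**2), np.sum(x**2))

def corr_from_totals(t):
    """Finite population correlation coefficient R_N as a function of the totals."""
    n, syx, sy, sx, syy, sxx = t
    return (n * syx - sy * sx) / (np.sqrt(n * syy - sy**2) * np.sqrt(n * sxx - sx**2))

def slope_from_totals(t):
    """Finite population regression coefficient B_N of y on x (sum y^2 is not needed)."""
    n, syx, sy, sx, _, sxx = t
    return (n * syx - sy * sx) / (n * sxx - sx**2)

# Hypothetical population values (illustrative only).
y_pop = np.array([3.1, 4.0, 5.2, 6.1, 6.8, 8.3])
x_pop = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

theta = totals(y_pop, x_pop)
r_n = corr_from_totals(theta)
b_n = slope_from_totals(theta)
```

In a survey setting each total would be replaced by its sample-based estimate, which is what makes the resulting statistic nonlinear and motivates replication methods for its MSE.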

## Balanced Repeated Replication

Balanced repeated replication is a sub-sample replication device for variance estimation. It is introduced through the problem of estimating a finite population mean, and the core of the procedure is stratified sampling. Its main use is in the estimation of nonlinear finite population parameters, such as correlation and regression coefficients, by nonlinear statistics. Chaudhuri and Stenger (2005) and Chaudhuri (2010, 2014) have also given useful accounts of this procedure.

In its simplest version, the population is supposed to be composed of a fairly large number $H$ of strata, from each of which an SRSWOR of size $n_{h}=2$, $h=1, \cdots, H$, is chosen to estimate unbiasedly the population mean $\bar{Y}=\sum_{h=1}^{H} W_{h} \bar{Y}_{h}$. In the usual notation, $W_{h}=\frac{N_{h}}{N}$ and $\bar{Y}_{h}$, $h=1, \cdots, H$, are the stratum proportions and means. The standard estimator for this is
$$\bar{y}_{st}=\sum_{h=1}^{H} W_{h} \bar{y}_{h}, \quad \bar{y}_{h}=\frac{1}{2}\left(y_{h1}+y_{h2}\right),$$
with variance
$$V\left(\bar{y}_{st}\right)=\sum_{h=1}^{H} W_{h}^{2} \frac{S_{h}^{2}}{n_{h}}, \quad S_{h}^{2}=\frac{1}{N_{h}-1} \sum_{i=1}^{N_{h}}\left(y_{hi}-\bar{Y}_{h}\right)^{2},$$
neglecting the quantity $\frac{n_{h}}{N_{h}}=\frac{2}{N_{h}}$ for every $h=1, \cdots, H$. The standard unbiased estimator for this variance is
$$\begin{aligned} \nu\left(\bar{y}_{st}\right) &=\sum_{h=1}^{H} W_{h}^{2} \frac{s_{h}^{2}}{n_{h}}, \text { writing } \\ s_{h}^{2} &=\frac{1}{n_{h}-1}\left[\left(y_{h1}-\bar{y}_{h}\right)^{2}+\left(y_{h2}-\bar{y}_{h}\right)^{2}\right]=\frac{1}{2} d_{h}^{2}, \text { writing } d_{h}=y_{h1}-y_{h2}, \text { so that } \\ \nu\left(\bar{y}_{st}\right) &=\frac{1}{4} \sum_{h=1}^{H} W_{h}^{2} d_{h}^{2} . \end{aligned}$$
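The two-per-stratum formulas above can be checked numerically with a short Python sketch. The function name `stratified_mean_and_variance`, the sampled pairs, and the stratum sizes are hypothetical values chosen only for illustration. As in the text, the sampling fractions $2 / N_{h}$ are neglected.

```python
import numpy as np

def stratified_mean_and_variance(pairs, stratum_sizes):
    """Stratified mean estimate and its variance estimate when n_h = 2 in every stratum.

    pairs         : array of shape (H, 2), the two sampled y-values per stratum
    stratum_sizes : length-H sequence of stratum sizes N_h
    """
    pairs = np.asarray(pairs, dtype=float)
    n_h_pop = np.asarray(stratum_sizes, dtype=float)
    w = n_h_pop / n_h_pop.sum()            # stratum proportions W_h = N_h / N
    ybar_h = pairs.mean(axis=1)            # stratum sample means
    d = pairs[:, 0] - pairs[:, 1]          # within-stratum differences d_h
    ybar_st = np.sum(w * ybar_h)           # stratified estimator of the mean
    v = 0.25 * np.sum(w**2 * d**2)         # (1/4) * sum of W_h^2 d_h^2
    return ybar_st, v

# Hypothetical data: H = 3 strata, two sampled units in each.
pairs = [[10.0, 12.0], [20.0, 17.0], [31.0, 35.0]]
sizes = [50, 30, 20]
ybar_st, v = stratified_mean_and_variance(pairs, sizes)
```

Computing `np.var(pairs, axis=1, ddof=1)` for each stratum and inserting it into $\sum W_{h}^{2} s_{h}^{2} / n_{h}$ gives the same value as `v`, which is the identity $s_{h}^{2}=\frac{1}{2} d_{h}^{2}$ at work.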


