## Logarithms

For the simple regression mean function
$$\mathrm{E}(Y \mid X=x)=\beta_0+\beta_1 x$$
a useful way to interpret the coefficient $\beta_1$ is as the first derivative of the mean function with respect to $x$,
$$\frac{d \mathrm{E}(Y \mid X=x)}{d x}=\beta_1$$
We recall from elementary geometry that the first derivative is the rate of change, or the slope of the tangent to a curve, at a point. Since the mean function for simple regression is a straight line, the slope of the tangent is the same value $\beta_1$ for any value of $x$, and $\beta_1$ completely characterizes the change in the mean when the predictor is changed for any value of $x$.

When the predictor is replaced by $\log (x)$, the mean function as a function of $x$
$$\mathrm{E}(Y \mid X=x)=\beta_0+\beta_1 \log (x)$$
is no longer a straight line but a curve. The slope of the tangent at a point $x>0$ is
$$\frac{d \mathrm{E}(Y \mid X=x)}{d x}=\frac{\beta_1}{x}$$
The slope of the tangent is different for each $x$ and the effect of changing $x$ on $\mathrm{E}(Y \mid X=x)$ is largest for small values of $x$ and gets smaller as $x$ is increased.
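The behavior of the slope $\beta_1 / x$ can be checked numerically. The following is a minimal sketch, not part of the original text: the coefficient values and the function names (`mean_log_x`, `slope_log_x`, `numeric_slope`) are hypothetical and chosen only for illustration. It compares the analytic slope with a finite-difference approximation at several values of $x$.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
BETA0, BETA1 = 1.0, 2.0


def mean_log_x(x, beta0=BETA0, beta1=BETA1):
    """Mean function E(Y | X = x) = beta0 + beta1 * log(x), defined for x > 0."""
    return beta0 + beta1 * math.log(x)


def slope_log_x(x, beta1=BETA1):
    """Analytic slope of the tangent at x: beta1 / x."""
    return beta1 / x


def numeric_slope(f, x, h=1e-6):
    """Central-difference approximation to the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


# The slope is steep for small x and flattens as x grows.
for x in (0.5, 1.0, 5.0, 50.0):
    analytic = slope_log_x(x)
    numeric = numeric_slope(mean_log_x, x)
    print(f"x={x:5.1f}  analytic={analytic:.4f}  numeric={numeric:.4f}")
```

With a positive $\beta_1$, the printed slopes decrease as $x$ increases, which matches the statement that the effect of changing $x$ is largest for small values of $x$.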
When the response is in log scale, we can get similar results by exponentiating both sides of the equation. The results are only approximate, because the mean of $\log (Y)$ is not in general the log of the mean of $Y$:
$$\begin{gathered} \mathrm{E}(\log (Y) \mid X=x)=\beta_0+\beta_1 x \\ \mathrm{E}(Y \mid X=x) \approx e^{\beta_0} e^{\beta_1 x} \end{gathered}$$
Differentiating this second equation gives
$$\frac{d \mathrm{E}(Y \mid X=x)}{d x}=\beta_1 \mathrm{E}(Y \mid X=x)$$
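The same kind of numerical check applies to the log-response case. This is a sketch under stated assumptions: the coefficients and function names (`approx_mean_y`, `slope_log_y`, `numeric_slope`) are hypothetical, and the approximation $\mathrm{E}(Y \mid X=x) \approx e^{\beta_0} e^{\beta_1 x}$ is treated as if it were the exact mean function.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
BETA0, BETA1 = 0.5, 0.1


def approx_mean_y(x, beta0=BETA0, beta1=BETA1):
    """Approximate mean E(Y | X = x) = exp(beta0) * exp(beta1 * x)."""
    return math.exp(beta0 + beta1 * x)


def slope_log_y(x, beta0=BETA0, beta1=BETA1):
    """Analytic slope at x: beta1 times the approximate mean at x."""
    return beta1 * approx_mean_y(x, beta0, beta1)


def numeric_slope(f, x, h=1e-6):
    """Central-difference approximation to the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


# The slope depends on both beta1 and the mean, so it changes with x,
# while the ratio slope / mean stays equal to beta1.
for x in (0.0, 5.0, 10.0):
    mean = approx_mean_y(x)
    analytic = slope_log_y(x)
    numeric = numeric_slope(approx_mean_y, x)
    print(f"x={x:4.1f}  mean={mean:.4f}  analytic={analytic:.4f}  "
          f"numeric={numeric:.4f}  slope/mean={analytic / mean:.4f}")
```

In this sketch the slope grows with the mean, but the relative rate of change (slope divided by mean) equals $\beta_1$ at every $x$, which follows directly from the derivative above.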

## Experimentation Versus Observation

There are fundamentally two types of predictors that are used in a regression analysis, experimental and observational. Experimental predictors have values that are under the control of the experimenter, while for observational predictors, the values are observed rather than set. Consider, for example, a hypothetical study of factors determining the yield of a certain crop. Experimental variables might include the amount and type of fertilizers used, the spacing of plants, and the amount of irrigation, since each of these can be assigned by the investigator to the units, which are plots of land. Observational predictors might include characteristics of the plots in the study, such as drainage, exposure, soil fertility, and weather variables. All of these are beyond the control of the experimenter, yet may have important effects on the observed yields.

The primary difference between experimental and observational predictors is in the inferences we can make. From experimental data, we can often infer causation. If we assign the level of fertilizer to plots, usually on the basis of a randomization scheme, and observe differences due to levels of fertilizer, we can infer that the fertilizer is causing the differences. Observational predictors allow weaker inferences. We might say that weather variables are associated with yield, but the causal link is not available for variables that are not under the experimenter’s control. Some experimental designs, including those that use randomization, are constructed so that the effects of observational factors can be ignored or used in analysis of covariance (see, e.g., Cox, 1958; Oehlert, 2000).

Purely observational studies that are not under the control of the analyst can only be used to predict or model the events that were observed in the data, as in the fuel consumption example. To apply observational results to predict future values, additional assumptions about the behavior of future values compared to the behavior of the existing data must be made. From a purely observational study, we cannot infer a causal relationship without additional information external to the observational study.

