# ECON771 Introduction to Econometrics: The choice of clusters

## The choice of clusters

The choice of the appropriate level of clustering is often ambiguous. In general, clusters should be defined sufficiently broadly that correlations between the error terms of observations in different clusters are zero or negligibly small. This condition becomes more plausible when there are more observations within each cluster. However, if we choose too few clusters, our standard errors may become very inaccurate. On the other hand, if we choose too many clusters, and therefore allow for insufficient correlation among observations, standard errors will be biased. This is the usual bias-variance trade-off that characterises many approaches in econometrics. Standard errors can thus be very different depending on whether and how observations are clustered (MacKinnon, 2019). With this in mind, Thompson (2011) argues that double clustering across time and firms can do more harm than good if either $T$ or $N$ is small. In particular, he advises having at least 25 firms and 25 periods. Cameron and Miller (2015) essentially advise clustering within any group if there is reason to believe that there is some correlation within that group: “The consensus is to be conservative and avoid bias and to use bigger and more aggregate clusters when possible”. They also suggest comparing the cluster-robust standard errors with the default standard errors (or with clustered standard errors based on a lower level of aggregation), in the spirit of the White (1980) test. If there is a large difference, the cluster-robust standard errors (those based on the higher level of aggregation) should be chosen. However, Abadie et al. (2017) demonstrate that clustering can substantially affect standard errors even in cases where correlations are essentially zero. They argue that “a researcher should decide whether to cluster the standard errors based on substantive information, not solely based on whether it makes a difference”.
They advocate that the number of clusters in the sample should be small relative to the number of clusters in the population, a condition that is hard to satisfy in many finance applications (using, for example, clustering across industries or countries). Along these lines, Conley et al. (2018) recommend the use of a limited number of clusters, each consisting of many observations, so as to accommodate the rich types of dependence encountered in real-world finance data. Ideally, this is combined with modifications to improve the small-sample performance. Recently, a literature has developed that derives statistical tests to determine the optimal level of clustering. For example, Ibragimov and Müller (2016) develop a test for one-way clustering against no clustering (or a low level of clustering). More recent results are developed in MacKinnon et al. (2020).
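
The comparison of default and cluster-robust standard errors mentioned above can be illustrated with a small simulation. The following is a minimal sketch rather than a definitive implementation: the function name `ols_standard_errors`, the simulated panel (50 firms, 20 periods, a firm effect in both the regressor and the error) and the small-sample correction factor are assumptions introduced here for illustration, not taken from the text.

```python
import numpy as np


def ols_standard_errors(X, y, cluster_ids):
    """OLS coefficients with default and one-way cluster-robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Default covariance matrix (i.i.d. errors): s^2 (X'X)^{-1}
    s2 = resid @ resid / (n - k)
    se_default = np.sqrt(np.diag(s2 * XtX_inv))

    # Cluster-robust sandwich: the "meat" sums X_g' e_g e_g' X_g over clusters g
    clusters = np.unique(cluster_ids)
    G = len(clusters)
    meat = np.zeros((k, k))
    for g in clusters:
        mask = cluster_ids == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)

    # A commonly used finite-sample correction (an assumption of this sketch)
    correction = G / (G - 1) * (n - 1) / (n - k)
    se_cluster = np.sqrt(np.diag(correction * XtX_inv @ meat @ XtX_inv))
    return beta, se_default, se_cluster


# Hypothetical simulated panel: firm effects in both the regressor and the error
rng = np.random.default_rng(0)
n_firms, n_periods = 50, 20
firm = np.repeat(np.arange(n_firms), n_periods)
x = rng.normal(size=n_firms)[firm] + rng.normal(size=n_firms * n_periods)
eps = rng.normal(size=n_firms)[firm] + rng.normal(size=n_firms * n_periods)
y = 1.0 + 0.5 * x + eps
X = np.column_stack([np.ones_like(x), x])

beta, se_default, se_cluster = ols_standard_errors(X, y, firm)
print("coefficients:      ", beta)
print("default SE:        ", se_default)
print("cluster-robust SE: ", se_cluster)
```

In a design like this one, where both the regressor and the error are correlated within firms, the firm-clustered standard errors would be expected to exceed the default ones noticeably. Passing a coarser or finer grouping as `cluster_ids` gives a quick way to see how sensitive the standard errors are to the chosen level of clustering.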

## Correlation structures

To better appreciate the alternative ways of clustering, let us consider some specific examples of cross-correlations among the error terms. First, consider the case where the correlation within a cluster, say a firm, is attributable to a time-invariant firm-specific effect, that is,
$$\varepsilon_{i t}=\alpha_i+u_{i t},$$
where $u_{i t}$ is not correlated over time. Both $\alpha_i$ and $u_{i t}$ are allowed to be heteroskedastic. In this case, clustering standard errors across firms adjusts for the correlation over time due to $\alpha_i$. Standard errors will typically increase, because an additional observation on firm $i$ does not provide completely new independent information. However, clustering across firms also allows for more general forms of correlation. For example, we could have
$$u_{i t}=\rho u_{i, t-1}+v_{i t},$$
with $\rho \neq 0$ and $v_{i t}$ uncorrelated over time. In this case, the errors are correlated over time not only because of the time-invariant component $\alpha_i$ but also, with a strength that decays as observations lie further apart in time, because of the autoregressive structure in (2.57).
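
The implied within-firm correlation pattern can be made explicit with a short derivation. It is only a sketch under simplifying assumptions that the text does not impose: $\alpha_i$ and $u_{i t}$ are mutually uncorrelated and homoskedastic with variances $\sigma_\alpha^2$ and $\sigma_u^2$ (symbols introduced here for illustration), and the autoregressive equation is read as a stationary first-order process in which $u_{i t}$ depends on its own lag $u_{i, t-1}$, with $|\rho|<1$ and $v_{i t}$ having variance $\sigma_v^2$. With only the firm effect, for $s \neq t$,

$$\operatorname{corr}\left(\varepsilon_{i t}, \varepsilon_{i s}\right)=\frac{\sigma_\alpha^2}{\sigma_\alpha^2+\sigma_u^2},$$

which is the same for every pair of periods. Adding the autoregressive component gives

$$\operatorname{corr}\left(\varepsilon_{i t}, \varepsilon_{i s}\right)=\frac{\sigma_\alpha^2+\rho^{|t-s|} \sigma_u^2}{\sigma_\alpha^2+\sigma_u^2}, \qquad \sigma_u^2=\frac{\sigma_v^2}{1-\rho^2},$$

so the correlation converges to the firm-effect level as $|t-s|$ grows. Neither pattern has to be specified in advance, because clustering across firms leaves the within-firm correlations unrestricted.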

