
# Bayesian Analysis: The Dirichlet Distribution and Sparsity


## The Dirichlet Distribution and Sparsity

A symmetric Dirichlet distribution (Section 2.2.1) has a single hyperparameter $\alpha>0$. It is the special case of the Dirichlet distribution in which every entry of the hyperparameter vector equals $\alpha$. When the hyperparameter is chosen such that $\alpha<1$, any point $\theta \in \mathbb{R}^K$ drawn from the respective Dirichlet will have most of its coordinates close to 0, and only a few will have a value significantly larger than zero.
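The sparsity claim above can be checked empirically. The following is a minimal sketch, not a reference implementation: it draws from a symmetric Dirichlet by normalizing independent Gamma draws (the construction covered in the Gamma representation section of this post) and counts how many coordinates fall below a small threshold. The function names, the dimension `k=10`, the threshold `0.01`, and the two values of `alpha` are all assumptions chosen for illustration.

```python
import random


def sample_symmetric_dirichlet(alpha, k, rng):
    """One draw from a symmetric Dirichlet(alpha) in k dimensions.

    Uses normalized Gamma(alpha, 1) draws; names here are illustrative.
    """
    mus = [rng.gammavariate(alpha, 1.0) for _ in range(k)]  # shape alpha, scale 1
    total = sum(mus)
    return [m / total for m in mus]


def mean_near_zero(alpha, k=10, n=2000, threshold=0.01, seed=0):
    """Average number of coordinates below `threshold` over n draws."""
    rng = random.Random(seed)  # fixed seed, chosen arbitrarily
    count = 0
    for _ in range(n):
        theta = sample_symmetric_dirichlet(alpha, k, rng)
        count += sum(1 for t in theta if t < threshold)
    return count / n


sparse_avg = mean_near_zero(alpha=0.1)   # alpha < 1: expect many near-zero coordinates
dense_avg = mean_near_zero(alpha=10.0)   # alpha > 1: expect almost none
print(f"alpha=0.1: {sparse_avg:.2f} of 10 coordinates below 0.01 on average")
print(f"alpha=10:  {dense_avg:.2f} of 10 coordinates below 0.01 on average")
```

With these settings, the small-$\alpha$ draws should show most coordinates below the threshold, while the large-$\alpha$ draws should show almost none.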

The intuition behind this property of the symmetric Dirichlet distribution can be understood by inspecting the main term in the density of the Dirichlet distribution: $\prod_{i=1}^K \theta_i^{\alpha-1}$. When $\alpha<1$, this product becomes $\frac{1}{\prod_{i=1}^K \theta_i^\beta}$ for $\beta=1-\alpha>0$. This product becomes very large if one of the $\theta_i$ is close to 0. If many of the $\theta_i$ are close to 0, the effect is multiplied, which makes the product even larger. Consequently, most of the density of the symmetric Dirichlet with $\alpha<1$ is concentrated around points in the probability simplex where the majority of the $\theta_i$ are close to 0.
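To make the role of the main term concrete, the sketch below evaluates the log density of a symmetric Dirichlet at two points on the simplex: a balanced point and a point with one large coordinate and the rest near zero. The function name, the dimension `K = 5`, the value `eps = 1e-3`, and the chosen values of `alpha` are assumptions made for illustration.

```python
import math


def symmetric_dirichlet_logpdf(theta, alpha):
    """Log density of a symmetric Dirichlet(alpha) at a point theta on the simplex."""
    k = len(theta)
    # Log normalizer: log Gamma(K * alpha) - K * log Gamma(alpha).
    log_norm = math.lgamma(k * alpha) - k * math.lgamma(alpha)
    # Log of the main term: (alpha - 1) * sum_i log(theta_i).
    log_main = (alpha - 1.0) * sum(math.log(t) for t in theta)
    return log_norm + log_main


K = 5
eps = 1e-3
balanced = [1.0 / K] * K                               # all coordinates equal
near_sparse = [1.0 - (K - 1) * eps] + [eps] * (K - 1)  # one large coordinate, rest near 0

for alpha in (0.1, 1.0, 5.0):
    lp_balanced = symmetric_dirichlet_logpdf(balanced, alpha)
    lp_sparse = symmetric_dirichlet_logpdf(near_sparse, alpha)
    print(f"alpha={alpha}: balanced={lp_balanced:.2f}, near-sparse={lp_sparse:.2f}")
```

For $\alpha<1$ the near-sparse point receives the higher log density, for $\alpha=1$ the two points tie, and for $\alpha>1$ the balanced point is favored.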

This property of the symmetric Dirichlet has been exploited consistently in the Bayesian NLP literature. For example, Goldwater and Griffiths (2007) defined a Bayesian part-of-speech tagging model based on hidden Markov models (Chapter 8), in which they placed a Dirichlet prior over the set of multinomials for the transition probabilities and emission probabilities of a trigram hidden Markov model.

For the first set of experiments, Goldwater and Griffiths used one fixed sparse hyperparameter for all transition probabilities and a different fixed hyperparameter for all emission probabilities. Their findings show that a small value for the transition hyperparameter (0.03), together with a hyperparameter of 1 for the emission probabilities, achieves the best prediction accuracy for the part-of-speech tags. This means that the optimal transition multinomials are likely to be very sparse. This is not surprising, since only a small number of part-of-speech tags can appear in a given context. However, an emission hyperparameter of 1 means that the Dirichlet distribution is simply a uniform distribution. The authors argued that a sparse prior was not very useful for the emission probabilities because all emission probabilities shared the same hyperparameter.
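The statement that a hyperparameter of 1 gives a uniform distribution follows directly from the Dirichlet density. As a short worked step, using the standard normalizer $\Gamma(K \alpha) / \Gamma(\alpha)^K$ of the symmetric Dirichlet, setting $\alpha=1$ gives
$$p(\theta \mid \alpha=1)=\frac{\Gamma(K)}{\Gamma(1)^K} \prod_{i=1}^K \theta_i^{0}=(K-1)!,$$
which does not depend on $\theta$. For instance, with $K=3$ the density equals 2 at every point of the probability simplex, so no region, sparse or otherwise, is preferred.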

## Gamma Representation of the Dirichlet

The Dirichlet distribution can be represented in terms of the Gamma distribution. This representation does not contribute directly to better modeling, but it helps to demonstrate the limitations of the Dirichlet distribution and suggests alternatives to it (such as the one described in the next section).

Let $\mu_i \sim \Gamma\left(\alpha_i, 1\right)$, for $i \in\{1, \ldots, K\}$, be $K$ independent random variables, each distributed according to the Gamma distribution with shape $\alpha_i>0$ and scale 1 (see also Appendix B). Then, the definition of
$$\theta_i=\frac{\mu_i}{\sum_{j=1}^K \mu_j} \tag{3.12}$$
for $i \in\{1, \ldots, K\}$ yields a random vector $\theta$ from the probability simplex of dimension $K-1$, such that $\theta$ is distributed according to the Dirichlet distribution with hyperparameters $\alpha=\left(\alpha_1, \ldots, \alpha_K\right)$.
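The construction above translates almost line by line into code. The sketch below, using only the Python standard library, draws Gamma variables with scale 1, normalizes them, and compares the empirical mean of the resulting vectors with the known Dirichlet mean $\alpha_i / \sum_j \alpha_j$. The function name `sample_dirichlet`, the hyperparameters, the seed, and the sample size are assumptions chosen for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet(alphas) vector by normalizing independent Gamma(alpha_i, 1) draws."""
    mus = [rng.gammavariate(a, 1.0) for a in alphas]  # shape a, scale 1
    total = sum(mus)
    return [m / total for m in mus]


rng = random.Random(0)      # fixed seed, chosen arbitrarily for reproducibility
alphas = [2.0, 3.0, 5.0]    # hypothetical hyperparameters
n = 20000
samples = [sample_dirichlet(alphas, rng) for _ in range(n)]

# Every draw should lie on the probability simplex (coordinates sum to 1).
max_sum_error = max(abs(sum(theta) - 1.0) for theta in samples)

# The Dirichlet mean is alpha_i / sum(alphas); compare with the empirical mean.
empirical_mean = [sum(theta[i] for theta in samples) / n for i in range(len(alphas))]
expected_mean = [a / sum(alphas) for a in alphas]

print("largest deviation of a draw's sum from 1:", max_sum_error)
print("empirical mean:", [round(m, 3) for m in empirical_mean])
print("expected mean: ", expected_mean)
```

Matching means are only a sanity check, not a proof that the draws follow the Dirichlet distribution.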

The representation of the Dirichlet as independent, normalized Gamma variables explains a limitation inherent to the Dirichlet distribution: it offers no explicit parametrization of a rich structure of relationships between the coordinates of $\theta$. For example, given $i \neq j$, the ratio $\theta_i / \theta_j$, when treated as a random variable, is independent of any other ratio $\theta_k / \theta_{\ell}$ calculated from two other coordinates, $k \neq \ell$. (This is evident from Equation 3.12: the ratio $\theta_i / \theta_j$ is $\mu_i / \mu_j$, where all $\mu_i$ for $i \in\{1, \ldots, K\}$ are independent.) Therefore, the Dirichlet distribution is not a good modeling choice when the $\theta$ parameters are better modeled with even a weak degree of dependence.
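The independence of ratios over disjoint coordinate pairs can be probed numerically. The sketch below estimates the correlation between $\log(\theta_1/\theta_2)$ and $\log(\theta_3/\theta_4)$, and, for contrast, between $\log(\theta_1/\theta_2)$ and $\log(\theta_1/\theta_3)$, which share a coordinate. Log ratios are used here only because raw ratios are heavy-tailed, and a near-zero correlation is consistent with independence but does not prove it. The hyperparameters, seed, sample size, and the helper `pearson` are assumptions chosen for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)          # fixed seed, chosen arbitrarily
alphas = [3.0, 3.0, 3.0, 3.0]   # hypothetical hyperparameters
n = 20000
r12, r34, r13 = [], [], []
for _ in range(n):
    mus = [rng.gammavariate(a, 1.0) for a in alphas]  # independent Gamma draws
    total = sum(mus)
    theta = [m / total for m in mus]                  # one Dirichlet draw
    r12.append(math.log(theta[0] / theta[1]))
    r34.append(math.log(theta[2] / theta[3]))         # disjoint from (1, 2)
    r13.append(math.log(theta[0] / theta[2]))         # shares coordinate 1 with (1, 2)

disjoint_corr = pearson(r12, r34)
shared_corr = pearson(r12, r13)
print(f"correlation, disjoint pairs: {disjoint_corr:.3f}")
print(f"correlation, shared coordinate: {shared_corr:.3f}")
```

The disjoint pairs should show a correlation close to zero, while the pairs that share a coordinate are clearly correlated, so the independence property applies only to ratios built from distinct coordinates.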

