## Correlated topic model

One weakness of LDA is that it cannot capture correlation between topics. For example, if a document has the “business” topic, it is reasonable to expect the “finance” topic to co-occur. The source of the problem is the use of a Dirichlet prior for $\boldsymbol{z}_n$. The problem with the Dirichlet is that it is characterized by just a single parameter vector $\boldsymbol{\alpha}$, which also fixes its covariance ($\Sigma_{ij} \propto -\alpha_i \alpha_j$ for $i \neq j$, so any two topic proportions are negatively correlated), rather than leaving the covariance as a free parameter.
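To make the fixed-covariance point concrete, here is a minimal sketch that evaluates the standard closed-form Dirichlet covariance, $\operatorname{Cov}[z_i, z_j] = (\delta_{ij} m_i - m_i m_j)/(\alpha_0 + 1)$ with $m_i = \alpha_i/\alpha_0$ and $\alpha_0 = \sum_i \alpha_i$. The function name `dirichlet_cov` and the values in `alpha` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def dirichlet_cov(alpha):
    """Closed-form covariance matrix of a Dirichlet(alpha) random vector."""
    alpha = np.asarray(alpha, dtype=float)
    alpha0 = alpha.sum()
    mean = alpha / alpha0
    # Cov = (diag(m) - m m^T) / (alpha0 + 1); off-diagonals are -m_i m_j / (alpha0 + 1)
    return (np.diag(mean) - np.outer(mean, mean)) / (alpha0 + 1.0)

alpha = np.array([2.0, 1.0, 0.5])  # illustrative concentration parameters
cov = dirichlet_cov(alpha)

# Collect the off-diagonal entries (the between-topic covariances).
off_diag = cov[~np.eye(len(alpha), dtype=bool)]
print(cov)
print("all off-diagonal entries negative:", bool(np.all(off_diag < 0)))
```

Whatever positive values are chosen for `alpha`, the off-diagonal entries stay negative, so a Dirichlet prior cannot express the idea that two topics tend to appear together.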

One way around this is to replace the Dirichlet prior with the logistic normal distribution, which is defined as follows:
$$p(\boldsymbol{z})=\int \operatorname{Cat}(\boldsymbol{z} \mid \mathcal{S}(\boldsymbol{\epsilon})) \mathcal{N}(\boldsymbol{\epsilon} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) d \boldsymbol{\epsilon}$$
This is known as the correlated topic model (CTM) [BL07].
The difference from categorical PCA (CatPCA) discussed in ?? is that CTM uses a logistic normal to model the mean parameters, so $\boldsymbol{z}_n$ is sparse and non-negative, whereas CatPCA uses a normal to model the natural parameters, so $\boldsymbol{z}_n$ is dense and can be negative. More precisely, the CTM defines $x_{nl} \sim \operatorname{Cat}\left(\mathbf{W} \mathcal{S}\left(\boldsymbol{\epsilon}_n\right)\right)$, but CatPCA defines $\boldsymbol{x}_{nd} \sim \operatorname{Cat}\left(\mathcal{S}\left(\mathbf{W}_d \boldsymbol{z}_n\right)\right)$.
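As an illustration of the logistic normal prior, the sketch below draws $\boldsymbol{\epsilon} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ and pushes it through the softmax $\mathcal{S}$ to obtain topic proportions. The helper names, the three-topic setup, and the particular $\boldsymbol{\Sigma}$ (topics 0 and 1 given a latent correlation of 0.9) are assumptions chosen for the example.

```python
import numpy as np

def softmax(eps):
    """Numerically stable softmax over the last axis."""
    shifted = eps - eps.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def sample_logistic_normal(mu, sigma, n_samples, rng):
    """Draw eps ~ N(mu, sigma) and map each draw to the simplex via softmax."""
    eps = rng.multivariate_normal(mu, sigma, size=n_samples)
    return softmax(eps)

rng = np.random.default_rng(0)
mu = np.zeros(3)
# Illustrative covariance: topics 0 and 1 (say "business" and "finance")
# are strongly correlated in the latent Gaussian space.
sigma = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

z = sample_logistic_normal(mu, sigma, n_samples=20000, rng=rng)
corr = np.corrcoef(z[:, 0], z[:, 1])[0, 1]
print("empirical correlation between topic 0 and topic 1 proportions:", corr)
```

Because $\boldsymbol{\Sigma}$ is a free parameter, the proportions of two topics can be positively correlated here, which a Dirichlet prior rules out.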

Fitting the CTM is tricky, since the prior for $\boldsymbol{\epsilon}_n$ is no longer conjugate to the multinomial likelihood for $c_{nl}$. However, we can derive a variational mean field approximation, as described in [BL07].

Having fit the model, one can then convert $\hat{\boldsymbol{\Sigma}}$ to a sparse precision matrix $\hat{\boldsymbol{\Sigma}}^{-1}$ by pruning low-strength edges, to get a sparse Gaussian graphical model. This allows you to visualize the correlation between topics. Figure 28.5 shows the result of applying this procedure to articles from Science magazine, from 1990 to 1999.
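The pruning step can be sketched as follows. The text does not say how edge strength is measured, so this example assumes one common choice: invert $\hat{\boldsymbol{\Sigma}}$, convert the precision matrix to partial correlations, and drop edges whose absolute partial correlation falls below a cutoff. The function name `prune_precision`, the `threshold` value, and the toy covariance are all hypothetical.

```python
import numpy as np

def prune_precision(sigma_hat, threshold=0.1):
    """Invert a covariance estimate and zero out weak off-diagonal entries.

    Edge strength is measured by the absolute partial correlation
    rho_ij = -P_ij / sqrt(P_ii * P_jj), where P is the precision matrix.
    This criterion and the threshold are assumptions made for illustration.
    """
    precision = np.linalg.inv(sigma_hat)
    d = np.sqrt(np.diag(precision))
    partial_corr = -precision / np.outer(d, d)
    np.fill_diagonal(partial_corr, 1.0)  # always keep the diagonal
    keep = np.abs(partial_corr) >= threshold
    sparse_precision = np.where(keep, precision, 0.0)
    k = len(d)
    edges = [(i, j) for i in range(k) for j in range(i + 1, k) if keep[i, j]]
    return sparse_precision, edges

# Toy covariance built from a chain-structured precision matrix (0 - 1 - 2),
# so topics 0 and 2 are conditionally independent given topic 1.
true_precision = np.array([[2.0, -1.0, 0.0],
                           [-1.0, 2.0, -1.0],
                           [0.0, -1.0, 2.0]])
sigma_hat = np.linalg.inv(true_precision)

sparse_precision, edges = prune_precision(sigma_hat, threshold=0.1)
print("edges of the topic graph:", edges)
```

Note that hard thresholding does not guarantee the pruned matrix stays positive definite; if a valid Gaussian graphical model is needed rather than just a picture of the graph, a penalized estimator would be a safer choice.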

## Dynamic topic model

In LDA, the topics (distributions over words) are assumed to be static. In some cases, it makes sense to allow these distributions to evolve smoothly over time. For example, an article might use the topic “neuroscience”, but if it was written in the 1900s, it is more likely to use words like “nerve”, whereas if it was written in the 2000s, it is more likely to use words like “calcium receptor” (this reflects the general trend of neuroscience towards molecular biology).

One way to model this is to assume the topic distributions evolve according to a Gaussian random walk, as in a state space model (see ??). We can map these Gaussian vectors to probabilities via the softmax function, resulting in the following model:
$$\begin{aligned}
\boldsymbol{w}_k^t \mid \boldsymbol{w}_k^{t-1} & \sim \mathcal{N}\left(\boldsymbol{w}_k^{t-1}, \sigma^2 \mathbf{I}_{N_w}\right) \\
\boldsymbol{z}_n^t & \sim \operatorname{Dir}\left(\alpha \mathbf{1}_{N_z}\right) \\
c_{nl}^t \mid \boldsymbol{z}_n^t & \sim \operatorname{Cat}\left(\boldsymbol{z}_n^t\right) \\
x_{nl}^t \mid c_{nl}^t=k, \mathbf{W}^t & \sim \operatorname{Cat}\left(\mathcal{S}\left(\boldsymbol{w}_k^t\right)\right)
\end{aligned}$$
This is known as a dynamic topic model [BL06a]. See Figure 28.6 for the PGM-D.
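The generative process of the dynamic topic model can be sketched by ancestral sampling. This is only a forward simulation under stated assumptions, not an inference procedure: the function name `sample_dtm`, the default values of `sigma` and `alpha`, and the standard normal initial state for the topic vectors are choices made for the example, since the model above does not specify an initial distribution.

```python
import numpy as np

def softmax(w):
    """Numerically stable softmax over the last axis."""
    shifted = w - w.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def sample_dtm(n_steps, n_docs, doc_len, n_topics, n_words,
               sigma=0.1, alpha=0.5, seed=0):
    """Sample topics and documents from a dynamic topic model.

    Returns the topic-word distributions with shape (n_steps, n_topics, n_words)
    and the word ids with shape (n_steps, n_docs, doc_len).
    """
    rng = np.random.default_rng(seed)
    # Assumed initial state: w_k^0 ~ N(0, I).
    w = rng.normal(size=(n_topics, n_words))
    topics_over_time, corpus = [], []
    for t in range(n_steps):
        if t > 0:
            # Gaussian random walk: w_k^t ~ N(w_k^{t-1}, sigma^2 I)
            w = w + sigma * rng.normal(size=w.shape)
        topics = softmax(w)  # each row is S(w_k^t), a distribution over words
        docs = np.empty((n_docs, doc_len), dtype=int)
        for n in range(n_docs):
            z = rng.dirichlet(alpha * np.ones(n_topics))   # z_n^t ~ Dir(alpha 1)
            c = rng.choice(n_topics, size=doc_len, p=z)     # c_nl^t ~ Cat(z_n^t)
            # x_nl^t ~ Cat(S(w_k^t)) for the topic k chosen at each position
            docs[n] = [rng.choice(n_words, p=topics[k]) for k in c]
        topics_over_time.append(topics)
        corpus.append(docs)
    return np.stack(topics_over_time), np.stack(corpus)

topics, corpus = sample_dtm(n_steps=5, n_docs=4, doc_len=10,
                            n_topics=3, n_words=20)
print(topics.shape, corpus.shape)
```

A small `sigma` keeps each topic's word distribution close to its value at the previous time step, which is what produces the smooth drift described above.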


