# Constraint-based approach

## IC algorithm

The original algorithm, due to Verma and Pearl [VP90], was called the IC algorithm, which stands for “inductive causation”. The method is as follows [Pea09, p50]:

1. For each pair of variables $a$ and $b$, search for a set $S_{a b}$ such that $a \perp b \mid S_{a b}$. Construct an undirected graph such that $a$ and $b$ are connected iff no such set $S_{a b}$ can be found (i.e., they cannot be made conditionally independent).
2. Orient the edges involved in v-structures as follows: for each pair of nonadjacent nodes $a$ and $b$ with a common neighbor $c$, check whether $c \in S_{a b}$. If it is, the corresponding DAG must be $a \rightarrow c \rightarrow b$, $a \leftarrow c \rightarrow b$, or $a \leftarrow c \leftarrow b$, so we cannot determine the direction. If it is not, the DAG must be $a \rightarrow c \leftarrow b$, so add these arrows to the graph.
3. In the partially directed graph that results, orient as many of the undirected edges as possible, subject to two conditions: (1) the orientation should not create a new v-structure (since that would have been detected already if it existed), and (2) the orientation should not create a directed cycle. More precisely, follow the rules shown in Figure 31.8. In the first case, if $X \rightarrow Y$ has a known orientation, $Y-Z$ is unoriented, and $X$ and $Z$ are nonadjacent, then we must have $Y \rightarrow Z$; otherwise we would create a new v-structure $X \rightarrow Y \leftarrow Z$, which is not allowed. The other two cases follow similar reasoning.
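The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the conditional independence oracle is a hand-written lookup table for a hypothetical DAG $a \rightarrow c \leftarrow b$, $c \rightarrow d$, all function names are invented for this example, and only the first orientation rule of step 3 is implemented.

```python
from itertools import combinations

NODES = ["a", "b", "c", "d"]

# Hypothetical oracle: the CI statements implied by the DAG a -> c <- b, c -> d.
# Each entry is (pair of variables, conditioning set).
INDEPENDENCIES = {
    (frozenset("ab"), frozenset()),
    (frozenset("ad"), frozenset("c")),
    (frozenset("ad"), frozenset("bc")),
    (frozenset("bd"), frozenset("c")),
    (frozenset("bd"), frozenset("ac")),
}


def ci_oracle(x, y, cond):
    """Return True iff x is independent of y given cond (table lookup)."""
    return (frozenset((x, y)), frozenset(cond)) in INDEPENDENCIES


def all_subsets(items):
    """Yield every subset of items as a tuple."""
    for k in range(len(items) + 1):
        yield from combinations(items, k)


def ic_skeleton(nodes, indep):
    """Step 1: connect x and y iff no separating set S_xy exists."""
    edges, sepset = set(), {}
    for x, y in combinations(nodes, 2):
        others = [n for n in nodes if n not in (x, y)]
        found = next(
            (frozenset(s) for s in all_subsets(others) if indep(x, y, s)), None
        )
        if found is None:
            edges.add(frozenset((x, y)))
        else:
            sepset[frozenset((x, y))] = found
    return edges, sepset


def orient_v_structures(nodes, edges, sepset):
    """Step 2: orient a -> c <- b when a, b are nonadjacent and c not in S_ab."""
    arrows = set()
    for a, b in combinations(nodes, 2):
        if frozenset((a, b)) in edges:
            continue
        for c in nodes:
            is_common_neighbor = (
                frozenset((a, c)) in edges and frozenset((b, c)) in edges
            )
            if is_common_neighbor and c not in sepset[frozenset((a, b))]:
                arrows.update({(a, c), (b, c)})
    return arrows


def apply_rule_1(edges, arrows):
    """Step 3, first rule only: x -> y, y - z, x and z nonadjacent => y -> z."""
    arrows = set(arrows)
    changed = True
    while changed:
        changed = False
        for x, y in list(arrows):
            for edge in edges:
                if y not in edge:
                    continue
                (z,) = edge - {y}
                undirected = (y, z) not in arrows and (z, y) not in arrows
                if z != x and undirected and frozenset((x, z)) not in edges:
                    arrows.add((y, z))
                    changed = True
    return arrows


edges, sepset = ic_skeleton(NODES, ci_oracle)
arrows = orient_v_structures(NODES, edges, sepset)
oriented = apply_rule_1(edges, arrows)
print(sorted(oriented))  # [('a', 'c'), ('b', 'c'), ('c', 'd')]
```

In this toy case the v-structure at $c$ is found because $c \notin S_{a b} = \emptyset$, and the edge $c - d$ is then oriented as $c \rightarrow d$, since the opposite direction would create a new v-structure at $c$.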

## PC algorithm

A significant speedup of IC, known as the **PC** algorithm after its creators Peter Spirtes and Clark Glymour [SG91], can be obtained by ordering the search for separating sets in step 1 by increasing cardinality. We start with a fully connected graph, and then look for sets $S_{a b}$ of size 0, then of size 1, and so on; as soon as we find a separating set, we remove the corresponding edge. See Figure 31.9 for an example.

A variant of the PC algorithm learns the initial undirected structure (i.e., the Markov blanket of each node) using generic variable selection techniques instead of conditional independence (CI) tests. This tends to be more robust, since it avoids the issues of statistical significance that can arise with independence tests. See [PE08] for details.

The running time of the PC algorithm is $O\left(D^{K+1}\right)$ [SGS00, p85], where $D$ is the number of nodes and $K$ is the maximal degree (number of neighbors) of any node in the corresponding undirected graph.
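The search by increasing cardinality can be sketched as follows. This is an illustrative sketch under stated assumptions: the oracle is a hand-written lookup table for a hypothetical DAG $a \rightarrow c \leftarrow b$, $c \rightarrow d$, and the function names are invented here. The sketch also draws candidate separating sets only from the current neighbors of one endpoint, a standard detail of PC that the text above does not spell out and that is consistent with the stated dependence of the running time on the maximal degree $K$.

```python
from itertools import combinations

NODES = ["a", "b", "c", "d"]

# Hypothetical oracle: the CI statements implied by the DAG a -> c <- b, c -> d.
INDEPENDENCIES = {
    (frozenset("ab"), frozenset()),
    (frozenset("ad"), frozenset("c")),
    (frozenset("ad"), frozenset("bc")),
    (frozenset("bd"), frozenset("c")),
    (frozenset("bd"), frozenset("ac")),
}


def ci_oracle(x, y, cond):
    """Return True iff x is independent of y given cond (table lookup)."""
    return (frozenset((x, y)), frozenset(cond)) in INDEPENDENCIES


def pc_skeleton(nodes, indep):
    """Start fully connected; test separating sets of size 0, 1, 2, ..."""
    adj = {n: set(nodes) - {n} for n in nodes}
    sepset = {}
    k = 0
    # Stop once no node has enough neighbors to form a conditioning set of size k.
    while any(len(adj[x]) - 1 >= k for x in nodes):
        for x in nodes:
            for y in sorted(adj[x]):
                # Assumption: candidates come from the current neighbors of x.
                for cond in combinations(sorted(adj[x] - {y}), k):
                    if indep(x, y, cond):
                        # Remove the edge as soon as a separating set is found.
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = frozenset(cond)
                        break
        k += 1
    return adj, sepset


adj, sepset = pc_skeleton(NODES, ci_oracle)
for node in NODES:
    print(node, sorted(adj[node]))
```

On this example the edge $a - b$ is removed at size 0, the edges $a - d$ and $b - d$ at size 1, and the remaining skeleton is $a - c$, $b - c$, $c - d$.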

The IC and PC algorithms rely on an oracle that can test for conditional independence between any sets of variables, $A \perp B \mid C$. This can be approximated using hypothesis testing methods applied to a finite data set, such as chi-squared tests for discrete data. However, such methods work poorly with small sample sizes, and can run into problems with multiple testing (since so many hypotheses are being compared). In addition, errors made at any given step can lead to an incorrect final result, as erroneous constraints get propagated. In practice it is common to use a hybrid approach, where we use IC/PC to create an initial structure, and then use this to speed up Bayesian model selection, which tends to be more robust, since it avoids any hard decisions about conditional independence or lack thereof.
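As a rough illustration of replacing the oracle with a hypothesis test, the sketch below computes a Pearson chi-squared statistic for $X \perp Y \mid Z$ on discrete data by summing the statistic over the strata of $Z$. Everything here is an assumption made for the example: the function names, the synthetic counts, and the use of a small table of 5% critical values in place of a proper p-value computation.

```python
from collections import Counter
from itertools import product

# Standard 5% critical values of the chi-squared distribution (1 to 4 dof).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi2_ci_statistic(rows, x, y, cond=()):
    """Return (statistic, dof) for testing x independent of y given cond."""
    strata = {}
    for r in rows:
        key = tuple(r[c] for c in cond)
        strata.setdefault(key, Counter())[(r[x], r[y])] += 1
    stat, dof = 0.0, 0
    for table in strata.values():
        n = sum(table.values())
        xs = sorted({a for a, _ in table})
        ys = sorted({b for _, b in table})
        row_tot = {a: sum(table[(a, b)] for b in ys) for a in xs}
        col_tot = {b: sum(table[(a, b)] for a in xs) for b in ys}
        for a, b in product(xs, ys):
            expected = row_tot[a] * col_tot[b] / n
            stat += (table[(a, b)] - expected) ** 2 / expected
        dof += (len(xs) - 1) * (len(ys) - 1)
    return stat, dof


def is_independent(rows, x, y, cond=()):
    """Hard decision at the 5% level; only supports 1 to 4 degrees of freedom."""
    stat, dof = chi2_ci_statistic(rows, x, y, cond)
    return stat <= CRITICAL_05[dof]


# Synthetic counts keyed by (x, y, z): x and y both depend on z,
# but are exactly independent within each stratum of z.
COUNTS = {
    (0, 0, 0): 16, (0, 1, 0): 4, (1, 0, 0): 4, (1, 1, 0): 1,
    (0, 0, 1): 1, (0, 1, 1): 4, (1, 0, 1): 4, (1, 1, 1): 16,
}
rows = [
    {"x": x, "y": y, "z": z}
    for (x, y, z), n in COUNTS.items()
    for _ in range(n)
]

marginal = chi2_ci_statistic(rows, "x", "y")
conditional = chi2_ci_statistic(rows, "x", "y", cond=("z",))
print(marginal, is_independent(rows, "x", "y"))
print(conditional, is_independent(rows, "x", "y", cond=("z",)))
```

With these counts the marginal statistic is about 6.48 on 1 degree of freedom, so independence is rejected at the 5% level, while the statistic conditional on $Z$ is 0 on 2 degrees of freedom. Each additional conditioning variable splits the data into more strata, which is one way to see why such tests degrade with small sample sizes.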
