## Hyperparameters

There are often implicit parameters in our model that we hold fixed, such as the covariance constants in linear regression, or the parameters that govern the prior distribution over the weights. These are usually called “hyperparameters.” For example, in the RBF model, the hyperparameters include $\alpha$, $\sigma^2$, and the parameters of the basis functions (e.g., their width). Thus far we have assumed that the hyperparameters were “known” (which means that someone must set them by hand) or were estimated by cross-validation (which has a number of pitfalls, including long computation times, especially when there are many hyperparameters). Instead of either approach, we may apply the Bayesian approach to estimate these values directly.

To find a MAP estimate for the $\alpha$ parameter in the above linear regression example we compute:
$$\alpha^*=\arg \max_{\alpha} \ln p\left(\alpha \mid x_{1: N}, y_{1: N}\right)$$
where
$$p\left(\alpha \mid x_{1: N}, y_{1: N}\right)=\frac{p\left(y_{1: N} \mid x_{1: N}, \alpha\right) p(\alpha)}{p\left(y_{1: N} \mid x_{1: N}\right)}$$
and
$$\begin{aligned} p\left(y_{1: N} \mid x_{1: N}, \alpha\right) & =\int p\left(y_{1: N}, \mathbf{w} \mid x_{1: N}, \alpha\right) d \mathbf{w} \\ & =\int p\left(y_{1: N} \mid x_{1: N}, \mathbf{w}, \alpha\right) p(\mathbf{w} \mid \alpha) d \mathbf{w} \\ & =\int\left(\prod_i p\left(y_i \mid x_i, \mathbf{w}, \alpha\right)\right) p(\mathbf{w} \mid \alpha) d \mathbf{w} \end{aligned}$$
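
As a rough illustration of the integral above, the sketch below evaluates $\ln p(y_{1:N} \mid x_{1:N}, \alpha)$ in closed form and maximizes it over a grid of $\alpha$ values. Several things here are assumptions that the text does not state: the prior is taken to be $\mathbf{w} \sim \mathcal{N}(0, \alpha^{-1} I)$, the noise is Gaussian with a known variance `sigma2`, and $p(\alpha)$ is flat over the grid, so the MAP estimate coincides with the maximizer of the marginal likelihood. The data and the function name `log_marginal_likelihood` are made up for the example.

```python
import numpy as np

def log_marginal_likelihood(X, y, alpha, sigma2):
    """ln p(y | X, alpha) with the weights w integrated out.

    Assumed model (not taken from the text):
      w ~ N(0, alpha^{-1} I),  y = X w + noise,  noise ~ N(0, sigma2 I)
    so that y | X, alpha ~ N(0, sigma2 I + alpha^{-1} X X^T).
    """
    n = X.shape[0]
    C = sigma2 * np.eye(n) + (X @ X.T) / alpha
    _, logdet = np.linalg.slogdet(C)      # log-determinant of the covariance
    quad = y @ np.linalg.solve(C, y)      # y^T C^{-1} y
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = rng.normal(scale=2.0, size=3)
sigma2 = 0.25
y = X @ w_true + rng.normal(scale=np.sqrt(sigma2), size=50)

# With a flat prior over the grid, the MAP alpha maximizes the marginal likelihood.
alphas = np.logspace(-3, 3, 61)
scores = np.array([log_marginal_likelihood(X, y, a, sigma2) for a in alphas])
alpha_star = alphas[np.argmax(scores)]
print("alpha* on the grid:", alpha_star)
```

A grid search is used only to keep the sketch short; a non-flat $p(\alpha)$ would simply add $\ln p(\alpha)$ to each score before taking the maximum.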

## Bayesian Model Selection

How do we choose which model to use? For example, we might like to automatically choose the form of the basis functions or the number of basis functions. Cross-validation is one approach, but it can be expensive and, more importantly, inaccurate when only a small amount of data is available. One general intuition is that we want to prefer simple models over complex models to avoid overfitting, insofar as they provide equivalent fits to the data. Below we consider a Bayesian approach to model selection that provides just such a bias toward simple models.

The goal of model selection is to choose the best model from some set of candidate models $\left\{\mathcal{M}_i\right\}_{i=1}^L$ based on some observed data $\mathcal{D}$. This may be done either with a maximum likelihood approach (picking the model that assigns the largest likelihood to the data) or a MAP approach (picking the model with the highest posterior probability). If we take a uniform prior over models (i.e., $p\left(\mathcal{M}_i\right)$ is a constant for all $i=1 \ldots L$) then these approaches can be seen to be equivalent since:
$$\begin{aligned} p\left(\mathcal{M}_i \mid \mathcal{D}\right) & =\frac{p\left(\mathcal{D} \mid \mathcal{M}_i\right) p\left(\mathcal{M}_i\right)}{p(\mathcal{D})} \\ & \propto p\left(\mathcal{D} \mid \mathcal{M}_i\right) \end{aligned}$$
In practice a uniform prior over models may not be appropriate, but the design of suitable priors in these cases will depend significantly on one’s knowledge of the application domain. So here we will assume a uniform prior over models and focus on $p\left(\mathcal{D} \mid \mathcal{M}_i\right)$.
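
To make $p\left(\mathcal{D} \mid \mathcal{M}_i\right)$ concrete, here is a minimal sketch in which each candidate model is a polynomial basis of a different degree. The setup is an assumption chosen for illustration, not something the text prescribes: weights have a Gaussian prior $\mathcal{N}(0, \alpha^{-1} I)$, the noise variance `sigma2` is known, and both are held fixed so that the evidence has a closed form. The helper names `poly_features` and `log_evidence` are hypothetical.

```python
import numpy as np

def poly_features(x, degree):
    """Polynomial basis [1, x, x^2, ..., x^degree] for a 1-D input array."""
    return np.vander(x, degree + 1, increasing=True)

def log_evidence(Phi, y, alpha, sigma2):
    """ln p(D | M) for a linear-in-weights model with w ~ N(0, alpha^{-1} I)
    and Gaussian noise of variance sigma2 (both assumed, both held fixed)."""
    n = Phi.shape[0]
    C = sigma2 * np.eye(n) + (Phi @ Phi.T) / alpha
    _, logdet = np.linalg.slogdet(C)
    quad = y @ np.linalg.solve(C, y)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + quad)

# Synthetic data from a quadratic, purely for illustration.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=30)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.1, size=30)

degrees = list(range(7))  # candidate models M_0 ... M_6
log_ev = np.array([log_evidence(poly_features(x, d), y, alpha=1.0, sigma2=0.01)
                   for d in degrees])

# Uniform prior over models: the posterior is the normalized evidence.
post = np.exp(log_ev - log_ev.max())   # subtract the max for numerical stability
post /= post.sum()
best = degrees[int(np.argmax(post))]

for d, le, p in zip(degrees, log_ev, post):
    print(f"degree {d}: log evidence {le:9.2f}, posterior {p:.3f}")
print("selected degree:", best)
```

Degrees that are too low cannot explain the data and receive very low evidence, while higher degrees spread their prior mass over many more possible datasets, which is the source of the bias toward simple models mentioned above.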

## Generative vs. Discriminative Models

The classifiers described here illustrate a distinction between two general types of models in machine learning:

1. Generative models, such as Gaussian class-conditional (GCC) models, describe the complete probability of the data $p(\mathbf{x}, y)$.
2. Discriminative models, such as logistic regression (LR), artificial neural networks (ANNs), and K-nearest neighbors (KNN), describe the conditional probability of the output given the input: $p(y \mid \mathbf{x})$.

The same distinction occurs in regression and classification, e.g., KNN is a discriminative method that can be used for either classification or regression.

The distinction is clearest when comparing LR with GCC with equal covariances, since they are both linear classifiers, but the training algorithms are different. This is because they have different goals: LR is optimized for classification performance, whereas the GCC is a “complete” model of the probability of the data that is then pressed into service for classification. As a consequence, GCC may perform poorly with non-Gaussian data. Conversely, LR is not premised on any particular form for the two class distributions. On the other hand, LR can only be used for classification, whereas the GCC can be used for other tasks, e.g., to sample new $\mathbf{x}$ data, to classify noisy inputs or inputs with outliers, and so on.

The distinctions between generative and discriminative models become more significant in more complex problems. Generative models allow us to put more prior knowledge into how we build the model, but classification may often involve difficult optimization of $p(y \mid \mathbf{x})$; discriminative methods are typically more efficient and generic, but are harder to specialize to particular problems.
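
The comparison between GCC with equal covariances and LR can be sketched as follows. This is an illustration under assumptions: the data are synthetic two-class Gaussians, labels are coded as 0/1, LR is trained with plain gradient descent (the text does not specify a training algorithm), and the function names `fit_gcc`, `gcc_linear_boundary`, and `fit_lr` are invented here. Both fits yield a linear rule of the form $\mathbf{w}^T \mathbf{x} + b > 0$, but only the generative fit can also be sampled.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
mu0, mu1 = np.array([-1.5, 0.0]), np.array([1.5, 0.5])
cov = np.array([[1.0, 0.3], [0.3, 0.5]])
X = np.vstack([rng.multivariate_normal(mu0, cov, n),
               rng.multivariate_normal(mu1, cov, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])

def fit_gcc(X, y):
    """Generative fit: class means, one shared covariance, class prior p(y=1)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    centered = np.vstack([X0 - m0, X1 - m1])
    S = centered.T @ centered / len(X)
    return m0, m1, S, len(X1) / len(X)

def gcc_linear_boundary(m0, m1, S, prior1):
    """With a shared covariance, ln p(y=1|x)/p(y=0|x) = w.x + b is linear in x."""
    w = np.linalg.solve(S, m1 - m0)
    b = -0.5 * (m1 + m0) @ w + np.log(prior1 / (1.0 - prior1))
    return w, b

def fit_lr(X, y, steps=2000, lr=0.1):
    """Discriminative fit: gradient descent on the mean logistic loss."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    theta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ theta))
        theta -= lr * Xb.T @ (p - y) / len(y)
    return theta[:-1], theta[-1]

def accuracy(w, b, X, y):
    return np.mean((X @ w + b > 0) == (y == 1))

m0, m1, S, prior1 = fit_gcc(X, y)
w_gcc, b_gcc = gcc_linear_boundary(m0, m1, S, prior1)
w_lr, b_lr = fit_lr(X, y)
acc_gcc = accuracy(w_gcc, b_gcc, X, y)
acc_lr = accuracy(w_lr, b_lr, X, y)
print("GCC training accuracy:", acc_gcc)
print("LR  training accuracy:", acc_lr)

# Only the generative model describes p(x | y), so only it can produce new inputs.
samples = rng.multivariate_normal(m1, S, size=5)
print("new x sampled from class 1:\n", samples)
```

Because the synthetic data really are Gaussian with equal covariances, both classifiers should do comparably well here; the differences described above would show up on data that violate the Gaussian assumption.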

## Classification by LS Regression

One tempting way to perform classification is with least-squares regression. That is, we could treat the class labels $y \in\{-1,1\}$ as real numbers, and estimate the weights by minimizing
$$E(\mathbf{w})=\sum_i\left(y_i-\mathbf{x}_i^T \mathbf{w}\right)^2$$
for labeled training data $\left\{\mathbf{x}_i, y_i\right\}$. Given the optimal regression weights, one could then perform regression on subsequent test inputs and use the sign of the output to determine the output class.

In simple cases this can perform well, but in general it will perform poorly. This is because the objective function in linear regression measures the distance from the modeled class labels (which can be any real number) to the true class labels, which may not provide an accurate measure of how well the model has classified the data. For example, a linear regression model will tend to produce predicted labels that lie outside the range of the class labels for “extreme” members of a given class (e.g., 5 when the class label is 1), causing the error to be measured as high even when the classification (given, say, by the sign of the predicted label) is correct. In such a case the decision boundary may be shifted towards such an extreme case, potentially reducing the number of correct classifications made by the model. Figure 13 demonstrates this with a simple example.

The problem arises from the fact that the constraint $y \in\{-1,1\}$ is not built into the model (the regression algorithm knows nothing about it), so the model wastes considerable representational power trying to reproduce this effect. It is much better to build this constraint into the model.
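
The boundary shift described above can be reproduced with a tiny one-dimensional example. The data below are made up for illustration (they are not the data behind Figure 13), and a bias term is added to the inputs so that the fitted line is $w_0 + w_1 x$; the helper names `fit_ls` and `accuracy` are hypothetical.

```python
import numpy as np

def fit_ls(x, y):
    """Least-squares fit of y ~ w0 + w1 * x; returns (w0, w1)."""
    A = np.column_stack([np.ones_like(x), x])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def accuracy(w, x, y):
    """Fraction of points whose label matches the sign of the regression output."""
    return np.mean(np.sign(w[0] + w[1] * x) == y)

# A symmetric, linearly separable data set.
x_clean = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y_clean = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
w_clean = fit_ls(x_clean, y_clean)

# Add two "extreme" members of the +1 class, far from the boundary.
x_ext = np.append(x_clean, [20.0, 25.0])
y_ext = np.append(y_clean, [1.0, 1.0])
w_ext = fit_ls(x_ext, y_ext)

# The decision boundary is where w0 + w1 * x = 0.
boundary_clean = -w_clean[0] / w_clean[1]
boundary_ext = -w_ext[0] / w_ext[1]
acc_clean = accuracy(w_clean, x_clean, y_clean)
acc_ext = accuracy(w_ext, x_ext, y_ext)

print("boundary without extreme points:", boundary_clean)
print("boundary with extreme points:   ", boundary_ext)
print("accuracy without / with:", acc_clean, acc_ext)
```

In this example the boundary moves from 0 to roughly 1.25, so the point at $x = 1$ ends up on the wrong side even though a threshold at 0 would still separate all eight points perfectly.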
