Posted on Categories:Business Statistics, 商业统计, 商科代写

If you had to put a number (say, between 0 and 1) on the strength of the linear association between house prices and sizes in Figure 4.1, what would it be? Your measure shouldn’t depend on the choice of units for the variables. Zillow could have reported the house sizes in square meters and the prices in thousands of dollars, but regardless of the units, the scatterplot would look the same. When we change units, the direction, form, and strength won’t change, so neither should our measure of the association’s (linear) strength.

We saw a way to remove the units in the previous chapter. We can standardize each of the variables, finding $z_{x}=\frac{x-\bar{x}}{s_{x}}$ and $z_{y}=\frac{y-\bar{y}}{s_{y}}$. With these, we can compute a measure of strength that you’ve probably heard of, the correlation coefficient:
$$r=\frac{\sum z_{x} z_{y}}{n-1}$$
Keep in mind that the $x$’s and $y$’s are paired. For each house we have a price and a living area. To find the correlation, we multiply each standardized value by the standardized value it is paired with and add up those cross products. We divide the total by the number of pairs minus one, $n-1$. ${ }^{2}$

There are alternative formulas for the correlation in terms of the variables $x$ and $y$. Here are two of the more common:
$$r=\frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^{2} \sum(y-\bar{y})^{2}}}=\frac{\sum(x-\bar{x})(y-\bar{y})}{(n-1) s_{x} s_{y}}$$
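As a rough check on the formulas above, the following sketch computes $r$ three ways and then again after a change of units. The house sizes and prices are made-up illustrative values, not the Zillow data behind Figure 4.1, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev  # stdev uses the n - 1 divisor


def r_from_z(x, y):
    """Sum of products of paired z-scores, divided by n - 1."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    zx = [(a - mx) / sx for a in x]
    zy = [(b - my) / sy for b in y]
    return sum(p * q for p, q in zip(zx, zy)) / (len(x) - 1)


def r_from_sums(x, y):
    """Cross-product sum over the square root of the two sums of squares."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_from_sd(x, y):
    """Cross-product sum over (n - 1) * s_x * s_y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / ((len(x) - 1) * stdev(x) * stdev(y))


# Hypothetical paired data: living area (square feet) and price (dollars).
size_sqft = [1200, 1500, 1700, 2100, 2400, 3000]
price_usd = [150000, 185000, 210000, 240000, 310000, 355000]

r_z = r_from_z(size_sqft, price_usd)
r_sums = r_from_sums(size_sqft, price_usd)
r_sd = r_from_sd(size_sqft, price_usd)

# Same houses in different units: square meters and thousands of dollars.
size_sqm = [s * 0.092903 for s in size_sqft]
price_k = [p / 1000 for p in price_usd]
r_metric = r_from_z(size_sqm, price_k)

print(r_z, r_sums, r_sd, r_metric)
```

All four values should agree up to floating-point rounding, which is the point of standardizing: the units cancel out of $r$.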

Correlation measures the strength of the linear association between two quantitative variables. Before you use correlation, you must check three conditions:

• Quantitative Variables Condition: Correlation applies only to quantitative variables. Don’t apply correlation to categorical data masquerading as quantitative. Check that you know the variables’ units and what they measure.
• Linearity Condition: Sure, you can calculate a correlation coefficient for any pair of variables. But correlation measures the strength only of the linear association and will be misleading if the relationship is not straight enough. What is “straight enough”? This question may sound too informal for a statistical condition, but that’s really the point. We can’t verify whether a relationship is linear or not. Very few relationships between variables are perfectly linear, even in theory, and scatterplots of real data are never perfectly straight. How nonlinear looking would the scatterplot have to be to fail the condition? This is a judgment call that you just have to think about. Do you think that the underlying relationship is curved? If so, then summarizing its strength with a correlation would be misleading.
• Outlier Condition: Unusual observations can distort the correlation and can make an otherwise small correlation look big or, on the other hand, hide a large correlation. An outlier can even give an otherwise positive association a negative correlation coefficient (and vice versa). When you see an outlier, it’s often a good idea to report the correlation both with and without the point.
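To illustrate the Outlier Condition, here is a minimal sketch that reports the correlation with and without one unusual point. The data are invented for the illustration, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Correlation from the sums-of-squares form of the formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up data with a clear positive, roughly linear association.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8]

# One unusual observation far from the rest of the data.
outlier_x, outlier_y = 30, -20

r_without = pearson_r(x, y)
r_with = pearson_r(x + [outlier_x], y + [outlier_y])

print("without outlier:", round(r_without, 3))
print("with outlier:   ", round(r_with, 3))
```

In this invented example a single point is enough to turn a strong positive correlation into a negative one, which is why reporting both values is often worthwhile.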

Each of these conditions is easy to check with a scatterplot. Many correlations are reported without supporting data or plots. You should still think about the conditions. You should be cautious in interpreting (or accepting others’ interpretations of) the correlation when you can’t check the conditions for yourself.

Throughout this course, you’ll see that doing statistics right means selecting the proper methods. That means you have to think about the situation at hand. An important first step is to check that the type of analysis you plan is appropriate. These conditions are just the first of many such checks.




## Business Statistics | ECON225 Looking at Scatterplots

The Texas Transportation Institute (TTI), founded in 1950, studies and develops solutions to challenges faced by all forms of transportation. Although the goal is to solve the transportation problems of 2025, the TTI started collecting data on the relationship between freeway speed and the cost to society of freeway congestion as early as the year 2000. Figure 4.2 shows a scatterplot of the annual Congestion Cost Per Person of traffic delays (in dollars) in 65 cities in the United States against the Peak Period Freeway Speed (mph).

If you want to describe the scatterplot of Congestion Cost against Freeway Speed, you might first mention the direction of the association. As the peak freeway speed goes up, the cost of congestion goes down. A pattern that runs from the upper left to the lower right is said to be negative. A pattern running the other way, as we saw for the price and size of houses, is called positive.

The second thing to look for in a scatterplot is its form. If there is a straight line relationship, it will appear as a cloud or swarm of points stretched out in a generally consistent, straight form. For example, the scatterplot of house prices (Figure 4.1) has an underlying linear form, although some points stray away from it.

Scatterplots can reveal many different kinds of patterns. Often they will not be straight, but straight line patterns are both the most common and the most useful for statistics.

If the relationship isn’t straight, but curves gently while still increasing or decreasing, we can often find ways to straighten it out by re-expressing one (or both) of the variables. But if it curves sharply (up and then down, for example), then you’ll need more advanced methods.

The third feature to look for in a scatterplot is the strength of the relationship.
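The idea of straightening a gently curved relationship by re-expressing a variable can be sketched as follows. The data are invented (roughly exponential growth with small wobbles), the logarithm is just one possible re-expression, and `pearson_r` is a hypothetical helper.

```python
from math import log, sqrt


def pearson_r(x, y):
    """Correlation from the sums-of-squares form of the formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up data: y grows by about 50% per step in x, with small deviations,
# so the scatterplot of y against x bends upward.
x = list(range(1, 11))
wobble = [1.05, 0.95, 1.02, 0.97, 1.04, 0.98, 1.01, 0.96, 1.03, 0.99]
y = [3 * 1.5 ** xi * w for xi, w in zip(x, wobble)]

# Re-express y on a log scale; the curved pattern becomes nearly straight.
log_y = [log(v) for v in y]

r_raw = pearson_r(x, y)
r_log = pearson_r(x, log_y)

print("r for (x, y):     ", round(r_raw, 3))
print("r for (x, log y): ", round(r_log, 3))
```

The correlation after re-expression is closer to 1 because the log scale removes the bend, not because the association itself changed. A scatterplot of the re-expressed data is still the right way to confirm that the form is straight enough.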

## Business Statistics | Assigning Roles to Variables in Scatterplots

To make a scatterplot of two quantitative variables, assign one to the $y$-axis and the other to the $x$-axis. ${ }^{1}$ As with any graph, be sure to label the axes clearly, and indicate the scales of the axes with numbers. Scatterplots display quantitative variables. Each variable has units, and these should appear with the display, usually near each axis. Each point is placed on a scatterplot at a position that corresponds to the values of these two variables. Its horizontal location is specified by its $x$-value, and its vertical location is specified by its $y$-value. Together, these are known as coordinates and written $(x, y)$.

Scatterplots made by computer programs (such as the two we’ve seen in this chapter) often do not, and usually should not, show the origin, the point at $x=0, y=0$ where the axes meet. If both variables have values near or on both sides of zero, then the origin will be part of the display. If the values are far from zero, though, there’s no reason to include the origin. In fact, it’s far better to focus on the part of the Cartesian plane that contains the data. In our example about house prices, none of the houses were free and all had some area, so the computer drew the scatterplot in Figure 4.1 with axes that don’t quite meet.



A histogram can provide information about the distribution of a variable, but it can’t show any pattern over time. A time series variable is a quantitative variable that has been measured or recorded at regular intervals over time. Usually, we require that the intervals are equally spaced, although values recorded for successive business days (skipping weekends and holidays) are usually treated as equally spaced. Whenever we have time series data, it is a good idea to look for patterns by plotting the data in time order.

When a time series has no strong trend or change in variability, we say that it is stationary. ${ }^{9}$ A histogram can provide a useful summary of a stationary series but generally misses what is really going on in a variable that changes over time.
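A small sketch of why a histogram cannot show a pattern over time: the two invented series below contain exactly the same values, so their bin counts are identical, yet only one of them trends. The bin width and the use of correlation with the time index as a crude trend indicator are assumptions made for the illustration.

```python
from collections import Counter
from math import sqrt


def bin_counts(values, width):
    """Histogram counts keyed by the lower edge of each bin."""
    return Counter((v // width) * width for v in values)


def pearson_r(x, y):
    """Correlation from the sums-of-squares form of the formula."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up series recorded at 12 equally spaced times, with an upward trend.
trending = [10, 12, 11, 14, 15, 17, 16, 19, 21, 20, 23, 25]

# The same 12 values in a different time order, with no steady trend.
reordered = [10, 25, 11, 23, 12, 21, 14, 20, 15, 19, 16, 17]

time_index = list(range(len(trending)))

same_histogram = bin_counts(trending, 5) == bin_counts(reordered, 5)
r_trending = pearson_r(time_index, trending)
r_reordered = pearson_r(time_index, reordered)

print("identical histograms:", same_histogram)
print("trend indicator, trending series: ", round(r_trending, 2))
print("trend indicator, reordered series:", round(r_reordered, 2))
```

Because the histogram discards the order of the observations, only a plot of the values in time order can distinguish the two series.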

When a distribution is skewed, it may not be appropriate to summarize the data simply with a center and spread, and it can be hard to decide whether the most extreme values are outliers or just part of the stretched-out tail. How can we say anything useful about such data? The secret is to apply a simple function to each data value. One function that can change the shape of a distribution is the logarithm function. Let’s examine an example in which a set of data is severely skewed.

In 1980, the average CEO made about 42 times the average worker’s salary. In the two decades that followed, CEO compensation soared when compared with the average worker’s pay. What does the distribution of a sample of 434 companies’ CEOs look like? Figure 3.16 shows a histogram of the CEO compensation from a recent year.

These values are reported in millions of dollars. The boxplot indicates that some of the CEOs received extraordinarily high compensation. The reason that the histogram seems to leave so much of the area blank is that the largest observations are so far from the bulk of the data. This distribution is very skewed to the right.
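The effect of applying a logarithm to each value of a right-skewed variable can be sketched with a few invented numbers. These are not the 434 CEO compensations from Figure 3.16, and the moment-based skewness measure used here is an assumption chosen for the illustration.

```python
from math import log10
from statistics import mean, median


def skewness(values):
    """Moment-based skewness: third central moment over (second)**1.5."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up compensation values in millions of dollars, with a long right tail.
pay = [0.8, 1.1, 1.5, 2.0, 2.4, 3.1, 4.0, 5.5, 9.0, 25.0, 120.0]

# Re-express each value with the base-10 logarithm.
log_pay = [log10(v) for v in pay]

skew_raw = skewness(pay)
skew_log = skewness(log_pay)

print("raw: mean", round(mean(pay), 2), "median", median(pay),
      "skewness", round(skew_raw, 2))
print("log: mean", round(mean(log_pay), 2), "median", round(median(log_pay), 2),
      "skewness", round(skew_log, 2))
```

On the raw scale the mean sits far above the median because of the few very large values. On the log scale the two are much closer and the skewness is smaller, so a center and spread describe the re-expressed data more faithfully.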

