# Microeconomics: ECON1001 Models of learning in dynamic situations

## Models of learning in dynamic situations

If we keep the hypothesis of a repeated decision problem but move from a static decision problem to a dynamic one, two types of learning model can be envisaged. Firstly, we can continue to apply the above models of learning, adapting them to the dynamic context. One possibility is to translate the decision problem from extensive form into normal form by introducing the decision-maker’s strategies, and then to apply the above methods to those strategies. Thus, the CPR model is applicable to the decision-maker’s strategies when their performances can be observed. Another possibility is to keep the decision problem in extensive form and to apply the above methods at each node of the decision tree. The CPR model is then applicable by considering that, at each successive run of the decision process, the utility obtained by the decision-maker is attributed simultaneously to all the actions on the trajectory followed in the decision tree. Secondly, we can draw directly on the classical rules of choice proposed for dynamic decision situations. This is all the more necessary as these choice rules, based on the backward induction procedure, demand a high capacity for information processing (Sutton and Barto, 1998).
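The node-by-node variant can be sketched in a few lines. This is only an illustration, assuming that the CPR model is a cumulative reinforcement rule in which each action is drawn with probability proportional to its accumulated utility. The names `new_index`, `choose` and `reinforce_trajectory`, the node labels and the initial index of 1.0 are hypothetical.

```python
import random
from collections import defaultdict


def new_index(initial=1.0):
    """Cumulative utility index per (node, action) pair.

    The positive initial value is an assumption, so that every action can be drawn.
    """
    return defaultdict(lambda: initial)


def choose(index, node, actions, rng=random):
    """Draw an action with probability proportional to its cumulative index.

    Assumes non-negative utilities, so that all weights stay positive.
    """
    weights = [index[(node, a)] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def reinforce_trajectory(index, trajectory, utility):
    """Credit the utility obtained to every (node, action) pair on the path followed."""
    for node, action in trajectory:
        index[(node, action)] += utility


index = new_index()
path = [("root", "left"), ("left", "up")]   # trajectory followed in the tree
reinforce_trajectory(index, path, 3.0)      # both pairs on the path receive 3.0
next_action = choose(index, "root", ["left", "right"])
```

After this single run, "left" is drawn at the root with weight 4.0 against 1.0 for "right", so actions on rewarded trajectories become progressively more likely.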

A learning model proposed early on in Artificial Intelligence is the “Q-learning model” (Watkins, 1989), which applies to a stochastic decision process. As a reinforcement model, it does not presuppose a priori knowledge of the characteristics of the decision process (transition probabilities and utilities), although such knowledge helps to accelerate learning. The model revises the “expected local utility” $U_h^i$ each time the decision-maker uses the action $i$ in the configuration $h$ (which he does for the $n_h^i$-th time) and finds himself in the configuration $k$, obtaining the utility $u_{hk}^i$. The revision rule is adapted from the Bellman equation and is written:
$$\Delta U_h^i=a\left(n_h^i\right)\left[\delta U_k+u_{hk}^i-U_h^i\right]$$
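where $a\left(n_h^i\right)$ is a decreasing averaging function (often $a(n)=1/n$). In the usual reading of Q-learning, which the text leaves implicit, $\delta$ is the discount factor and $U_k$ is the best expected local utility available in the configuration $k$, that is $\max_j U_k^j$.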
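The revision rule can be illustrated with a short tabular sketch. It assumes that $\delta$ is a discount factor and that $U_k$ is the highest expected local utility in the configuration $k$, as in standard Q-learning; the source does not define either symbol. The class name `QLearner`, its methods and the default `delta=0.9` are hypothetical.

```python
from collections import defaultdict


class QLearner:
    """Tabular Q-learning with the averaging step a(n) = 1 / n."""

    def __init__(self, actions, delta=0.9):
        self.actions = list(actions)   # same actions in every configuration (assumption)
        self.delta = delta             # discount factor (assumed meaning of delta)
        self.U = defaultdict(float)    # expected local utilities U[(h, i)], initially 0
        self.n = defaultdict(int)      # n[(h, i)]: times action i was used in h

    def value(self, k):
        """U_k, taken here as the best expected local utility in configuration k."""
        return max(self.U[(k, j)] for j in self.actions)

    def update(self, h, i, k, u):
        """Revise U[(h, i)] after using i in h, reaching k and obtaining utility u."""
        self.n[(h, i)] += 1
        a = 1.0 / self.n[(h, i)]       # decreasing averaging weight
        self.U[(h, i)] += a * (self.delta * self.value(k) + u - self.U[(h, i)])
        return self.U[(h, i)]


learner = QLearner(actions=["a", "b"], delta=0.5)
first = learner.update("h", "a", "k", 2.0)    # n = 1, so the estimate jumps to the target
second = learner.update("h", "a", "k", 4.0)   # n = 2, so the estimate moves halfway
```

With $a(n)=1/n$, the estimate is the running average of the targets $\delta U_k+u_{hk}^i$ observed so far, so early observations count as much as later ones.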

## Associated models

Local strategies, which associate an action $i$ with each configuration $h$, can be generalised in the form of “rules” or “classifiers” (Holland, 1987). In this case, a rule associates an action $Y_i$ (possibly multidimensional) with a set of configurations $X_h$ following the principle: “if condition $X_h$, then action $Y_i$”. The condition of the rule groups together the configurations between which the decision-maker makes no distinction, either because of an error in perception on his part or because the action involved does not require any distinction to be made. It can be considered as an operation of categorisation performed by the decision-maker and therefore expresses the degree of granularity with which he apprehends his environment in relation to the action. A rule is activated by the decision-maker if one of the configurations in its condition actually occurs. Of course, several rules may be activated in the same configuration, in which case they find themselves in competition. Moreover, certain rules will be used in a chain to obtain a certain result.
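A rule of this kind can be represented as a small data structure. The sketch below is illustrative only: the `Rule` class, the `activated` function and the weather configurations are hypothetical and do not come from Holland's classifier systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    condition: frozenset   # configurations X_h that the decision-maker does not distinguish
    action: str            # action Y_i taken when the condition holds


def activated(rules, configuration):
    """Return the rules whose condition contains the configuration that occurred."""
    return [r for r in rules if configuration in r.condition]


rules = [
    Rule(frozenset({"rain", "storm"}), "take umbrella"),  # coarse category
    Rule(frozenset({"storm"}), "stay home"),              # finer category
    Rule(frozenset({"sun"}), "walk"),
]

competing = activated(rules, "storm")   # two rules are activated and compete
single = activated(rules, "rain")       # only the coarse rule is activated
```

The size of each condition set reflects the granularity of the categorisation: the first rule treats rain and storm as the same situation, while the second singles out storms.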

To each rule is attributed a utility or “strength” $U_h^i$, which evolves over time according to an algorithm close to Q-learning, the “bucket brigade” algorithm. In each configuration $h$, the admissible rules make “bids” $\mu U_h^i$ and one of them is chosen with a probability that depends on its bid:
$$p_h^i \propto e^{\mu U_h^i}$$
This rule loses its bid, but receives a reward from two sources:

• from the external environment (if the rule acts on the external environment through the action $i$, which provides a utility $u_h^i$):
$$\Delta U_h^i=u_h^i-\mu U_h^i$$
• from the internal environment (if the rule acts on the internal environment by causing a transition to the state $k$, thus triggering a new rule, whose action is $j$ and from which it receives the bid):
$$\Delta U_h^i=\mu U_k^j-\mu U_h^i$$
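The bid-based choice and the two revision formulas above can be put together in a brief sketch. It is a minimal illustration under stated assumptions: the rule labels, the value $\mu=0.1$ and the function names are hypothetical, and rule utilities are stored in a plain dictionary.

```python
import math
import random


def choice_probabilities(utilities, mu):
    """Probabilities proportional to exp(mu * U) over the admissible rules."""
    top = max(mu * u for u in utilities.values())   # subtracted for numerical stability
    weights = {r: math.exp(mu * u - top) for r, u in utilities.items()}
    total = sum(weights.values())
    return {r: w / total for r, w in weights.items()}


def select_rule(utilities, mu, rng=random):
    """Draw one admissible rule according to its bid."""
    probs = choice_probabilities(utilities, mu)
    names = list(probs)
    return rng.choices(names, weights=[probs[r] for r in names], k=1)[0]


def external_update(U, mu, utility):
    """Delta U = u - mu * U: the rule pays its bid and receives the external utility."""
    return U + utility - mu * U


def internal_update(U, U_next, mu):
    """Delta U = mu * U_next - mu * U: the rule pays its bid and receives the next rule's bid."""
    return U + mu * U_next - mu * U


mu = 0.1                                          # illustrative bid fraction
utilities = {"rule_a": 10.0, "rule_b": 10.0}      # hypothetical admissible rules
probs = choice_probabilities(utilities, mu)       # equal utilities give equal chances
chosen = select_rule(utilities, mu)
after_external = external_update(10.0, mu, 3.0)   # 10 + 3 - 1
after_internal = internal_update(10.0, 20.0, mu)  # 10 + 2 - 1
```

In the internal case the payment travels backwards along the chain of rules, so a rule that sets up a later reward is reinforced through the bid of the rule it triggers.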
