## Neuron Model

Research on neural networks started quite a long time ago, and it has become a broad and interdisciplinary research field today. Though neural networks have various definitions across disciplines, this book uses a widely adopted one: “Artificial neural networks are massively parallel interconnected networks of simple (usually adaptive) elements and their hierarchical organizations which are intended to interact with the objects of the real world in the same way as biological nervous systems do” (Kohonen 1988). In the context of machine learning, neural networks refer to “neural networks learning”, or in other words, the intersection of machine learning research and neural networks research.

The basic element of a neural network is the neuron, which is the “simple element” in the above definition. In biological neural networks, a neuron that is “excited” sends neurotransmitters to the neurons connected to it, changing their electric potentials. When the electric potential of a neuron exceeds a threshold, that neuron is activated (i.e., “excited”) and in turn sends neurotransmitters to other neurons.

McCulloch and Pitts (1943) abstracted the above process into a simple model called the McCulloch-Pitts model (M-P neuron model), which is still in use today. As illustrated in Figure 5.1, each neuron in the M-P neuron model receives input signals $x_i$ from $n$ neurons via connections with weights $w_i$. The weighted sum of the received signals is compared against the neuron's threshold $\theta$, and the output signal is produced by the activation function $f$, that is, $y=f\left(\sum_i w_i x_i-\theta\right)$.
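The computation of a single M-P neuron can be sketched in a few lines of Python. This is only an illustration: the names `step` and `mp_neuron`, the example weights, and the convention that the step function outputs 1 when its argument is exactly zero are assumptions, not part of the original text.

```python
from typing import Callable, Sequence


def step(z: float) -> int:
    """Step activation: 1 (excited) if z >= 0, else 0 (assumed convention at z == 0)."""
    return 1 if z >= 0 else 0


def mp_neuron(
    inputs: Sequence[float],
    weights: Sequence[float],
    theta: float,
    activation: Callable[[float], float] = step,
) -> float:
    """Compute f(sum_i w_i * x_i - theta) for one neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum - theta)


# Hypothetical values: three input signals with hand-picked weights.
weights = [0.5, 0.25, 0.5]
print(mp_neuron([1.0, 0.0, 1.0], weights, theta=0.75))  # weighted sum 1.0 exceeds 0.75
print(mp_neuron([0.0, 1.0, 0.0], weights, theta=0.75))  # weighted sum 0.25 is below 0.75
```

Passing the activation as a parameter keeps the weighted-sum-and-threshold logic separate from the choice of $f$, so the same sketch works with a smooth activation as well.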

The ideal activation function is the step function illustrated in Figure 5.2a, which maps the input value to the output value “0” (non-excited) or “1” (excited). Because the step function is discontinuous and non-smooth, the sigmoid function is often used instead. Figure 5.2b illustrates a typical sigmoid function, which squashes input values from a large interval into the open unit interval $(0,1)$ and is therefore also known as a squashing function.
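To make the contrast concrete, the sketch below compares the two activation functions on a few inputs. It assumes that the "typical sigmoid" is the logistic function $1/(1+e^{-z})$ and that the step function outputs 1 at $z=0$. Both are common choices but are assumptions here.

```python
import math


def step(z: float) -> int:
    """Step activation: jumps from 0 to 1 at z == 0 (discontinuous)."""
    return 1 if z >= 0 else 0


def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + exp(-z)), arranged to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# The step output jumps at zero, while the sigmoid changes smoothly.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"z={z:+.1f}  step={step(z)}  sigmoid={sigmoid(z):.4f}")
```

Mathematically the sigmoid never reaches 0 or 1, although in floating-point arithmetic it rounds to those values for inputs of very large magnitude.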

## Perceptron and Multi-layer Network

A perceptron is a binary classifier consisting of two layers of neurons, as illustrated in Figure 5.3. The input layer receives external signals and transmits them to the output layer, which is an M-P neuron, also known as a threshold logic unit.

A perceptron can easily implement the logic operations “AND”, “OR”, and “NOT”. Suppose the function $f$ in $y=f\left(\sum_i w_i x_i-\theta\right)$ is the step function shown in Figure 5.2; then the logic operations can be implemented as follows:

- “AND” $\left(x_1 \wedge x_2\right)$: letting $w_1=w_2=1$, $\theta=2$, then $y=f\left(1 \cdot x_1+1 \cdot x_2-2\right)$, and $y=1$ if and only if $x_1=x_2=1$;
- “OR” $\left(x_1 \vee x_2\right)$: letting $w_1=w_2=1$, $\theta=0.5$, then $y=f\left(1 \cdot x_1+1 \cdot x_2-0.5\right)$, and $y=1$ when $x_1=1$ or $x_2=1$;
- “NOT” $\left(\neg x_1\right)$: letting $w_1=-0.6$, $w_2=0$, $\theta=-0.5$, then $y=f\left(-0.6 \cdot x_1+0 \cdot x_2+0.5\right)$, and $y=0$ when $x_1=1$ and $y=1$ when $x_1=0$.
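The three weight settings above can be checked by enumerating all binary inputs. The following is a minimal sketch; the names `perceptron`, `GATES`, and `truth_table` are hypothetical, and the step function is assumed to output 1 when its argument is exactly zero, which the “AND” case requires because $1+1-2=0$ must give $y=1$.

```python
from itertools import product


def step(z: float) -> int:
    """Step activation with the assumed convention step(0) == 1."""
    return 1 if z >= 0 else 0


def perceptron(x1: int, x2: int, w1: float, w2: float, theta: float) -> int:
    """Two-input perceptron y = f(w1*x1 + w2*x2 - theta)."""
    return step(w1 * x1 + w2 * x2 - theta)


# (w1, w2, theta) for each logic operation, taken from the list above.
GATES = {
    "AND": (1.0, 1.0, 2.0),
    "OR": (1.0, 1.0, 0.5),
    "NOT": (-0.6, 0.0, -0.5),  # x2 is ignored because w2 = 0
}


def truth_table(name: str) -> dict:
    """Map every input pair (x1, x2) to the perceptron output for the named gate."""
    w1, w2, theta = GATES[name]
    return {
        (x1, x2): perceptron(x1, x2, w1, w2, theta)
        for x1, x2 in product((0, 1), repeat=2)
    }


for name in GATES:
    print(name, truth_table(name))
```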

More generally, the weights $w_i$ $(i=1,2, \ldots, n)$ and the threshold $\theta$ can be learned from training data. If we treat the threshold $\theta$ as a dummy node with connection weight $w_{n+1}$ and fixed input $-1.0$, then learning the weights and learning the threshold are unified as weight learning. Perceptron learning is simple: for a training sample $(\boldsymbol{x}, y)$, if the perceptron outputs $\hat{y}$, then each weight is updated by
$$w_i \leftarrow w_i+\Delta w_i, \tag{5.1}$$
$$\Delta w_i=\eta(y-\hat{y}) x_i, \tag{5.2}$$
where $\eta \in(0,1)$ is known as the learning rate. From (5.1) we can see that the perceptron remains unchanged if it correctly predicts the sample $(\boldsymbol{x}, y)$ (i.e., $\hat{y}=y$), since then $\Delta w_i=0$ for every $i$. Otherwise, the weights are updated in proportion to the error $y-\hat{y}$.
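The update rule can be sketched as a small training loop. This is an illustration under stated assumptions rather than a reference implementation: the weights start at zero, the learning rate is $\eta=0.5$, training stops after the first pass without mistakes or after `max_epochs` passes, and the step function outputs 1 at zero. The names `predict` and `train_perceptron` are hypothetical. The threshold is handled as the dummy weight $w_{n+1}$ with fixed input $-1.0$, as described above.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]


def step(z: float) -> int:
    """Step activation with the assumed convention step(0) == 1."""
    return 1 if z >= 0 else 0


def predict(weights: Sequence[float], x: Sequence[float]) -> int:
    """Perceptron output; the last weight acts as theta via a fixed input of -1.0."""
    augmented = list(x) + [-1.0]
    return step(sum(w * xi for w, xi in zip(weights, augmented)))


def train_perceptron(
    samples: Sequence[Sample], eta: float = 0.5, max_epochs: int = 100
) -> List[float]:
    """Apply w_i <- w_i + eta * (y - y_hat) * x_i sample by sample."""
    n = len(samples[0][0])
    weights = [0.0] * (n + 1)  # assumed zero initialisation; index n is the threshold
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            y_hat = predict(weights, x)
            if y_hat == y:
                continue  # correct prediction: the update term is zero
            mistakes += 1
            augmented = list(x) + [-1.0]
            for i, xi in enumerate(augmented):
                weights[i] += eta * (y - y_hat) * xi
        if mistakes == 0:
            break  # every sample is classified correctly
    return weights


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

w_and = train_perceptron(AND_SAMPLES)
print("AND:", [predict(w_and, x) for x, _ in AND_SAMPLES])
w_or = train_perceptron(OR_SAMPLES)
print("OR:", [predict(w_or, x) for x, _ in OR_SAMPLES])
```

The learned weights generally differ from the hand-picked values given earlier for “AND” and “OR”, since many weight settings separate the same samples.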

