## Analytics from Machine Learning Literature

The technical trading rules to be discussed in Chapter 6 use a relatively small information set to carry out predictions. They treat each of the $n$ time series of asset returns as autonomous. As pointed out by Malkiel (2012) [258], most market participants could easily implement these rules if such opportunities emerged, so market efficiency would rule them out as winning strategies. Taking advantage of transient opportunities requires more powerful prediction methods that use a very large set of potential predictors and yet avoid overfitting. We gave an overview of recent advances in high-dimensional regression in Chapter 4. Classification techniques can also be formulated with a high-dimensional feature vector. In machine learning and computer science, the focus is on rapidly handling different types of data (including text). The literature in this field is vast and so richly detailed that a single chapter would not do justice to the topic. Consequently, we only briefly mention tools that are deemed relevant to trading, and invite interested readers to further their knowledge with dedicated references such as Goodfellow, Bengio, and Courville (2016) [168] and Lopez de Prado (2018) [252].

Machine learning is a still-growing area of computer science that encompasses other well-established areas such as statistics, computational algorithms, and control theory. The focus in machine learning is on developing efficient algorithms for prediction or classification using large data sets. Efficiency is gauged by predictive validation accuracy. The inferential aspects of statistical theory, such as standard errors of the estimates and confidence intervals, are generally not of much concern. Areas where machine learning methods have made significant contributions include classification, clustering, and multi-dimensional regression. The attractiveness of these methods lies in the fact that they do not need any a priori theory to suggest which variables to consider. With no variables prescribed in advance, variable (or feature) selection becomes an important process in machine learning. The statistical foundations of machine learning methods are also referred to as statistical learning methods. We provide a brief description of a select few methods in this section.

## Neural Networks

Neural networks are one of the main methods of machine learning: the network learns to perform a task by analyzing training examples that have been hand-labeled in advance. In its simplest form, a neural network can be drawn as a diagram in which square boxes contain the $n$-dimensional input (feature) vector $X$ and the $m$-dimensional output vector $Y$, and a circular box indicates the hidden (unknown) layer in between that connects the input to the output. Structurally, neural networks are two-stage regression or classification models with intermediary layers, and conceptually the model is similar to the reduced-rank regression model (3.16). Here, however, the intermediate vector $Z$ is generally a non-linear function of $X$. In the ordinary least squares regression model, the goal is to get $m$ linear combinations of $X$ that best predict $Y$. In a neural network, the $r$-dimensional intermediaries can vary with the assumed hidden layers; the key point is that the output vector $Y$ can be a non-linear function of $X$ via the hidden layers, and $r$ can be larger than $n$. In its simplest form the neural network model can be written as follows:
$$\begin{aligned} &Z_{i}=\sigma\left(\beta_{i}^{\prime} X\right), \quad i=1,2, \ldots, r \\ &Y_{j}=g_{j}(Z), \quad j=1,2, \ldots, m, \end{aligned} \tag{4.56}$$
where the function $\sigma(u)=\frac{1}{1+e^{-u}}$ is the sigmoid function. Note that this is the function used in logistic regression. In the regression set-up, $g_{j}(Z)=\alpha_{j}^{\prime} Z$, but in the $m$-class classification,
$$g_{j}(Z)=\frac{e^{\alpha_{j}^{\prime} Z}}{\sum_{l=1}^{m} e^{\alpha_{l}^{\prime} Z}}$$
called the softmax function, is used. In the set-up given in (4.56), we assume only one hidden layer, but practical applications assume many layers, which leads to non-uniqueness problems. As shown in Hastie, Tibshirani and Friedman (2009) [184], the neural network problem is closely related to a non-parametric method called projection pursuit regression. When information flows in only one direction, from input to output, the model is called a feed-forward neural net; this form is commonly used in many applications.
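
The single-hidden-layer model above can be sketched in a few lines of NumPy. This is only an illustrative forward pass under stated assumptions, not a trained or production network: the weight matrices `B` (rows $\beta_i^{\prime}$) and `A` (rows $\alpha_j^{\prime}$) are filled with random numbers rather than estimated, the dimensions `n`, `r`, `m` are arbitrary, and the function names `sigmoid`, `softmax`, and `forward` are hypothetical helpers introduced here. As in the text, there are no bias terms.

```python
import numpy as np

def sigmoid(u):
    """Sigmoid sigma(u) = 1 / (1 + exp(-u)), applied elementwise."""
    return 1.0 / (1.0 + np.exp(-u))

def softmax(t):
    """Softmax of a 1-D score vector; subtracting the max avoids overflow."""
    e = np.exp(t - np.max(t))
    return e / np.sum(e)

def forward(X, B, A, task="regression"):
    """Forward pass of a feed-forward net with one hidden layer.

    X : (n,) input vector
    B : (r, n) matrix whose i-th row is beta_i'
    A : (m, r) matrix whose j-th row is alpha_j'
    """
    Z = sigmoid(B @ X)          # hidden layer: Z_i = sigma(beta_i' X)
    scores = A @ Z              # linear scores alpha_j' Z
    if task == "regression":
        return scores           # g_j(Z) = alpha_j' Z
    if task == "classification":
        return softmax(scores)  # g_j(Z) = softmax over the m classes
    raise ValueError("task must be 'regression' or 'classification'")

# Illustrative dimensions (assumed); note that r may exceed n.
rng = np.random.default_rng(0)
n, r, m = 4, 6, 3
X = rng.normal(size=n)
B = rng.normal(size=(r, n))     # random placeholder weights, not fitted
A = rng.normal(size=(m, r))

y_hat = forward(X, B, A, task="regression")      # m real-valued outputs
probs = forward(X, B, A, task="classification")  # m class probabilities
```

In the classification case the outputs are positive and sum to one, so they can be read as class probabilities; in practice `B` and `A` would be estimated from training data, for example by gradient descent on a squared-error or cross-entropy loss.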

