
# Data Science | Complex Network | PCS810 Finding the root of a growing tree


## Finding the root of a growing tree

Inference is generally about drawing conclusions about the whole on the basis of a sample. Statistical inference is ‘the process of deducing properties of an underlying distribution by analysis of data’ (Zdeborová and Krzakala, 2016). More specifically, in statistical mechanics inference usually means deducing the characteristics of a statistical ensemble (or of its model, which is practically the same) on the basis of a sample (Clauset, Moore, and Newman, 2006). Here we touch upon a more restricted problem. Consider a branching process taking place on a given graph, which started from some unknown initial vertex, a root. At some instant, an observer takes a snapshot of this process and records its result: a tree subgraph of the substrate graph. The questions are: is it possible to guess the root from this observation; and, when it is possible, what is the best root-finding algorithm? The answers to these questions depend on the branching process and on the substrate graph. Remarkably, root finding is possible for a wide range of branching processes and substrate graphs.

Shah and Zaman (2011) proposed the maximum likelihood estimate of the source for what they called rumour spreading on tree or locally tree-like networks. In fact, by rumour spreading they meant the SI model process, where only the order in which vertices become infected turns out to be significant. When the degrees of vertices of a regular locally tree-like substrate are sufficiently large, this process can be naturally substituted with a recursive growing tree without any substrate, and the problem is reformulated as finding the root of a recursive tree generated by some model, for example, the random recursive tree, a preferential attachment recursive tree, etc. For problems of this sort, Shah and Zaman showed that their source estimator, the rumour centrality of a vertex in the resulting tree, is effective in a wide range of situations, allowing us to find the source with finite probability even in infinite trees. The rumour centrality of vertex $i$ in a tree $\mathcal{T}$ of $N$ vertices is defined in the following way:
$$R_i \equiv \frac{N !}{\prod_{j \in \mathcal{T}} N_j(i)},$$
where $N_j(i)$ is the size of the subtree of the tree $\mathcal{T}$, rooted at $j$ and pointing away from $i$. In particular, $N_i(i)=N$ (Figure 13.1). For a given labelled recursive tree $\mathcal{T}$ with a root at vertex $i$, $R_i$ equals exactly the total number of possible orders of attaching the vertices. One can represent a recursive tree by a string of vertex labels in the order of attachment, where the first entry is $i$. Each of these strings is a particular history of a tree started from vertex $i$. The rumour centrality $R_i$ then gives the number of these string histories, which explains the meaning of this metric. The vertex with the largest rumour centrality is taken to be the source. The fraction $p_i=R_i / \sum_{j=1}^N R_j$ is the proportion of histories started with vertex $i$ among all histories resulting in the observed tree $\mathcal{T}$.

Figure 13.1. The rumour centrality of the top vertex, which is the most probable root, is $R=\frac{9!}{9 \times 4 \times 4 \times 3 \times 1 \times 1 \times 1 \times 1 \times 1}=840$. Notice that the vertex with the largest rumour centrality $R$ has the smallest $M$.
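The definition of $R_i$ can be checked numerically. The following is a minimal sketch, not the authors' implementation. The 9-vertex tree is an assumption: the figure is not available here, so the edge list was chosen only because its subtree sizes from vertex 0 are 9, 4, 4, 3, 1, 1, 1, 1, 1, matching the numbers quoted for Figure 13.1. The function names and vertex labels are likewise illustrative.

```python
from math import factorial, prod

def subtree_sizes(adj, root):
    """Return {j: N_j(root)}, the size of the subtree rooted at j pointing away from root."""
    parent = {root: None}
    order = [root]
    for v in order:                      # breadth-first traversal; the list grows as we go
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)
    size = {v: 1 for v in order}
    for v in reversed(order):            # accumulate sizes from the leaves upward
        if parent[v] is not None:
            size[parent[v]] += size[v]
    return size

def rumour_centrality(adj, i):
    """R_i = N! / prod_j N_j(i); always an integer, since it counts histories."""
    sizes = subtree_sizes(adj, i)
    return factorial(len(adj)) // prod(sizes.values())

# Assumed example tree (hypothetical labels 0..8)
edges = [(0, 1), (0, 2), (1, 3), (3, 4), (3, 5), (2, 6), (2, 7), (2, 8)]
adj = {v: [] for v in range(9)}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

R = {i: rumour_centrality(adj, i) for i in adj}
total = sum(R.values())
p = {i: R[i] / total for i in adj}       # share of histories that start at vertex i
best = max(R, key=R.get)                 # estimated source
print(R[0], best)
```

For this tree the sketch gives $R_0 = 840$, matching the quoted value, and vertex 0 has the largest rumour centrality. Each $R_i$ is computed with its own traversal, which is quadratic in $N$ overall; that is adequate for small examples like this one.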

In many real-world situations, measurements do not provide accurate or complete information about all vertices and edges of a network. The data sets employed contain errors and omissions. The fundamental problem is how to estimate network structure from available data, that is, to reconstruct a network (Newman, 2018b, 2018a). Here we touch upon a special case of this problem. Let the available information about a simple graph of $N$ vertices be incomplete, namely only $E$ of its edges are known with certainty. The straightforward way to learn the full structure of this network is to perform $N(N-1) / 2-E$ additional measurements (checks) on the remaining pairs of vertices, which reveals all missing edges. One can, however, apply a far more efficient approach if the available information is sufficient to guess how this network is organized, in other words, to infer a model fitting the measured structure of the network reasonably well. By using this model, one can obtain the probabilities of connection for the $N(N-1) / 2-E$ remaining pairs of vertices and restrict the additional measurements to vertex pairs for which this probability exceeds a specified threshold. The number of such pairs is typically small, which ensures the efficacy of the approach.
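The thresholding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited work: the function `candidate_pairs`, the two-group probability model `toy_prob`, the group labels, and all numerical values are hypothetical and stand in for whatever model is actually inferred from the data.

```python
from itertools import combinations

def candidate_pairs(n, known_edges, prob, threshold):
    """Return unobserved vertex pairs whose model connection probability exceeds threshold.

    prob(u, v) is a hypothetical stand-in for the inferred model.
    """
    known = {frozenset(e) for e in known_edges}
    remaining = [pair for pair in combinations(range(n), 2)
                 if frozenset(pair) not in known]
    # N(N-1)/2 - E pairs remain to be examined
    assert len(remaining) == n * (n - 1) // 2 - len(known)
    return [(u, v) for u, v in remaining if prob(u, v) > threshold]

# Toy model (assumed): two groups, dense inside a group, sparse between groups
group = [0, 0, 0, 0, 1, 1, 1, 1]

def toy_prob(u, v):
    return 0.8 if group[u] == group[v] else 0.05

known_edges = [(0, 1), (1, 2), (4, 5), (5, 6), (3, 4)]
to_check = candidate_pairs(8, known_edges, toy_prob, threshold=0.5)
print(len(to_check), "pairs to measure instead of", 8 * 7 // 2 - len(known_edges))
```

In this toy setting only the within-group pairs that are not yet known pass the threshold, so the number of additional measurements drops from 23 to 8.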

A somewhat related link-prediction problem for evolving, in particular social, networks was formulated by Liben-Nowell and Kleinberg (2007): ‘Given a snapshot of a social network at time $t$, we seek to accurately predict the edges that will be added to the network during the interval from time $t$ to a given future time $t^{\prime}$.’ In this problem, the model of an evolving network should be inferred from its snapshot, allowing one to find the probabilities of the connections in the near future. Sometimes, however, it appears to be sufficient to know that a network belongs to some class, for example, to social networks, and to use empirical observations collected for these networks. Section 5.3 mentioned Newman’s (2001a) observation that the probability of emergence of an edge between two vertices in a collaboration network increases with the number of their common nearest neighbours. Liben-Nowell and Kleinberg used this number as an edge predictor for ranking the potential future connections and found that it works well for the studied networks of collaborations from the arXiv.
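Ranking potential future edges by the number of common nearest neighbours can be sketched as below. This is a minimal illustration on a made-up five-vertex snapshot, not the code or data used by Liben-Nowell and Kleinberg. The function name `common_neighbour_ranking` and the vertex labels are assumptions.

```python
from itertools import combinations

def common_neighbour_ranking(edges):
    """Rank non-adjacent vertex pairs by their number of common neighbours (descending)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    scores = []
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:                          # only pairs not yet connected
            scores.append((len(adj[u] & adj[v]), u, v))
    scores.sort(key=lambda t: -t[0])                 # stable sort keeps ties in label order
    return scores

# Hypothetical snapshot of a small collaboration network
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranking = common_neighbour_ranking(edges)
for score, u, v in ranking:
    print(u, v, score)
```

Here the pair (a, d), which shares the two neighbours b and c, is ranked first as the most likely future connection, while (a, e), with no common neighbours, is ranked last.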

