In summary, the viewpoint of random graphs indeed provides us an opportunity of improving some existing machine learning algorithms. / In this thesis, we establish three machine learning models on random graphs: Heat Diffusion Models on Random Graphs, Predictive Random Graph Ranking, and Random Graph Dependency. The heat diffusion models on random graphs lead to Graph-based Heat Diffusion Classifiers (G-HDC) and a novel ranking algorithm on Web pages called DiffusionRank. For G-HDC, a random graph is constructed on data points. The generated random graph can be considered as the representation of the underlying geometry, and the heat diffusion model on them can be considered as the approximation to the way that heat flows on a geometric structure. Experiments show that G-HDC can achieve better performance in accuracy in some benchmark datasets. For DiffusionRank, theoretically we show that it is a generalization of PageRank when the heat diffusion coefficient tends to infinity, and empirically we show that it achieves the ability of anti-manipulation. / Predictive Random Graph Ranking (PRGR) incorporates DiffusionRank. PRGR aims to solve the problem that the incomplete information about the Web structure causes inaccurate results of various ranking algorithms. The Web structure is predicted as a random graph, on which ranking algorithms are expected to be improved in accuracy. Experimental results show that the PRGR framework can improve the accuracy of the ranking algorithms such as PageRank and Common Neighbor. / Three special forms of the novel Random Graph Dependency measure on two random graphs are investigated. The first special form can improve the speed of the C4.5 algorithm, and can achieve better results on attribute selection than gamma used in Rough Set Theory. The second special form of the general random graph dependency measure generalizes the conditional entropy because it becomes equivalent to the conditional entropy when the random graphs take their special form-equivalence relations. Experiments demonstrates that the second form is an informative measure, showing its success in decision trees on small sample size problems. The third special form can help to search two parameters in G-HDC faster than the cross-validation method. / Yang, haixuan. / "August 2007." / Advisers: Irwin King; Michael R. Lyu. / Source: Dissertation Abstracts International, Volume: 69-02, Section: B, page: 1125. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2007. / Includes bibliographical references (p. 184-197). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract in English and Chinese. / School code: 1307.
Identifer | oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_344040 |
Date | January 2007 |
Contributors | Yang, Haixuan., Chinese University of Hong Kong Graduate School. Division of Computer Science and Engineering. |
Source Sets | The Chinese University of Hong Kong |
Language | English, Chinese |
Detected Language | English |
Type | Text, theses |
Format | electronic resource, microform, microfiche, 1 online resource (xiii, 197 p : ill.) |
Rights | Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/) |
Page generated in 0.0019 seconds