11 |
Model-based clustering with network covariates by combining a modified product partition model with hidden Markov random field. January 2012
The product partition model was recently extended to covariate-dependent random partitions of subjects, where the covariates are limited to properties of individual subjects. In many clustering problems in biomedical or social studies, extra clustering information is available from pairwise associations among subjects, such as the regulatory relationships between genes or the social network among people. Here we propose a model-based method for clustering with network information that combines a modified product partition model with a hidden Markov random field. Statistical inference follows the Bayesian approach, and the model is computed with Markov chain Monte Carlo algorithms. To improve the mixing of the chain, the Sequentially-Allocated Merge-Split Sampler is adapted to perform group moves, in an effort to lower the chance of being trapped in local modes. / The new method is tested on two synthetic data sets to evaluate its performance with different types of response variables, covariates and networks. The correct clustering rate is high under most of the conditions tested. We also applied the method to two real data sets. The first is a journal data set, where cross-citation information among journals is used to group journals into categories. The second combines gene expression, transcription factor binding site (motif binding) and gene regulatory network information in yeast, with the goal of finding detailed gene functional groups. Both experiments yielded meaningful results.
/ Detailed summary in vernacular field only. / Fung, Ling Hiu. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2012. / Abstracts also in Chinese.
Contents:
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 2 --- Technical Background
Chapter 2.1 --- Variable notation
Chapter 2.2 --- Two exemplary models for the response variable
Chapter 2.3 --- PPMx
Chapter 2.3.1 --- PPM - definition and its equivalence to DPM
Chapter 2.3.2 --- PPMx - extension with covariates
Chapter 2.3.3 --- Posterior inference
Chapter 2.4 --- HMRF
Chapter 2.4.1 --- Definition
Chapter 2.4.2 --- Constrained Dirichlet Process Mixture
Chapter 3 --- Model-based Clustering with Network Covariates
Chapter 3.1 --- Design of the model
Chapter 3.2 --- The Bayesian MCNC model
Chapter 3.3 --- MCMC computing
Chapter 3.4 --- Performance evaluation criteria
Chapter 4 --- Simulation study
Chapter 4.1 --- Network
Chapter 4.2 --- Covariates
Chapter 4.3 --- The Phase model (M1)
Chapter 4.4 --- The Normal model (M2)
Chapter 4.5 --- Comparing correct clustering percentage and correct co-occurrence percentage
Chapter 5 --- Real data
Chapter 5.1 --- Journal cross-citation data
Chapter 5.2 --- Gene Network of yeast data
Chapter 6 --- Conclusions
Chapter A
Chapter A.1 --- Covariates
Chapter A.1.1 --- Continuous covariates
Chapter A.1.2 --- Categorical covariates
Chapter A.1.3 --- Count covariates
Chapter A.2 --- Phase model
Chapter A.2.1 --- Prior specification
Chapter A.2.2 --- Data generation
Chapter A.2.3 --- Posterior estimation
Chapter A.3 --- Normal model
Chapter A.3.1 --- Prior specification
Chapter A.3.2 --- Data generation
Chapter A.3.3 --- Posterior estimation
Chapter A.4 --- Journal dataset
|
12 |
Hyperplane based efficient clustering and searching / Chan, Alton Kam Fai. January 2003
Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2003. / Includes bibliographical references (leaves 55-57). Also available in electronic version. Access restricted to campus users.
|
13 |
Model based and hybrid clustering of large datasets / Tantrum, Jeremy. January 2003
Thesis (Ph. D.)--University of Washington, 2003. / Vita. Includes bibliographical references (p. 93-96).
|
14 |
An Alternative Approach to Visualizing Stock Market Correlation Matrices - An Empirical study of forming portfolios that contain only small numbers of stocks using both existing and newly discovered visualization methods / Zhan, Cheng Juan. January 2014
The core of stock portfolio diversification is to pick stocks from different correlation clusters when forming portfolios, so that the chosen stocks are only weakly correlated with each other. However, since correlation matrices are high dimensional, it is close to impossible to determine correlation clusters by simply looking at a correlation matrix. It is therefore common to regard industry groups as correlation clusters. In this thesis, we used three visualization methods, namely Hierarchical Cluster Trees, Minimum Spanning Trees and neighbor-Net splits graphs, to “collapse” the high dimensional structure of correlation matrices onto two-dimensional planes, and then assigned stocks to clusters to create the correlation clusters. We then simulated sets of portfolios, each set containing 1000 portfolios, in which the stocks in each portfolio were picked from the correlation clusters suggested by each of the three visualization methods and by industry groups (another way of determining correlation clusters). The distribution of the means and variances of each set of 1000 simulated portfolios indicates how well those clusters were determined.
The examinations were conducted on two sets of financial data. The first is the 30 stocks in the Dow Jones Industrial Average, which contains a relatively small number of stocks, and the second is the ASX 200, which contains a relatively large number of stocks. We found that none of the methods studied consistently defined correlation clusters more efficiently than the others in out-of-sample testing.
The thesis does contribute to the finance literature in two ways. Firstly, it introduces the neighbor-Net method as an alternative way to visualize the underlying structure of financial data. Secondly, it used a novel “visualization
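The portfolio simulation described in this abstract can be illustrated with a minimal sketch. It is not the thesis's code: the function name `simulate_portfolios`, the toy tickers and returns, the choice of exactly one stock per cluster, and the equal weighting are all assumptions made here for illustration.

```python
import random
import statistics


def simulate_portfolios(returns, clusters, n_portfolios=1000, seed=0):
    """Draw equally weighted portfolios with one stock per cluster.

    returns:  dict mapping ticker -> list of per-period returns (equal lengths)
    clusters: list of lists of tickers, one inner list per correlation cluster
    Returns a list of (mean, variance) pairs, one per simulated portfolio.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_portfolios):
        # Assumption: pick exactly one stock at random from every cluster.
        picks = [rng.choice(cluster) for cluster in clusters]
        n_periods = len(returns[picks[0]])
        # Assumption: equal weights, so the portfolio return is a plain average.
        series = [
            sum(returns[t][i] for t in picks) / len(picks)
            for i in range(n_periods)
        ]
        results.append((statistics.mean(series), statistics.pvariance(series)))
    return results


# Hypothetical toy data: four tickers in two clusters, three periods each.
toy_returns = {
    "A": [0.01, -0.02, 0.03],
    "B": [0.02, -0.01, 0.02],
    "C": [-0.01, 0.02, 0.00],
    "D": [0.00, 0.01, -0.01],
}
toy_clusters = [["A", "B"], ["C", "D"]]
sims = simulate_portfolios(toy_returns, toy_clusters, n_portfolios=1000)
```

Comparing the spread of the resulting (mean, variance) pairs across clusterings is one way to read the abstract's evaluation: under this sketch, a clustering that separates correlated stocks well would tend to give lower portfolio variances.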
|
15 |
Cluster housing with particular reference to South Australia / Featherstone, Julia Lesley. January 1979
Thesis (M.U.R.P. 1979) from the Department of Architecture, University of Adelaide.
|
16 |
Semi-automated mapping for the reflexion method / Christl, Andreas. January 2005
Stuttgart, Univ., Diploma thesis, 2005.
|
17 |
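Optimierte Implementierung ausgewählter kollektiver Operationen unter Ausnutzung der Hardwareparallelität des InfiniBand Netzwerkes [Optimized implementation of selected collective operations exploiting the hardware parallelism of the InfiniBand network] / Franke, Maik. Höfler, Torsten. January 2007
Chemnitz, Techn. Univ., Diploma thesis, [2007].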
|
18 |
Probabilistic model-based clustering of complex data / Zhong, Shi. January 2003
Thesis (Ph. D.)--University of Texas at Austin, 2003. / Vita. Includes bibliographical references. Available also from UMI Company.
|
19 |
Three dimensional drawings of pain location in cluster headache / Fraser, Ruth Ann, Carleton University. Dissertation. Psychology. January 1992
Thesis (M.A.)--Carleton University, 1992. / Also available in electronic format on the Internet.
|
20 |
Energieoptimierung von Clustern und spinodale Entmischung in Fluiden [Energy optimization of clusters and spinodal decomposition in fluids] / Kabrede, Hendrik. January 2004
Wuppertal, Univ., Diss., 2004. / Computer file, remote access.
|