Global ETD Search

231	Learnable similarity functions and their application to record linkage and clustering Bilenko, Mikhail Yuryevich 28 August 2008 Not available / text Cluster analysis--Data processing Pattern recognition systems Machine learning
232	Mining complex databases using the EM algorithm Ordóñez, Carlos January 2000 No description available. Expectation-maximization algorithms Cluster analysis Data processing Data mining
233	HARP: a practical projected clustering algorithm for mining gene expression data Yip, Yuk-Lap, Kevin., 葉旭立. January 2003 Computer Science and Information Systems / Master / Master of Philosophy Gene expression - Data processing. Computer algorithms. Cluster analysis. Data mining.
234	Ab initio calculations of silicon clusters 黃新祥, Wong, Sun-cheung. January 1998 Physics / Master / Master of Philosophy Electronic structure. Cluster analysis.
235	Mixtures of Skew-t Factor Analyzers Murray, Paula 11 1900 Model-based clustering allows for the identification of subgroups in a data set through the use of finite mixture models. When applied to high-dimensional microarray data, we can discover groups of genes characterized by their gene expression profiles. In this thesis, a mixture of skew-t factor analyzers is introduced for the clustering of high-dimensional data. Notably, we make use of a version of the skew-t distribution which has not previously appeared in mixture-modelling literature. Allowing a constraint on the factor loading matrix leads to two mixtures of skew-t factor analyzers models. These models are implemented using the alternating expectation-conditional maximization algorithm for parameter estimation with an Aitken's acceleration stopping criterion used to determine convergence. The Bayesian information criterion is used for model selection and the performance of each model is assessed using the adjusted Rand index. The models are applied to both real and simulated data, obtaining clustering results which are equivalent or superior to those of established clustering methods.
236	Assessing the impact of human activities on British Columbia's estuaries Robb, Carolyn Kathleen 01 August 2013 The world's marine and coastal ecosystems are under threat and single-sector management efforts have failed to address those threats. Scientific consensus suggests that management should evolve to focus on ecosystems and their human, ecological, and physical components. Estuaries are recognized globally as one of the world's most productive and most threatened ecosystems and many estuarine areas in British Columbia (BC) have been lost or degraded. To help prioritize activities and areas for management efforts at a regional scale, spatial information on human activities that adversely affect BC estuaries was compiled. Using statistical analyses, estuaries were assigned to groups facing related threats that could benefit from similar management. The relationships between estuaries, their biological importance, and their protected status were examined. This research is timely as it will inform ongoing marine planning efforts as well as land acquisition and stewardship activities undertaken by organizations such as the Pacific Estuary Conservation Program. British Columbia Cluster analysis Conservation Estuaries GIS Human activities
237	Automatic text summarization using lexical chains : algorithms and experiments Kolla, Maheedhar, University of Lethbridge. Faculty of Arts and Science January 2004 Summarization is a complex task that requires understanding of the document content to determine the importance of the text. Lexical cohesion is a method to identify connected portions of the text based on the relations between the words in the text. Lexical cohesive relations can be represented using lexical chains. Lexical chains are sequences of semantically related words spread over the entire text. Lexical chains are used in a variety of Natural Language Processing (NLP) and Information Retrieval (IR) applications. In this thesis, we propose a lexical chaining method that includes the glossary relations in the chaining process. These relations enable us to identify topically related concepts, for instance dormitory and student, and thereby enhance the identification of cohesive ties in the text. We then present methods that use the lexical chains to generate summaries by extracting sentences from the document(s). Headlines are generated by filtering out the portions of the extracted sentences that do not contribute to the meaning of the sentence. The generated headlines can be used in real-world applications to skim through the document collections in a digital library. Multi-document summarization is gaining demand with the explosive growth of online news sources. It requires identification of the several themes present in the collection to attain good compression and avoid redundancy. In this thesis, we propose methods to group the portions of the texts of a document collection into meaningful clusters. Clustering enables us to extract the various themes of the document collection. Sentences from clusters can then be extracted to generate a summary for the multi-document collection. Clusters can also be used to generate summaries with respect to a given query.
We designed a system to compute lexical chains for the given text and use them to extract the salient portions of the document. Some specific tasks considered are: headline generation, multi-document summarization, and query-based summarization. Our experimental evaluation shows that efficient summaries can be extracted for the above tasks. / viii, 80 leaves : ill. ; 29 cm. Dissertations, Academic Automatic abstracting Cluster analysis -- Computer programs Computational linguistics
238	Aggregation in large scale quadratic programming Foster, David Martin 08 1900 No description available. Cluster analysis Data processing Linear programming Quadratic programming
239	Factors influencing consumers' life insurance purchasing decisions in China Wang, Huihui 22 September 2010 The Chinese insurance industry has been growing substantially, and this provides a motivation to examine the insurance market in China. This study used survey data and a probit model to identify key determinants of Chinese consumers’ ownership of life insurance. The results revealed that several groups of variables influence Chinese consumers’ life insurance purchases, including knowledge and trust, consumer profile and investment preferences, importance of product attributes, and socio-demographics. This study also applied factor analysis to investigate which factors matter to Chinese consumers regarding life insurance. The factor analysis identified four factors: importance of product attributes, consumers’ financial strength, consumers’ attitude and trust toward the life insurance industry, and consumer attributes. Lastly, to better understand Chinese consumers regarding life insurance, consumers were segmented into three main groups using cluster analysis. Each cluster shows distinct differences in purchasing criteria and socio-demographic characteristics. life insurance survey China consumer probit factor analysis cluster analysis
240	Tätortsklassificering utifrån servicebredd och servicegrad : En klusteranalys av Sveriges tätorter [Classification of built-up areas by service width and service degree: a cluster analysis of Sweden's built-up areas] Andersson, Stina-Kajsa January 2014 Statistics Sweden is an administrative agency that delimits built-up areas and produces statistics regarding them. The statistics cover the area of the built-up areas, their population, the number of gainfully employed people working in them, and their buildings. Statistics Sweden now wishes to extend these statistics with a measure of how well developed services are in each built-up area. This study is a contribution to this statistical improvement work. Its purpose is to classify the Swedish built-up areas, using geographical information systems and cluster analysis, according to 1) service width and 2) service degree. A built-up area has a high service width if it has many different service functions, such as pharmacies, schools and grocery stores. It has a high service degree if it has many service functions per 1000 inhabitants. The result consists of two different “urban hierarchies”, one in which one can identify the level of service width of each built-up area and one in which one can position each built-up area according to its service degree. This study shows that built-up areas with a high service width also have many inhabitants. In contrast, this is not the case for built-up areas with a high service degree: these have relatively few inhabitants. The study shows that built-up areas with a high service degree have a higher ratio of people employed in the locality to residents, which indicates that these built-up areas are “commuting localities”, that is, built-up areas where people work but do not necessarily live. The results from the two separate modes of classification also show that service width and service degree are not positively correlated.
Built-up areas with a high service degree are thus not the same as those scoring high on service width; if anything, the relationship is rather the opposite. classification cluster analysis GIS settlement hierarchy klassificering klusteranalys GIS tätortshierarki
