Global ETD Search

141	HARP: a practical projected clustering algorithm for mining gene expression data Yip, Yuk-Lap, Kevin., 葉旭立. January 2003 Computer Science and Information Systems / Master / Master of Philosophy Gene expression - Data processing. Computer algorithms. Cluster analysis. Data mining.
142	Ab initio calculations of silicon clusters 黃新祥, Wong, Sun-cheung. January 1998 Physics / Master / Master of Philosophy Electronic structure. Cluster analysis.
143	Assessing the impact of human activities on British Columbia's estuaries Robb, Carolyn Kathleen 01 August 2013 The world's marine and coastal ecosystems are under threat and single-sector management efforts have failed to address those threats. Scientific consensus suggests that management should evolve to focus on ecosystems and their human, ecological, and physical components. Estuaries are recognized globally as one of the world's most productive and most threatened ecosystems and many estuarine areas in British Columbia (BC) have been lost or degraded. To help prioritize activities and areas for management efforts at a regional scale, spatial information on human activities that adversely affect BC estuaries was compiled. Using statistical analyses, estuaries were assigned to groups facing related threats that could benefit from similar management. The relationships between estuaries, their biological importance, and their protected status were examined. This research is timely as it will inform ongoing marine planning efforts as well as land acquisition and stewardship activities undertaken by organizations such as the Pacific Estuary Conservation Program. British Columbia Cluster analysis Conservation Estuaries GIS Human activities
144	Automatic text summarization using lexical chains : algorithms and experiments Kolla, Maheedhar, University of Lethbridge. Faculty of Arts and Science January 2004 Summarization is a complex task that requires understanding of the document content to determine the importance of the text. Lexical cohesion is a method to identify connected portions of the text based on the relations between the words in the text. Lexical cohesive relations can be represented using lexical chains. Lexical chains are sequences of semantically related words spread over the entire text. Lexical chains are used in a variety of Natural Language Processing (NLP) and Information Retrieval (IR) applications. In this thesis, we propose a lexical chaining method that includes the glossary relations in the chaining process. These relations enable us to identify topically related concepts, for instance dormitory and student, and thereby enhance the identification of cohesive ties in the text. We then present methods that use the lexical chains to generate summaries by extracting sentences from the document(s). Headlines are generated by filtering out the portions of the extracted sentences that do not contribute to the meaning of the sentence. The generated headlines can be used in real-world applications to skim through the document collections in a digital library. Multi-document summarization is gaining demand with the explosive growth of online news sources. It requires identification of the several themes present in the collection to attain good compression and avoid redundancy. In this thesis, we propose methods to group the portions of the texts of a document collection into meaningful clusters. Clustering enables us to extract the various themes of the document collection. Sentences from clusters can then be extracted to generate a summary for the multi-document collection. Clusters can also be used to generate summaries with respect to a given query. 
We designed a system to compute lexical chains for the given text and use them to extract the salient portions of the document. Some specific tasks considered are: headline generation, multi-document summarization, and query-based summarization. Our experimental evaluation shows that efficient summaries can be extracted for the above tasks. / viii, 80 leaves : ill. ; 29 cm. Dissertations, Academic Automatic abstracting Cluster analysis -- Computer programs Computational linguistics
145	Factors influencing consumers' life insurance purchasing decisions in China Wang, Huihui 22 September 2010 The Chinese insurance industry has been growing substantially, and this provides a motivation to examine the insurance market in China. This study used survey data to identify key determinants related to Chinese consumers’ ownership of life insurance, by using a probit model. The results revealed that several groups of variables influence Chinese consumers’ life insurance purchases, including knowledge and trust, consumer profile and investment preferences, importance of product attributes, and socio-demographics. Also, this study applied factor analysis to investigate factors that are important for Chinese consumers regarding life insurance. Factor analysis results indicated that four factors are identified including importance of product attributes, consumers’ financial strength, consumers’ attitude and trust toward the life insurance industry, and consumer attributes. Lastly, to better understand Chinese consumers regarding life insurance, consumers were segmented into three main groups through applying cluster analysis. Each cluster shows distinct differences in purchasing criteria and socio-demographic characteristics. life insurance survey China consumer probit factor analysis cluster analysis
146	Tätortsklassificering utifrån servicebredd och servicegrad : En klusteranalys av Sveriges tätorter Andersson, Stina-Kajsa January 2014 Statistics Sweden is an administrative agency that delimits built-up areas and produces statistics regarding them. The statistics provide information about the area of the built-up areas, their population, the number of gainfully employed people working in the built-up areas, and the number of buildings. Now Statistics Sweden wishes to extend such statistics by producing a measure of how well developed the service is in each built-up area. This study is a contribution to this statistical improvement work and the purpose is to – by employing geographical information systems and cluster analysis – classify the Swedish built-up areas according to 1) service width and 2) service degree. A particular built-up area has a high service width if it has many different service functions, such as pharmacies, schools and grocery stores. It has a high service degree if it has many service functions per 1000 inhabitants. The result consists of two different “urban hierarchies”, one in which one can identify the level of service width of each built-up area and one in which one can position each built-up area according to its service degree. This study shows that built-up areas with a high service width also have many inhabitants. In contrast, this is not the case for built-up areas with a high service degree: built-up areas with high service degree have relatively few inhabitants. The study shows that built-up areas with high service degree have a higher ratio of people employed in the locality to residents, which indicates that these built-up areas are “commuting localities” – built-up areas where people work but do not necessarily live. The results from the two separate modes of classification also show that the service width and service degree do not display a positive correlation. 
Built-up areas with high service degree are thus not the same built-up areas as those scoring high on service width; if anything, the relationship is rather the opposite. classification cluster analysis GIS settlement hierarchy klassificering klusteranalys GIS tätortshierarki
148	Remaining within-cluster heterogeneity: a meta-analysis of the "dark side" of clustering methods Franke, Nikolaus, Reisinger, Heribert, Hoppe, Daniel 04 1900 In a meta-analysis of articles employing clustering methods, we find that little attention is paid to remaining within-cluster heterogeneity and that average values are relatively high. We suggest addressing this potentially problematic "dark side" of cluster analysis by providing two coefficients as standard information in any cluster analysis findings: a goodness-of-fit measure and a measure which relates explained variation of analysed empirical data to explained variation of simulated random data. The second coefficient is referred to as the Index of Clustering Appropriateness (ICA). Finally, we develop a classification scheme depicting acceptable levels of remaining within-cluster heterogeneity. (authors' abstract)
149	Evaluating Clusterings by Estimating Clarity Whissell, John January 2012 In this thesis I examine clustering evaluation, with a subfocus on text clusterings specifically. The principal work of this thesis is the development, analysis, and testing of a new internal clustering quality measure called informativeness. I begin by reviewing clustering in general. I then review current clustering quality measures, accompanying this with an in-depth discussion of many of the important properties one needs to understand about such measures. This is followed by extensive document clustering experiments that show problems with standard clustering evaluation practices. I then develop informativeness, my new internal clustering quality measure for estimating the clarity of clusterings. I show that informativeness, which uses classification accuracy as a proxy for human assessment of clusterings, is both theoretically sensible and works empirically. I present a generalization of informativeness that leverages external clustering quality measures. I also show its use in a realistic application: email spam filtering. I show that informativeness can be used to select clusterings which lead to superior spam filters when few true labels are available. I conclude this thesis with a discussion of clustering evaluation in general, informativeness, and the directions I believe clustering evaluation research should take in the future. clustering evaluating clustering cluster validation cluster analysis Computer Science
150	New algorithms for EST clustering. Ptitsyn, Andrey January 2000 The expressed sequence tag database is a rich and fast-growing source of data for gene expression analysis and drug discovery. Clustering of raw EST data is a necessary step for further analysis and one of the most challenging problems of modern computational biology. Cluster analysis Data processing Cluster analysis Computer programs Algorithms
