  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
291

Sustainable quality versus quantity metropolitan area : an exploratory analysis

Zhang, Guowei, master of science in community and regional planning 07 July 2011 (has links)
In recent years, there has been a growing interest in understanding how and why cities grow and how to make this growth more economically and environmentally sustainable. This study addresses two questions. The first is how to trace growth patterns among U.S. metropolitan areas after 2000 based on the two types of growth strategies. The second is how different growth patterns affect environmental outcomes and income inequality. A quantitative study is used to measure the two faces of urban growth processes in U.S. metropolitan areas. Cluster analysis yields five groups. The study then examines how these groups might affect sustainability performance. The thesis closes by summarizing the empirical findings and giving recommendations for future research.
292

"Clustering Categorical Response" Application to Lung Cancer Problems in Living Scales

Guo, Ling 22 April 2008 (has links)
The study aims to estimate the ability of different grouping techniques to cluster categorical responses. We ask how well they work and whether they really find clusters when clusters exist. We use the Cancer Problems in Living Scales (CPILS) from the ACS as our categorical data variables and lung cancer survivors as our study group. Five methods of cluster analysis are examined for their accuracy in clustering on both the real CPILS dataset and simulated data. The methods include hierarchical cluster analysis (Ward's method), model-based clustering of raw data, model-based clustering of the factor scores from a maximum likelihood factor analysis, model-based clustering of the predicted scores from independent factor analysis, and latent class clustering. The results from each of the five methods are then compared to actual classifications. Model-based clustering on raw data performs worse than the other methods, and latent class clustering is the most appropriate method for the specific categorical data examined. These results are discussed and recommendations are made regarding future directions for cluster analysis research.
293

Virtualios nuotolinio mokymo aplinkos duomenų gavyba / Data Mining in Virtual Learning Environment

Lapukaitė, Daiva 27 August 2009 (has links)
The databases of virtual learning environments store a large quantity of information about students and their activity in the environment. Data mining is used to make this information easier to analyse. The aim of the work is to build a system for the preprocessing and data mining of data obtained from the virtual learning environment Moodle. After preprocessing, the historical learning data are analysed with the data mining algorithms of the StatSoft STATISTICA 7 software package to study learners' learning intensity. k-means cluster analysis was applied as an example of data mining of learning data. Based on an evaluation of the results, recommendations for further data analysis of learning activities are given.
294

Perfectionism, Self-Injurious Behaviour, and Functions of Anorexia Nervosa

Csuzdi, Nicklaus 13 December 2011 (has links)
The following thesis outlines a study assessing the levels of perfectionism, self-injurious behaviour, and functions of anorexia nervosa (AN) through use of a cross-sectional online survey, among English speaking participants 15 years or older, self-reporting a current, previous, or suspected diagnosis of AN. Three distinct clusters were found using self-report measures from individuals with a current or suspected diagnosis, with each cluster corresponding to a unique theoretical understanding of AN. The three clusters can be distinguished by high asceticism, appearance, and avoidance of fertility/sexuality functions for AN respectively. Two distinct clusters were found for participants with a previous diagnosis of AN. These clusters can be differentiated by lingering sentiments held for the condition, as the first cluster viewed AN negatively, and the second cluster continued to see some benefits of the condition. Possible implications for understanding etiology, mechanisms, and treatment of AN are discussed. / Canadian Institute of Health Research
295

An Evaluation of Biosecurity Practices on Southern Ontario Swine Farms, and its Application to Risk-Based Surveillance Approaches

Bottoms, Katherine 11 May 2012 (has links)
This thesis is an investigation of external biosecurity and its application to risk-based surveillance approaches in the southern Ontario swine industry. In each of two datasets, the best number of groups to describe biosecurity practices was identified, resulting in two groups with high biosecurity standards and one group with low biosecurity standards. Multinomial logistic regression models identified herd density, herd size, and herd type among the significant predictors of biosecurity group membership. A map of southern Ontario that can be used as a tool in the risk-based surveillance of contagious swine diseases was developed using geographic information about swine density and the distribution of herds belonging to the high biosecurity groups. Finally, multiple correspondence analysis examined how individual biosecurity practices form strategies on sow farms. Some practices that are generally considered high-risk were closely associated with other practices that mitigate the risk, suggesting that evaluation of the overall strategy is essential for complete assessment of biosecurity. / The Ontario Ministry of Agriculture, Food and Rural Affairs (under the Emergency Management research theme); Ontario Pork; the Ontario Pork Industry Council's Swine Health Advisory Board; the Natural Sciences and Engineering Research Council of Canada.
296

Model-based Learning: t-Families, Variable Selection, and Parameter Estimation

Andrews, Jeffrey Lambert 27 August 2012 (has links)
The phrase model-based learning describes the use of mixture models in machine learning problems. This thesis focuses on a number of issues surrounding the use of mixture models in statistical learning tasks, including clustering, classification, discriminant analysis, variable selection, and parameter estimation. After motivating the importance of statistical learning via mixture models, five papers are presented. For ease of reading, the papers are organized into three parts: mixtures of multivariate t-families, variable selection, and parameter estimation. / Natural Sciences and Engineering Research Council of Canada through a doctoral postgraduate scholarship.
297

Automatic text summarization in digital libraries

Mlynarski, Angela, University of Lethbridge. Faculty of Arts and Science January 2006 (has links)
A digital library is a collection of services and information objects for storing, accessing, and retrieving digital objects. Automatic text summarization presents salient information in a condensed form suitable for user needs. This thesis amalgamates digital libraries and automatic text summarization by extending the Greenstone Digital Library software suite to include the University of Lethbridge Summarizer. The tool generates summaries, nouns, and noun phrases for use as metadata for searching and browsing digital collections. Digital collections of newspapers, PDFs, and eBooks were created with summary metadata. PDF documents were processed the fastest at 1.8 MB/hr, followed by the newspapers at 1.3 MB/hr, with eBooks being the slowest at 0.9 MB/hr. Qualitative analysis of four genres (newspaper, M.Sc. thesis, novel, and poetry) revealed that narrative newspapers were the most suitable for automatically generated summarization. The other genres suffered from incoherence and information loss. Overall, summaries for digital collections are suitable when used with newspaper documents and unsuitable for other genres. / xiii, 142 leaves ; 28 cm.
298

Market segmentation and factors affecting stock returns on the JSE.

Chimanga, Artwell S. January 2008 (has links)
This study examines the relationship between stock returns and market segmentation. Monthly returns of stocks listed on the JSE from 1997 to 2007 are analysed, mainly using factor analysis and cluster analysis techniques. Evidence supporting the use of multi-index models in explaining the return generating process on the JSE is found. The results provide additional support for Van Rensburg's (1997) hypothesis on market segmentation on the JSE.
299

Nonnegative matrix factorization for clustering

Kuang, Da 27 August 2014 (has links)
This dissertation shows that nonnegative matrix factorization (NMF) can be extended to a general and efficient clustering method. Clustering is one of the fundamental tasks in machine learning. It is useful for unsupervised knowledge discovery in a variety of applications such as text mining and genomic analysis. NMF is a dimension reduction method that approximates a nonnegative matrix by the product of two lower-rank nonnegative matrices, and has shown great promise as a clustering method when a data set is represented as a nonnegative data matrix. However, challenges in the widespread use of NMF as a clustering method lie in its correctness and efficiency: First, we need to know why and when NMF can detect the true clusters and be guaranteed to deliver good clustering quality; second, existing algorithms for computing NMF are expensive and often take longer than other clustering methods. We show that the original NMF can be improved in both respects in the context of clustering. Our new NMF-based clustering methods can achieve better clustering quality and run orders of magnitude faster than the original NMF and other clustering methods. Like other clustering methods, NMF places an implicit assumption on the cluster structure. Thus, the success of NMF as a clustering method depends on whether the representation of data in a vector space satisfies that assumption. Our approach to extending the original NMF to a general clustering method is to switch from the vector space representation of data points to a graph representation. The new formulation, called Symmetric NMF, takes a pairwise similarity matrix as an input and can be viewed as a graph clustering method. We evaluate this method on document clustering and image segmentation problems and find that it achieves better clustering accuracy. In addition, for the original NMF, it is difficult but important to choose the right number of clusters.
We show that the widely used consensus NMF in genomic analysis for choosing the number of clusters has critical flaws and can produce misleading results. We propose a variation of the prediction strength measure arising from statistical inference to evaluate the stability of clusters and select the right number of clusters. Our measure shows promising performance in artificial simulation experiments. Large-scale applications bring substantial efficiency challenges to existing algorithms for computing NMF. An important example is topic modeling, where users want to uncover the major themes in a large text collection. Our strategy for accelerating NMF-based clustering is to design algorithms that better suit the computer architecture and exploit the computing power of parallel platforms such as graphics processing units (GPUs). A key observation is that applying rank-2 NMF, which partitions a data set into two clusters, in a recursive manner is much faster than applying the original NMF to obtain a flat clustering. We take advantage of a special property of rank-2 NMF and design an algorithm that runs faster than existing algorithms due to continuous memory access. Combined with a criterion to stop the recursion, our hierarchical clustering algorithm runs significantly faster and achieves even better clustering quality than existing methods. Another bottleneck of NMF algorithms, which is also a common bottleneck in many other machine learning applications, is multiplying a large sparse data matrix by a tall-and-skinny dense matrix. We use GPUs to accelerate this routine for sparse matrices with an irregular sparsity structure. Overall, our algorithm shows significant improvement over popular topic modeling methods such as latent Dirichlet allocation, and runs more than 100 times faster on data sets with millions of documents.
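As a rough illustration of the baseline idea in the abstract above (approximating a nonnegative matrix X by a product W·H and reading cluster labels off H), here is a minimal sketch in plain Python. It uses the standard multiplicative update rules for the Frobenius objective, not the Symmetric NMF or recursive rank-2 algorithms the dissertation proposes. The function name `nmf_cluster`, the data-column initialization, and the toy term-by-document matrix are all assumptions made for this example.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf_cluster(X, k, iters=200, eps=1e-9):
    """Factor X (features x points) as W (features x k) times H (k x points)
    with multiplicative updates, then label each point by its largest H entry.

    Assumption for this sketch: W is seeded from k evenly spaced columns of X
    (plus a small positive offset) so the run is deterministic.
    """
    m, n = len(X), len(X[0])
    seeds = [c * n // k for c in range(k)]
    W = [[X[i][j] + 0.1 for j in seeds] for i in range(m)]
    H = [[1.0] * n for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    labels = [max(range(k), key=lambda i: H[i][j]) for j in range(n)]
    return labels, W, H


# Toy term-by-document counts: documents 0-2 use terms 0-1, documents 3-5 use terms 2-3.
X = [
    [3, 2, 4, 0, 0, 0],
    [2, 3, 3, 0, 0, 0],
    [0, 0, 0, 4, 3, 2],
    [0, 0, 0, 2, 4, 3],
]
labels, W, H = nmf_cluster(X, k=2)
print(labels)  # documents in the same block should share a label
```

Reading labels from the largest entry in each column of H is the simplest convention; it works here because the toy data is block structured, which is the kind of implicit cluster assumption the abstract says NMF relies on.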
300

Clusteranalyse der Gemeinden in der Kernregion Mitteldeutschland (Cluster Analysis of the Municipalities in the Core Region of Central Germany)

Geyler, Stefan, Warner, Barbara, Brandl, Anja, Kuntze, Martina 19 September 2014 (has links) (PDF)
This volume presents a typology of the municipalities in the core region of Central Germany (Kernregion Mitteldeutschland), derived by means of a cluster analysis. This multivariate method integrates aspects of spatial structure, demographic and economic development, technical and transport infrastructure, and public finances. The 16 indicators, selected from a larger data set, focus on important development trajectories, the current situation, and the framework conditions of the individual municipalities. The aim is to group municipalities with similar characteristics and, on that basis, to identify reference municipalities with exemplary starting conditions and problems. In the further course of the research, these reference municipalities will be used to analyse conflicts of objectives in planning and local politics, and to develop instruments for reducing the use of land for housing, commerce, and transport through stronger inter-municipal cooperation.
