131

Exploring the weather impact on bike sharing usage through a clustering analysis

Quach, Jessica January 2020
Today bike sharing systems exist in many cities around the globe, following rapid growth in popularity over recent decades. They are attractive to cities and users who want to promote healthier lifestyles, reduce air pollution and gas emissions, and improve traffic. One major challenge for docked bike sharing systems is redistributing bikes and balancing dock stations. Existing studies propose models for forecasting bike usage, strategies for rebalancing bike distribution, and methods for establishing or identifying patterns. Some of these studies propose extending their approach by including weather data; others had limitations and did not include it. This study aims to build on these proposals and explore how, and to what extent, weather impacts bike usage. Bike usage data and weather data are gathered for the city of Washington, D.C., and analyzed using a clustering algorithm called k-means. K-means is suitable for discovering patterns within the data by grouping (clustering) similar instances, an approach the literature review also supported. In this project, the k-means algorithm identified three clusters that correspond to bike usage depending on weather. The results show a noticeable weather impact on bike usage between clusters. Of the five weather variables, temperature carried the most weight, followed by precipitation. The results also supported the use of k-means as appropriate for this type of study.
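The clustering step described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the data are synthetic and invented for the example, only two weather variables (temperature and precipitation) are used rather than the five the study mentions, and the `kmeans` helper is a plain NumPy version of Lloyd's algorithm written for this sketch.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each row to its nearest centroid (squared Euclidean distance).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its rows; keep it if its cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Synthetic, made-up daily records: temperature (deg C), precipitation (mm), trips.
rng = np.random.default_rng(42)
cold_wet = np.column_stack([rng.normal(2, 2, 50), rng.normal(8, 2, 50), rng.normal(2000, 200, 50)])
mild = np.column_stack([rng.normal(15, 2, 50), rng.normal(3, 1, 50), rng.normal(6000, 400, 50)])
warm_dry = np.column_stack([rng.normal(27, 2, 50), rng.normal(0.5, 0.3, 50), rng.normal(9000, 500, 50)])
X = np.vstack([cold_wet, mild, warm_dry])

# Standardize columns so no variable dominates the distance through its units.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

centroids, labels = kmeans(Z, k=3)

# Report each non-empty cluster's mean in the original units.
for j in range(3):
    members = X[labels == j]
    if len(members):
        print(j, len(members), members.mean(axis=0).round(1))
```

Standardizing before clustering is a design choice of this sketch; the abstract does not say how the study scaled its variables.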
132

Vocation Clustering for Heavy-Duty Vehicles

Kobold, Daniel, Jr. 12 1900
Indiana University-Purdue University Indianapolis (IUPUI) / The identification of the vocation of an unknown heavy-duty vehicle is valuable to parts manufacturers who may not otherwise have access to this information on a consistent basis. This study proposes a methodology for vocation identification that is based on clustering techniques. Two clustering algorithms are considered: K-Means and Expectation Maximization. These algorithms are first used to construct the operating profile of each vocation from a set of vehicles with known vocations. The vocation of an unknown vehicle is then determined using different assignment methods. These methods fall under two main categories: one-versus-all and one-versus-one. The one-versus-all approach compares an unknown vehicle to all potential vocations. The one-versus-one approach compares the unknown vehicle to two vocations at a time in a tournament fashion. Two types of tournaments are investigated: round-robin and bracket. The accuracy and efficiency of each method are evaluated using the NREL FleetDNA dataset. The study revealed that some of the vocations may have unique operating profiles and are therefore easily distinguishable from others. Other vocations, however, can have confounding profiles. This indicates that different vocations may benefit from profiles with varying numbers of clusters. Determining the optimal number of clusters for each vocation can not only improve the assignment accuracy, but also enhance the computational efficiency of the application. The optimal number of clusters for each vocation is determined using both static and dynamic techniques. Static approaches refer to methods that are completed prior to training and may require multiple iterations. Dynamic techniques involve clusters being split or removed during training. The results show that the accuracy of dynamic techniques is comparable to that of static approaches while benefiting from a reduced computational time.
133

Exploring Equity and Resilience of Transportation Network through Modeling Travel Behavior: A Study of OKI Region

Hu, Yajie 09 July 2019
No description available.
134

A Concave Pairwise Fusion Approach to Clustering of Multi-Response Regression and Its Robust Extensions

Chen, Chen, 0000-0003-1175-3027 January 2022
Ma and Huang (2017) combined solution-path convex clustering with concave penalties to reduce clustering bias. Their method was introduced in the setting of single-response regression to handle heterogeneity. Such heterogeneity may come from either the regression intercepts or the regression slopes. The procedure, realized by the alternating direction method of multipliers (ADMM) algorithm, can simultaneously identify the grouping structure of observations and estimate regression coefficients. In the first part of our work, we extend this procedure to multi-response regression. We propose models for cases with heterogeneity in either the regression intercepts or the regression slopes. We combine the existing machinery of the ADMM algorithm with group-wise concave penalties to find solutions for the model. Our work improves model performance in both clustering accuracy and estimation accuracy. We also demonstrate the necessity of such an extension by showing that using information in multi-dimensional space can greatly improve performance. In the second part, we introduce robust solutions to our proposed work. We introduce two approaches to handle outliers or long-tailed distributions. The first is to replace the squared loss with a robust loss, such as the absolute loss or the Huber loss. The second is to characterize and remove the effects of outliers through a mean-shift vector. We demonstrate that these robust solutions outperform the squared-loss-based method when outliers are present or the underlying distribution is long-tailed. / Statistics
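For reference, the three losses named in this abstract are usually written as below. This uses the standard textbook forms with a residual r and a threshold δ > 0; both symbols are assumptions introduced here, and the abstract does not state which parametrization or threshold the thesis uses.

```latex
\ell_{\mathrm{sq}}(r) = \tfrac{1}{2} r^{2},
\qquad
\ell_{\mathrm{abs}}(r) = |r|,
\qquad
\ell_{\delta}(r) =
\begin{cases}
\tfrac{1}{2} r^{2}, & |r| \le \delta, \\[2pt]
\delta \left( |r| - \tfrac{1}{2}\delta \right), & |r| > \delta .
\end{cases}
```

The Huber loss is quadratic for small residuals and linear for large ones, so a single outlier contributes to the objective at most linearly rather than quadratically.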
135

Differentiation between "Bomb" and Ordinary U.S. East Coast Cyclogenesis using Principal Component Analysis and K-means Cluster Analysis

Thomas, Evan Edward 12 May 2012
The purpose of this research is to identify whether synoptic patterns and variables differ with statistical significance between bomb and ordinary cyclogenesis along the United States East Coast track. The two were differentiated using principal component analysis, a K-means cluster analysis, a subjective composite analysis, and permutation tests. The principal component analysis determined that there were three leading modes of variability within the bomb and ordinary composites. The K-means cluster analysis was used to cluster these leading patterns of variability into three distinct clusters for the bomb and ordinary cyclones. The subjective composite analysis, created by averaging all the variables from each cyclone in each cluster, identified several synoptic variables and patterns to be objectively compared through permutation tests. The permutation tests revealed that synoptic variables and patterns associated with bomb cyclogenesis differ with statistical significance from those of ordinary cyclogenesis.
136

Quantifying Trust in Deep Learning Ultrasound Models by Investigating Hardware and Operator Variance

Zhu, Calvin January 2021
Ultrasound (US) is the most widely used medical imaging modality due to its low cost, portability, real-time imaging ability and use of non-ionizing radiation. However, unlike other imaging modalities such as CT or MRI, it is heavily operator dependent, requiring trained expertise to leverage these benefits. Recently there has been an explosion of interest in artificial intelligence (AI) across the medical community, and many are turning to deep learning (DL) models to assist in diagnosis. However, deep learning models do not perform as well when training data is not fully representative of the problem. Because of this difference between training and deployment, model performance suffers, which can lead to misdiagnosis. This issue is known as dataset shift. Two aims to address dataset shift were proposed. The first was to quantify how US operator skill and hardware affect acquired images. The second was to use this skill quantification method to screen and match data to deep learning models to improve performance. A BLUE phantom from CAE Healthcare (Sarasota, FL) with various mock lesions was scanned by three operators using three different US systems (Siemens S3000, Clarius L15, and Ultrasonix SonixTouch), producing 39013 images. DL models were trained on a specific set to classify the presence of a simulated tumour and tested with data from differing sets. The Xception, VGG19, and ResNet50 architectures were used to test the effects with varying frameworks. K-Means clustering was used to separate images generated by operator and hardware into clusters. This clustering algorithm was then used to screen incoming images during deployment to best match input to an appropriate DL model trained specifically to classify that type of operator or hardware. Results showed a noticeable difference when models were given data from differing datasets, with the largest accuracy drop being from 81.26% to 31.26%.
Overall, operator differences more significantly affected DL model performance. Clustering models had much higher success separating hardware data compared to operator data. The proposed method reflects this result with a much higher accuracy across the hardware test set compared to the operator data. / Thesis / Master of Applied Science (MASc)
137

JÄMFÖRELSE MELLAN OBJEKTORIENTERAD OCH DATAORIENTERAD DESIGN AV ELKUNDSDATA / COMPARISON BETWEEN OBJECT-ORIENTED AND DATA-ORIENTED DESIGN OF ELECTRICITY CUSTOMER DATA

Ljung, Andreas January 2023
The purpose of the study is to investigate whether performance advantages can be gained by storing data for two web applications in a data-oriented way, as opposed to the more classical object-oriented way. The underlying motivation is the finding that a data-oriented programming mindset has yielded performance benefits for data handling in the video game industry. To carry out the study, two web applications are created that store fictitious data on customers' electricity consumption. In the next step, the data is clustered with a k-means clustering algorithm, and the execution time for this is measured and reported. Data sets of various sizes were generated in the study, and it can be shown that the data-oriented design of the data offers advantages over the object-oriented design in terms of execution time. For future work, it may be interesting to look at even larger data sets and possibly use more dimensions, to see whether this could create even greater advantages for a data-oriented design over an object-oriented design for web application data.
138

Initialization of the k-means algorithm : A comparison of three methods

Jorstedt, Simon January 2023
k-means is a simple and flexible clustering algorithm that has remained in common use for more than 50 years. In this thesis, we discuss the algorithm in general, its advantages and weaknesses, and how its ability to locate clusters can be enhanced with a suitable initialization method. We formulate appropriate requirements for the (batched) UnifRandom, k-means++ and Kaufman initialization methods and compare their performance on real and generated data through simulations. We find that all three methods (followed by the k-means procedure) are able to accurately locate at least up to nine well-separated clusters, but the appropriately batched UnifRandom and the Kaufman methods are both significantly more computationally expensive than the k-means++ method, even for K = 5 clusters in a dataset of N = 1000 points.
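As an illustration of one of the three initialization methods compared in this abstract, k-means++ seeding can be sketched as below. This is a minimal NumPy version of the standard scheme (first centre uniformly at random, each later centre drawn with probability proportional to its squared distance from the nearest centre chosen so far); the function name `kmeans_pp_init` and the generated data are assumptions for the example, and the thesis's own implementation and batching are not reproduced.

```python
import numpy as np

def kmeans_pp_init(X, k, seed=0):
    """k-means++ seeding (illustrative); assumes X holds at least k distinct rows."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # First centre: one data point chosen uniformly at random.
    first = rng.integers(n)
    centres = [X[first]]
    # Squared distance from every point to its nearest chosen centre so far.
    d2 = ((X - X[first]) ** 2).sum(axis=1)
    for _ in range(1, k):
        # Sample the next centre with probability proportional to d2.
        idx = rng.choice(n, p=d2 / d2.sum())
        centres.append(X[idx])
        d2 = np.minimum(d2, ((X - X[idx]) ** 2).sum(axis=1))
    return np.array(centres)

# Generated data matching the sizes quoted in the abstract: N = 1000, K = 5.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
centres = kmeans_pp_init(X, k=5)
print(centres.shape)
```

The returned centres would then be passed to the usual k-means iterations as starting centroids.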
139

Heuristic Clustering Methods for Solving Vehicle Routing Problems

Nordqvist, Georgios, Forsberg, Erik January 2023
Vehicle Routing Problems are optimization problems centered around determining optimal travel routes for a fleet of vehicles to visit a set of nodes. Optimality is evaluated with regard to some desired quality of the solution, such as minimal time or minimal cost. There are many established solution methods, which makes it meaningful to compare their performance. This thesis aims to investigate how the performance of various solution methods is affected by varying certain problem parameters. Problem characteristics such as the number of customers, vehicle capacity, and customer demand are investigated. The aim was approached by dividing the problem into two subproblems: distributing the nodes into suitable clusters, and finding the shortest route within each cluster. Results were produced by solving simulated sets of customers for different parameter values with different clustering methods, namely sweep, k-means and hierarchical clustering. Although the model required simplifications to facilitate the implementation, the results provided some significant findings. The thesis concludes that sweep clustering is the preferred method when vehicle capacity is large in relation to demand, whereas the other two methods perform better for smaller vehicles.
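The first subproblem in this abstract, distributing nodes into clusters, can be sketched for the sweep method as follows. This is a simplified illustration under stated assumptions, not the thesis's model: customers are sorted by polar angle around the depot and added to the current cluster until the vehicle capacity would be exceeded. The function name, the coordinates and the demands are made up for the example, and each single demand is assumed not to exceed the capacity.

```python
import math

def sweep_clusters(depot, customers, capacity):
    """Group customers (x, y, demand) into capacity-feasible clusters by polar angle.

    Returns a list of clusters, each a list of customer indices.
    Illustrative helper; assumes every single demand fits within capacity.
    """
    def angle(i):
        x, y, _ = customers[i]
        return math.atan2(y - depot[1], x - depot[0])

    # Sweep around the depot in order of increasing angle.
    order = sorted(range(len(customers)), key=angle)
    clusters, current, load = [], [], 0
    for i in order:
        demand = customers[i][2]
        # Close the current cluster when the next customer would overload the vehicle.
        if current and load + demand > capacity:
            clusters.append(current)
            current, load = [], 0
        current.append(i)
        load += demand
    if current:
        clusters.append(current)
    return clusters

# Made-up example: depot at the origin, five customers, vehicle capacity 8.
customers = [(1, 0, 4), (0, 1, 3), (-1, 0, 5), (0, -1, 2), (1, 1, 3)]
clusters = sweep_clusters((0, 0), customers, capacity=8)
print(clusters)  # [[3, 0], [4, 1], [2]]
```

Finding the shortest route within each cluster, the second subproblem, is left out of this sketch.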
140

High-dimensional Data Clustering and Statistical Analysis of Clustering-based Data Summarization Products

Zhou, Dunke 27 June 2012
No description available.
