1

[en] COMBINING TO SUCCEED: A NOVEL STRATEGY TO IMPROVE FORECASTS FROM EXPONENTIAL SMOOTHING MODELS / [pt] COMBINANDO PARA TER SUCESSO: UMA NOVA ESTRATÉGIA PARA MELHORAR A PREVISÕES DE MODELOS DE AMORTECIMENTO EXPONENCIAL

TIAGO MENDES DANTAS 04 February 2019 (has links)
[pt] A presente tese se insere no contexto de previsão de séries temporais. Nesse sentido, embora muitas abordagens tenham sido desenvolvidas, métodos simples como o de amortecimento exponencial costumam gerar resultados extremamente competitivos, muitas vezes superando abordagens com maior nível de complexidade. No contexto de previsão, papers seminais na área mostraram que a combinação de previsões tem potencial para reduzir de maneira acentuada o erro de previsão. Especificamente, a combinação de previsões geradas por amortecimento exponencial tem sido explorada em papers recentes. Apesar de a combinação de previsões utilizando Amortecimento Exponencial poder ser feita de diversas formas, um método proposto recentemente, chamado de Bagged.BLD.MBB.ETS, utiliza uma técnica chamada Bootstrap Aggregating (Bagging) em combinação com métodos de amortecimento exponencial para gerar previsões, mostrando que a abordagem é capaz de gerar previsões mensais mais precisas que todos os benchmarks analisados. A abordagem era considerada o estado da arte na utilização de Bagging e Amortecimento Exponencial até o desenvolvimento dos resultados obtidos nesta tese. A tese em questão se ocupa de, inicialmente, validar o método Bagged.BLD.MBB.ETS em um conjunto de dados relevante do ponto de vista de uma aplicação real, expandindo assim os campos de aplicação da metodologia. Posteriormente, são identificados motivos relevantes para a redução do erro de previsão e é proposta uma nova metodologia que utiliza Bagging, Amortecimento Exponencial e Clusters para tratar o efeito covariância, até então não identificado na literatura do método. A abordagem proposta foi testada utilizando diferentes tipos de séries temporais das competições M3, CIF 2016 e M4, bem como dados simulados. Os resultados empíricos apontam para uma redução substancial na variância e no erro de previsão. / [en] This thesis is set in the context of time series forecasting. Although many approaches have been developed, simple methods such as exponential smoothing usually produce extremely competitive results, often surpassing approaches with a higher level of complexity. Seminal papers in time series forecasting showed that the combination of forecasts has the potential to dramatically reduce the forecast error. Specifically, the combination of forecasts generated by Exponential Smoothing has been explored in recent papers. Although this combination can be done in many ways, a recently proposed method called Bagged.BLD.MBB.ETS uses a technique called Bootstrap Aggregating (Bagging) in combination with Exponential Smoothing methods to generate forecasts, showing that the approach can generate more accurate monthly forecasts than all the analyzed benchmarks. The approach was considered the state of the art in the use of Bagging and Exponential Smoothing until the development of the results obtained in this thesis. This thesis initially validates Bagged.BLD.MBB.ETS on a data set that is relevant from the point of view of a real application, thus expanding the fields of application of the methodology. Subsequently, relevant drivers of forecast error reduction are identified and a new methodology using Bagging, Exponential Smoothing and Clusters is proposed to treat the covariance effect, not previously identified in the method's literature. The proposed approach was tested on different types of time series from three competitions (M3, CIF 2016 and M4), as well as on simulated data. The empirical results point to a substantial reduction in variance and forecast error.
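To make the combination idea concrete, below is a minimal Python sketch of the general bagging-plus-exponential-smoothing recipe the abstract refers to: decompose the series, moving-block-bootstrap the remainder, fit an exponential smoothing model to each bootstrapped series, and average the forecasts. It is not the thesis's Bagged.BLD.MBB.ETS implementation (which, among other refinements, works on a Box-Cox-transformed series); all function names, parameters, and the synthetic series are assumptions for illustration only.

```python
# Minimal, illustrative sketch of bagging exponential smoothing forecasts
# (NOT the thesis's Bagged.BLD.MBB.ETS code): STL-decompose the series,
# moving-block-bootstrap the remainder, rebuild bootstrap series, fit an
# exponential smoothing model to each, and average the forecasts.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def moving_block_bootstrap(resid, block_size):
    """Resample a residual series by concatenating randomly chosen blocks."""
    n = len(resid)
    starts = np.random.randint(0, n - block_size + 1, size=n // block_size + 1)
    return np.concatenate([resid[s:s + block_size] for s in starts])[:n]

def bagged_ets_forecast(series, period=12, horizon=12, n_boot=30, block_size=24):
    stl = STL(series, period=period).fit()
    forecasts = []
    for _ in range(n_boot):
        boot = stl.trend + stl.seasonal + moving_block_bootstrap(stl.resid.to_numpy(), block_size)
        fit = ExponentialSmoothing(boot, trend="add", seasonal="add",
                                   seasonal_periods=period).fit()
        forecasts.append(fit.forecast(horizon))
    return np.mean(forecasts, axis=0)   # combined (bagged) forecast

# Synthetic monthly series, purely for demonstration
idx = pd.date_range("2010-01", periods=120, freq="MS")
y = pd.Series(10 + 0.05 * np.arange(120) + np.sin(np.arange(120) * 2 * np.pi / 12)
              + np.random.normal(scale=0.3, size=120), index=idx)
print(bagged_ets_forecast(y)[:3])
```

Averaging the bootstrap forecasts is what reduces the variance component of the error, which is the effect the thesis analyses and then refines with clusters.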
2

Pattern analysis of the user behaviour in a mobile application using unsupervised machine learning / Mönsteranalys av användarbeteenden i en mobilapp med hjälp av oövervakad maskininlärning

Hrstic, Dusan Viktor January 2019 (has links)
The continuously increasing amount of logged data increases the possibilities of making new discoveries about how users interact with the application for which the data is logged. Traces in the data may reveal specific user behavioural patterns, which can show how the application is utilized and thereby how its development can be improved. In this thesis, unsupervised machine learning techniques are used to group users depending on their utilization of the SEB Privat Android mobile application. The user interactions in the application are first extracted, then various data preprocessing techniques are applied to prepare the data for clustering, and finally two clustering algorithms, HDBSCAN and K-medoids, are used to cluster the data. Three types of user behaviour have been found by both the K-medoids and HDBSCAN algorithms: users who tend to interact more with the application and navigate through its deeper layers, users who only make a quick check of their account balance or transactions, and finally regular users. Among the resulting features chosen with the help of feature selection methods, 73% are related to user behaviour. The findings can be used by developers to improve the user interface and overall functionality of the application. The user flow can thus be optimized according to the patterns in which users tend to navigate through the application. / En ständigt växande datamängd ökar möjligheterna att hitta nya upptäckter om användningen av en mobil applikation för vilken data är loggad. Spår som visas i data kan avslöja vissa specifika användarbeteenden som kan förbättra applikationens utveckling genom att antyda hur applikationen används. I detta examensarbete används oövervakade maskininlärningstekniker för att gruppera användarna beroende på deras bruk av SEB Privat Android mobilapplikation. Användarinteraktionerna i applikationen extraheras ut först, sedan används olika databearbetningstekniker för att förbereda data för klustringen och slutligen utförs två klustringsalgoritmer, nämligen HDBSCAN och Kmedoids för att gruppera data. Tre distinkta typer av användarbeteende har hittats från både K-medoids och HDBSCAN-algoritmen. Det finns användare som har en tendens att interagera mer med applikationen och navigera genom sitt djupare lager, sedan finns det de som endast snabbt kollar på deras kontosaldo eller transaktioner och till slut finns det vanliga användare. Bland de resulterande attributen som hade valts med hjälp av teknikerna för val av attribut, är 73% av dem relaterade till användarbeteendet. Det som upptäcktes i denna avhandling kan användas för att utvecklarna ska kunna förbättra användargränssnittet och övergripande funktioner i applikationen. Användarflödet kan därmed optimeras med hänsyn till de sätt enligt vilka användarna har en speciell tendens att navigera genom applikationen.
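As a rough illustration of the clustering step described above (not the thesis's actual pipeline or its real features), the following Python sketch groups synthetic per-user interaction features with K-medoids and HDBSCAN. The three simulated behaviour profiles, the feature names, and all parameters are invented; it assumes scikit-learn >= 1.3 for HDBSCAN and the scikit-learn-extra package for KMedoids.

```python
# Illustrative sketch only (not the thesis pipeline or its real features):
# cluster synthetic per-user interaction features with K-medoids and HDBSCAN.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import HDBSCAN            # requires scikit-learn >= 1.3
from sklearn_extra.cluster import KMedoids     # from the scikit-learn-extra package

rng = np.random.default_rng(0)
# Invented per-user features: session count, mean navigation depth, balance checks
X = np.vstack([
    rng.normal([30, 8, 5], 2, size=(50, 3)),   # "deep navigators"
    rng.normal([10, 2, 20], 2, size=(50, 3)),  # "quick balance checkers"
    rng.normal([15, 4, 10], 2, size=(50, 3)),  # "regular users"
])
X = StandardScaler().fit_transform(X)

km_labels = KMedoids(n_clusters=3, random_state=0).fit_predict(X)
hdb_labels = HDBSCAN(min_cluster_size=15).fit_predict(X)

print("K-medoids cluster sizes:", np.bincount(km_labels))
print("HDBSCAN labels and counts (-1 = noise):", np.unique(hdb_labels, return_counts=True))
```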
3

Radio Resource Allocation and Beam Management under Location Uncertainty in 5G mmWave Networks

Yao, Yujie 16 June 2022 (has links)
Millimeter wave (mmWave) plays a critical role in the fifth-generation (5G) new radio due to the rich bandwidth it provides. However, one shortcoming of mmWave is the substantial path loss caused by poor diffraction at high frequencies, and consequently highly directional beams are applied to mitigate this problem. A typical way to manage beams is to cluster users based on their locations. However, localization uncertainty is unavoidable because of limited measurement accuracy, system performance fluctuations, and other factors. Meanwhile, the traffic demand may change dynamically in wireless environments, which increases the complexity of network management. Therefore, a scheme that can handle both localization uncertainty and dynamic radio resource allocation is required. Moreover, since localization uncertainty influences network performance, more advanced localization methods, such as vision-aided localization, are expected to reduce the localization error. In this thesis, we propose two algorithms for joint radio resource allocation and beam management in 5G mmWave networks: UK-means-based Clustering and Deep Reinforcement Learning-based resource allocation (UK-DRL) and UK-medoids-based Clustering and Deep Reinforcement Learning-based resource allocation (UKM-DRL). Specifically, we deploy the UK-means and UK-medoids clustering methods in UK-DRL and UKM-DRL, respectively, which are designed to handle clustering under location uncertainty. Meanwhile, we apply Deep Reinforcement Learning (DRL) for intra-beam radio resource allocation in UK-DRL and UKM-DRL. Moreover, to improve the localization accuracy, we develop a vision-aided localization scheme, where pixel-characteristics-based features are extracted from satellite images as additional input features for location prediction. The simulations show that UK-DRL and UKM-DRL improve network performance in data rate and delay compared with baseline algorithms. When the traffic load is 4 Mbps, UK-DRL achieves a 172.4% improvement in sum rate and a 64.1% improvement in latency over K-means-based Clustering and Deep Reinforcement Learning-based resource allocation (K-DRL). UKM-DRL has 17.2% higher throughput and 7.7% lower latency than UK-DRL, and 231% higher throughput and 55.8% lower latency than K-DRL. In addition, the vision-aided localization scheme significantly reduces the localization error from 17.11 meters to 3.6 meters.
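The clustering-under-uncertainty idea can be illustrated with a small sketch. The Python code below is a generic uncertain-k-means-style routine, not the thesis's UK-DRL/UKM-DRL algorithms, and the deep-reinforcement-learning resource allocation is omitted entirely: each user is represented by several location samples, and assignment minimizes the expected squared distance to a cluster centroid. All sizes, noise levels, and names are assumptions.

```python
# Generic sketch of k-means-style clustering under location uncertainty
# (not the thesis's UK-DRL/UKM-DRL; the DRL resource allocation is omitted).
# Each user is a set of position samples; assignment minimizes the expected
# squared distance to a cluster centroid.
import numpy as np

def uncertain_kmeans(samples, k, n_iter=20, seed=0):
    # samples: (n_users, n_samples, 2) candidate positions per user
    rng = np.random.default_rng(seed)
    means = samples.mean(axis=1)                      # expected position per user
    centroids = means[rng.choice(len(means), k, replace=False)]
    for _ in range(n_iter):
        # expected squared distance of every user to every centroid
        d = ((samples[:, :, None, :] - centroids[None, None, :, :]) ** 2).sum(-1).mean(1)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = means[labels == c].mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(1)
true_pos = rng.uniform(0, 100, size=(60, 2))                         # unknown true positions
samples = true_pos[:, None, :] + rng.normal(0, 5, size=(60, 20, 2))  # noisy location samples
labels, centers = uncertain_kmeans(samples, k=4)
print("users per cluster:", np.bincount(labels))
```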
4

Algoritmy pro shlukování textových dat / Text data clustering algorithms

Sedláček, Josef January 2011 (has links)
The thesis deals with text mining. It describes the theory of text document clustering as well as algorithms used for clustering. This theory serves as a basis for developing an application for clustering text data. The application is developed in the Java programming language and contains three methods used for clustering. The user can choose which method will be used for clustering the collection of documents. The implemented methods are K-medoids, BiSec K-medoids, and SOM (self-organizing maps). The application also includes a validation set, which was created specifically for the diploma thesis and is used for testing the algorithms. Finally, the algorithms are compared according to the obtained results.
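Since the thesis tool itself is written in Java, the snippet below is only a Python analogue of the underlying idea: represent documents as TF-IDF vectors and cluster them with K-medoids over a cosine distance matrix (BiSec K-medoids and SOM are not shown). The toy documents and parameters are invented, and KMedoids is assumed to come from the scikit-learn-extra package.

```python
# Python analogue of the idea only (the thesis tool itself is written in Java):
# TF-IDF document vectors clustered by K-medoids over cosine distances.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import pairwise_distances
from sklearn_extra.cluster import KMedoids     # from scikit-learn-extra

docs = [
    "exponential smoothing forecast time series",
    "time series forecasting with bagging",
    "protein interaction clustering gene ontology",
    "gene expression clustering medoids",
]
tfidf = TfidfVectorizer().fit_transform(docs)
dist = pairwise_distances(tfidf, metric="cosine")
labels = KMedoids(n_clusters=2, metric="precomputed", random_state=0).fit_predict(dist)
print(labels)   # e.g. the two forecasting documents vs. the two bioinformatics documents
```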
5

A Location Routing Problem For The Municipal Solid Waste Management System

Ayanoglu, Cemal Can 01 February 2007 (has links) (PDF)
This study deals with a municipal solid waste management system in which the strategic and tactical decisions are addressed simultaneously. In the system, the number and locations of the transfer facilities, which serve the particular solid waste pick-up points and the landfill, are determined. Additionally, routing plans are constructed for the vehicles that collect the solid waste from the pick-up points, taking into account the load capacity of the vehicles and shift time restrictions. We formulate this reverse logistics system as a location-routing problem with two facility layers. Mathematical models of the problem are presented, and an iterative capacitated-k-medoids clustering-based heuristic method is proposed for the solution of the problem. A sequential clustering-based heuristic method is also presented as a benchmark for the iterative method. Computational studies are performed for both methods on problem instances including up to 1000 pick-up points, 5 alternative transfer facility sites, and 25 vehicles. The results show that the iterative clustering-based method achieves considerable improvement over the sequential clustering-based method.
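A drastically simplified cluster-first, route-second sketch in Python is shown below to convey the flavour of a capacitated clustering heuristic for such a system. It is not the iterative method proposed in the study (there is no facility-location decision and no shift-time constraint), and every number in it is an assumption.

```python
# Rough cluster-first, route-second sketch (not the study's iterative method):
# assign pick-up points to fixed medoids subject to a capacity limit, then build
# a simple nearest-neighbour route inside each cluster.
import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(0, 50, size=(40, 2))                   # pick-up point coordinates
demand = rng.integers(1, 5, size=40)                        # waste amount per point
medoids = points[rng.choice(40, size=3, replace=False)]     # assumed transfer sites
capacity = demand.sum() / 3 * 1.3                           # assumed per-cluster capacity

# Greedy capacitated assignment: closest feasible medoid first
order = np.argsort(np.linalg.norm(points[:, None] - medoids[None], axis=2).min(1))
load = np.zeros(len(medoids))
assign = np.full(len(points), -1)
for i in order:
    for m in np.argsort(np.linalg.norm(points[i] - medoids, axis=1)):
        if load[m] + demand[i] <= capacity:
            assign[i] = m
            load[m] += demand[i]
            break

def nn_route(cluster_pts, depot):
    """Nearest-neighbour route from the depot through the cluster's points."""
    route, remaining, cur = [], list(range(len(cluster_pts))), depot
    while remaining:
        nxt = min(remaining, key=lambda j: np.linalg.norm(cluster_pts[j] - cur))
        route.append(nxt)
        cur = cluster_pts[nxt]
        remaining.remove(nxt)
    return route

for m in range(len(medoids)):
    idx = np.where(assign == m)[0]
    print(f"site {m}: load={load[m]:.0f}, stops={len(nn_route(points[idx], medoids[m]))}")
```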
6

Development and Implementation of Gene Ontology Cluster Analysis of Protein Array Data

Wolting, Cheryl 05 September 2012 (has links)
Decoding the genomes of organisms spanning all taxonomic groups provides the foundation for extensive, large-scale studies of biological molecules such as RNA, proteins and carbohydrates. The high-throughput studies facilitated by the existence of these genome sequences necessitate the development of new analytic methods for the interpretation of large sets of results. The work herein focuses on the development of a novel clustering method for the analysis of protein array results and examines its utilization in the analysis of integrated interaction data sets. Sets of proteins that interact with a molecule of interest were clustered according to their functional similarity. The simUI distance metric in the statistical analysis package Bioconductor was applied to measure the similarity of two proteins using their assembled Gene Ontology annotation. Clusters were identified by partitioning around medoids and interpreted using the summary label provided by the Gene Ontology annotation of the medoid. The utility of the method was tested on two published yeast protein array data sets and shown to allow interpretation of the data to yield novel biological hypotheses. We performed a protein array screen using the E3 ubiquitin ligase and PDZ domain-containing protein LNX1. We combined these results with other published LNX1 interactors to produce a set of 220 proteins that was clustered according to Gene Ontology annotation. From the clustering results, 14 proteins were selected for subsequent examination by co-immunoprecipitation, of which 8 were confirmed as LNX1 interactors. Recognition of 6 proteins by specific LNX1 PDZ domains was confirmed by fusion protein pull-downs. This work supports the role of LNX1 as a signalling scaffold. The interpretation of protein array results using our novel clustering method facilitated the identification of candidate molecules for subsequent experimental analysis. Thus our analytical method facilitates the identification of biologically relevant molecules within a large data set, making it an essential component of complex, high-throughput experimentation.
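The analysis idea can be sketched in a few lines. The thesis uses the simUI metric from Bioconductor over each protein's induced GO annotation graph; the Python toy below approximates that with a plain union-intersection overlap of flat GO term sets and then partitions around medoids. The protein names are invented, the GO IDs are only illustrative, and KMedoids is assumed to come from scikit-learn-extra.

```python
# Toy sketch of the analysis idea (the thesis uses Bioconductor's simUI over the
# induced GO graph; here the overlap is computed on flat term sets instead).
import numpy as np
from sklearn_extra.cluster import KMedoids     # from scikit-learn-extra

go_annotation = {                               # invented proteins, illustrative GO IDs
    "ProtA": {"GO:0006468", "GO:0004672", "GO:0005524"},
    "ProtB": {"GO:0006468", "GO:0004672"},
    "ProtC": {"GO:0016567", "GO:0004842", "GO:0005524"},
    "ProtD": {"GO:0016567", "GO:0004842"},
}
names = list(go_annotation)

def sim_ui(a, b):
    """Union-intersection similarity of two GO term sets."""
    return len(a & b) / len(a | b)

dist = np.array([[1 - sim_ui(go_annotation[p], go_annotation[q]) for q in names]
                 for p in names])
km = KMedoids(n_clusters=2, metric="precomputed", random_state=0).fit(dist)
print(dict(zip(names, km.labels_)),
      "medoids:", [names[i] for i in km.medoid_indices_])
```

The medoid of each cluster plays the same role as in the thesis: its annotation serves as the summary label used to interpret the cluster.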
7

Nástroj pro shlukovou analýzu / Cluster Analysis Tool

Hezoučký, Ladislav January 2010 (has links)
The master's thesis deals with cluster data analysis. It explains basic concepts and methods from this domain. The result of the thesis is a cluster analysis tool in which the K-Medoids and DBSCAN methods are implemented. Results obtained on real data are compared with those of the programs RapidMiner and SAS Enterprise Miner.
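For context only, here is a minimal Python sketch (the thesis tool itself is in Java) contrasting the two implemented methods on a two-moons data set: density-based DBSCAN can follow non-convex shapes that a fixed-k partitioning such as K-medoids tends to split. The parameters are assumptions and KMedoids again comes from scikit-learn-extra.

```python
# Context-only sketch contrasting the two implemented methods on non-convex data.
from sklearn.datasets import make_moons
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score
from sklearn_extra.cluster import KMedoids     # from scikit-learn-extra

X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)
db_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
km_labels = KMedoids(n_clusters=2, random_state=0).fit_predict(X)
# Agreement with the generating labels (density-based DBSCAN usually scores higher here)
print("DBSCAN    ARI:", round(adjusted_rand_score(y_true, db_labels), 2))
print("K-medoids ARI:", round(adjusted_rand_score(y_true, km_labels), 2))
```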
8

Métodos de agrupamento na análise de dados de expressão gênica / Clustering methods in the analysis of gene expression data

Rodrigues, Fabiene Silva 16 February 2009 (has links)
Clustering techniques have frequently been used in the literature to analyse data in several fields of application. The main objective of this work is to study such techniques. There is a large number of clustering techniques in the literature; in this work we concentrate on the Self-Organizing Map (SOM), k-means, k-medoids and Expectation-Maximization (EM) algorithms. These algorithms are applied to gene expression data. The analysis of gene expression, among other possibilities, identifies which genes are differentially expressed in the synthesis of proteins associated with normal and diseased tissues. The purpose of this work is to compare these methods with respect to their efficiency in identifying groups of similar elements, highlighting the advantages and disadvantages of each. The methods were tested by simulation and then applied to a real data set. / As técnicas de agrupamento (clustering) vêm sendo utilizadas com freqüência na literatura para a solução de vários problemas de aplicações práticas em diversas áreas do conhecimento. O principal objetivo deste trabalho é estudar tais técnicas. Mais especificamente, estudamos os algoritmos Self Organizing Map (SOM), k-means, k-medoids, Expectation-Maximization (EM). Estes algoritmos foram aplicados a dados de expressão gênica. A análise de expressão gênica visa, entre outras possibilidades, a identificação de quais genes estão diferentemente expressos na sintetização de proteínas associados a tecidos normais e doentes. O objetivo deste trabalho é comparar estes métodos no que se refere à eficiência dos mesmos na identificação de grupos de elementos similares, ressaltando vantagens e desvantagens de cada um. Os métodos foram testados por simulação e depois aplicamos as metodologias a um conjunto de dados reais.
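To illustrate the kind of comparison described (not the thesis's actual data or results), the Python sketch below clusters simulated expression-like samples with k-means, k-medoids and EM via a Gaussian mixture, scoring each partition against the simulated ground truth. SOM is omitted since scikit-learn has no implementation of it, and all sizes and effect strengths are assumptions.

```python
# Sketch of the comparison idea on simulated expression-like data (not the
# thesis's data or results): k-means, k-medoids and EM via a Gaussian mixture,
# each scored against the simulated ground truth. SOM is omitted here.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score
from sklearn_extra.cluster import KMedoids     # from scikit-learn-extra

rng = np.random.default_rng(0)
truth = np.repeat([0, 1, 2], 20)                                      # three simulated groups
X = rng.normal(loc=truth[:, None] * 2.0, scale=1.0, size=(60, 100))   # 60 samples x 100 genes

for name, labels in [
    ("k-means  ", KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)),
    ("k-medoids", KMedoids(n_clusters=3, random_state=0).fit_predict(X)),
    ("EM (GMM) ", GaussianMixture(n_components=3, covariance_type="diag",
                                  random_state=0).fit_predict(X)),
]:
    print(f"{name} adjusted Rand index: {adjusted_rand_score(truth, labels):.2f}")
```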
9

An Approach To Cluster And Benchmark Regional Emergency Medical Service Agencies

Kondapalli, Swetha 06 August 2020 (has links)
No description available.
