Global ETD Search

191	AI-assisted analysis of ICT-centre cooling : Using K-means clustering to identify cooling patterns in water-cooled ICT rooms Wallin, Oliver, Jigsved, Johan January 2023 Information and communications technology (ICT) is an important part of today’s society, and around 60% of the world's population is connected to the internet. Processing and storing ICT data corresponds to approximately 1% of the global electricity demand. Locations that store ICT data produce a lot of heat that must be removed, and the cooling systems account for up to 40% of the total energy used in ICT-centre locations. Investigating the efficiency of the cooling in ICT-centres is important for making the whole ICT-centre more energy efficient and possibly reducing operational costs. Unwanted operational behaviour in the cooling system can be analysed by using unsupervised machine learning and clustering of data. The purpose of this thesis is to characterise cooling patterns, using K-means clustering, in two water-cooled ICT rooms. The rooms are located at Ericsson’s facilities in Linköping, Sweden. This is done by answering the following research questions: RQ1. What is the cooling power per m2 delivered by the cooling equipment in the two different ICT rooms at Ericsson? RQ2. What operational patterns can be found using a suitable clustering algorithm to process and compare data for LCPs in the two ICT rooms? RQ3. Based on information from RQ1 and patterns from RQ2, what undesired operational behaviours can be identified for the cooling system? The K-means clustering is applied to time series data collected during 2022, which include temperatures of water and air, electric power and cooling power, and water flow in the system. The two rooms use Liquid Cooling Packages (LCPs), also known as in-row cooling units, and room 1 (R1) also includes computer room air handlers (CRAHs).
K-means assigns each observation to a group whose members share characteristics and represent a distinct operating scenario. The elbow method is used to determine the number of clusters, giving four clusters for R1 and three clusters for room 2 (R2). Results show that the operational patterns differ between R1 and R2. The cooling power produced per m2 is 1.36 kW/m2 for R1 and 2.14 kW/m2 for R2. Cooling power per m3 is 0.39 kW/m3 for R1 and 0.61 kW/m3 for R2. Undesirable operational behaviours were identified through clustering and visual representation of the data. Some LCPs operate very differently even when sharing the same hot aisle. Disturbances such as air flow and setpoints create these differences, so that some LCPs operate with high cooling power and others with low cooling power. The cluster with the highest cooling power is cluster 4 for R1 and cluster 3 for R2. Cluster 2 has the lowest cooling power in both R1 and R2. LCPs operating in cluster 2 had a water flow mostly at 0 l/min and were therefore not contributing to the cooling of the rooms. Lastly, the supplied electrical power and produced cooling power match in R1 but not in R2, implying either that heat leaves the room by means other than the cooling system or that measurements are faulty. There is a possibility to investigate this further. Water in R1 and R2 is found, on occasion, to exit the room at a temperature below the ambient room temperature. It is also concluded that the method works for identifying unwanted operational behaviours, knowledge that can be used to improve ICT operations. To summarize, undesired operational behaviours can be identified using the unsupervised machine learning technique K-means clustering. AI K-means clustering ICT-centre data centre data center ICT-center cooling data center cooling Energy Engineering Energiteknik
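The clustering step described in entry 191 can be illustrated with a minimal sketch, assuming plain Lloyd's K-means on a numeric feature matrix. It is not the thesis authors' implementation: the function names `kmeans` and `elbow_curve`, the random initialisation, and the number of restarts are illustrative assumptions. The elbow method then looks for the number of clusters at which the within-cluster sum of squares stops dropping sharply.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means (illustrative sketch, random initialisation)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct observations chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every observation to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its members; keep it if it is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    # Within-cluster sum of squares, the quantity plotted in an elbow curve.
    inertia = d[np.arange(len(X)), labels].sum()
    return labels, centers, inertia


def elbow_curve(X, k_values, n_init=5):
    """Best within-cluster sum of squares over n_init restarts, per k."""
    return {
        k: min(kmeans(X, k, seed=s)[2] for s in range(n_init))
        for k in k_values
    }
```

Plotting the values returned by `elbow_curve` against k and picking the bend by eye is the usual way to apply the method. In practice the features (temperatures, power, flow) would normally be standardised first, since K-means is sensitive to scale.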
192	Estimating eco-friendly driving behavior in various traffic situations, using machine learning / Estimering av miljövänligt körbeteende i olika traffiksituationer, med maskininlärning Fors, Ludvig January 2023 This thesis investigates how various driver signals, that is, signals that a truck driver can interact with, influence fuel consumption and what the optimal values of these signals are in various traffic conditions. More specifically, the objective is to estimate good driver behavior in various traffic conditions and compare it with bad driver behavior in similar situations, to see how performing a specific driver action (changing a driver signal from the bad driver value to the corresponding good driver value) impacts fuel consumption. The result is an AI-based algorithm that utilizes the transformer model architecture to estimate good driver behavior, as well as fuel consumption, based on signals describing the environment. Utilizing these, causal inference is used to estimate how much fuel can be saved by switching a driver signal from a bad driver value to a good driver value. Machine learning transformers neural networks causal inference K-Means driver behavior fuel consumption Computer Sciences Datavetenskap (datalogi)
193	Computational Methods for Solving Next Generation Sequencing Challenges Aldwairi, Tamer Ali 13 December 2014 In this study we build solutions to three common challenges in the field of bioinformatics by utilizing statistical methods and developing computational approaches. First, we address a common problem in genome wide association studies, which is linking genotype features within organisms of the same species to their phenotype characteristics. We specifically studied FHA domain genes in Arabidopsis thaliana distributed within Eurasian regions by clustering those plants that share similar genotype characteristics and comparing that to the regions from which they were taken. Second, we developed a tool for calculating transposable element density within different regions of a genome. The tool is built to utilize the information provided by other transposable element annotation tools and to provide the user with a number of options for calculating the density for various genomic elements such as genes, piRNA and miRNA, or for the whole genome. It also provides a detailed calculation of densities for each family and subfamily of the transposable elements. Finally, we address the problem of mapping multi reads in the genome and their effects on gene expression. To accomplish this, we implemented methods to determine the statistical significance of expression values within the genes utilizing both a unique and multi-read weighting scheme. We believe this approach provides a much more accurate measure of gene expression than existing methods such as discarding multi reads completely or assigning them randomly to a set of best assignments, while also providing a better estimation of the proper mapping locations of ambiguous reads. Overall, the solutions we built in these studies provide researchers with tools and approaches that aid in solving some of the common challenges that arise in the analysis of high throughput sequence data.
clustering gene expression next generation sequencing RNA-Seq transposable element piRNA SNP K-means genome wide association studies statistical methods
194	The development and analysis of a computationally efficient data driven suit jacket fit recommendation system Bogdanov, Daniil January 2017 In this master thesis work we design and analyze a data driven suit jacket fit recommendation system which aims to guide shoppers in the process of assessing garment fit over the web. The system is divided into two stages. In the first stage we analyze labelled customer data, train supervised learning models so as to be able to predict optimal suit jacket dimensions of unseen shoppers, and determine appropriate models for each suit jacket dimension. In stage two the recommendation system uses the results from stage one and sorts a garment collection from best fit to least fit. The sorted collection is what the fit recommendation system is to return. In this thesis work we propose a particular design of stage two that aims to reduce the complexity of the system but at a cost of reduced quality of the results. The trade-offs are identified and weighed against each other. The results in stage one show that simple supervised learning models with linear regression functions suffice when the independent and dependent variables align at particular landmarks on the body. If style preferences are also to be incorporated into the supervised learning models, non-linear regression functions should be considered so as to account for increased complexity. The results in stage two show that the complexity of the recommendation system can be made independent of the complexity of how fit is assessed. And as technology is enabling more advanced ways of assessing garment fit, such as 3D body scanning techniques, the proposed design for reducing the complexity of the recommendation system enables highly complex techniques to be utilized without affecting the responsiveness of the system at run time.
/ In this master's thesis we design and analyze a data-driven recommendation system for suit jackets, with the goal of guiding online shoppers in assessing fit over the internet. The system is divided into two stages. In the first stage we analyze labelled data and train models to predict optimal suit jacket measurements for shoppers the system has not previously been exposed to. In stage two the recommendation system takes the result from stage one and sorts the garment collection from best to worst fit. The sorted collection is what the system is intended to return. In this work we propose a specific design for stage two that aims to reduce the complexity of the system, at a cost in the accuracy of the results. Advantages and disadvantages are identified and weighed against each other. The results in stage one show that simple models with linear regression functions suffice when the independent and dependent variables coincide at specific points on the body. If style preferences are also to be incorporated into these models, non-linear regression functions should be considered to account for the added complexity. The results in stage two show that the complexity of the recommendation system can be made independent of the complexity of how fit is assessed. And as technology enables ever more advanced ways of assessing fit, such as 3D scanning techniques, more complex techniques can be used without affecting the response time of the system at run time. garment fit machine learning clustering k-means linear regression support vector regression e-commerce online garment shopping Computer Sciences Datavetenskap (datalogi)
195	Identification, investigation and prediction of post-COVID phenotypes : Using Cluster analysis and Ordinal logistic regression to determine severity of post-COVID Malmquist, Sara, Rykatkin, Oliver January 2023 It is believed that a large number of people experience remaining symptoms after COVID-19, so-called post-COVID. The formal definition and diagnostic criteria of post-COVID have been a scientific controversy. So far, there is no reliable system for distinguishing the severity of post-COVID. This type of measurement would be helpful in future targeted therapies. Therefore, this thesis aims to evaluate the relationship between an individual’s functional status today and the symptoms present, as well as to identify relevant groups of post-COVID based on 17 long-term symptoms of post-COVID. A further aim is to produce a model predicting which of these groups an individual belongs to. By using cluster analysis and ordinal logistic regression, Post-COVID Syndrome scores are produced. The scores are based on data from subjects who were hospitalised and subjects who were not, collected through a project called COMBAT post-covid. The individuals are then divided into groups based on these scores, and a prediction model is made using ordinal logistic regression and backward deletion. Three well-separated groups of post-COVID are found based on the produced scores. The prediction model indicates that the nine variables Sex, BMI, Smoking, Snuff, Heart disease, Lung disease, Diabetes, Chronic pain and Symptom severity at the onset seem important for predicting someone’s group. This study showed that the remaining symptoms affected an individual’s functional status, including self-reported working ability and general health. K-means Ordinal logistic regression Backwards deletion PCS score Post-COVID ROC curve Probability Theory and Statistics Sannolikhetsteori och statistik
196	Genetic Variations and Physiological Mechanisms Underlying Photosynthetic Capacity in Soybean (Glycine max (L.) Merrill) / ダイズの光合成能力の遺伝変異とその生理的機構に関する研究 SHAMIM, MOHAMMAD JAN 26 September 2022 京都大学 / 新制・課程博士 / 博士(農学) / 甲第24240号 / 農博第2519号 / 新制||農||1094(附属図書館) / 学位論文||R4||N5411(農学部図書室) / 京都大学大学院農学研究科農学専攻 / (主査)教授白岩立彦, 教授土井元章, 教授那須田周平 / 学位規則第4条第1項該当 / Doctor of Agricultural Science / Kyoto University / DFAM soybean and Glycine tomentella Hayata Gas Exchange Soybean core collection GWAS RNA Expression K-Means Clustering Rubisco and nitrogen content 610
197	Klusteranalys : Tillämpning av agglomerativ hierarkisk och k-means klustring för att hitta bra kluster bland fotbollsspelare baserat på spelarstatistik. Balbas, Sacko, Törnquist, Arvid January 2024 This work is about how the multivariate analysis tool cluster analysis can be applied to find meaningful groups of players based on player statistics. The aim of the work is to attempt to find good clusters among players in the Spanish top football division, La Liga, for the 2022-2023 season. To this end, agglomerative hierarchical clustering and k-means clustering are applied and compared. The results showed that no good clusters could be identified among the players based on player statistics from the 2022-2023 La Liga season. Cluster analysis hierarchical clustering k-means clustering La Liga football algorithm machine learning Probability Theory and Statistics Sannolikhetsteori och statistik
198	Analysis of Transactional Data with Long Short-Term Memory Recurrent Neural Networks Nawaz, Sabeen January 2020 An issue authorities and banks face is fraud related to payments and transactions where huge monetary losses occur to a party or where money laundering schemes are carried out. Previous work in the field of machine learning for fraud detection has addressed the issue as a supervised learning problem. In this thesis, we propose a model which can be used in a fraud detection system with transactions and payments that are unlabeled. The proposed model is a Long Short-Term Memory in an auto-encoder decoder network (LSTM-AED), which is trained and tested on transformed data. The data is transformed by reducing it to Principal Components and clustering it with K-means. The model is trained to reconstruct the sequence with high accuracy. Our results indicate that the LSTM-AED performs better than a random sequence generating process in learning and reconstructing a sequence of payments. We also found that a huge loss of information occurs in the pre-processing stages. / Unauthorized transactions and payment fraud can lead to large financial losses for banks and authorities. In machine learning this problem has previously been handled with classifiers via supervised learning. In this thesis we propose a model that can be used in a fraud detection system. The model is applied to unlabeled data with many different variables. The model used is a Long Short-Term Memory in an auto-encoder decoder network. The data is transformed with PCA and clustered with K-means. The model is trained to reconstruct a sequence of payments with high accuracy. Our results show that the LSTM-AED performs better than a model that merely guesses the next point in the sequence. The results also show that much of the information in the data is lost when it is pre-processed and transformed.
LSTM Auto-encoder decoder anomaly detection K-means clustering Principal Component Analysis Computer and Information Sciences Data- och informationsvetenskap
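The preprocessing described in entry 198 (reducing the data to principal components before clustering) can be sketched as follows, assuming a numeric matrix of transaction features. This is an illustration rather than the thesis's pipeline; `pca_reduce` and its return values are hypothetical names. The retained-variance fraction gives one rough way to see how much information a projection discards, which relates to the abstract's remark about information loss in pre-processing.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project X onto its top principal components (illustrative sketch).

    Returns the projected data, the component directions, and the
    fraction of total variance retained by the projection.
    """
    # Center each feature; PCA is defined on mean-centered data.
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    Z = Xc @ components.T
    # Squared singular values are proportional to the variance per component.
    var = S ** 2
    retained = var[:n_components].sum() / var.sum()
    return Z, components, retained
```

The projected matrix `Z` could then be passed to a clustering routine such as K-means. A low retained fraction would suggest that the reduced representation has lost much of the original signal.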
199	An Unsupervised Machine-Learning Framework for Behavioral Classification from Animal-Borne Accelerometers Dentinger, Jane Elizabeth 03 May 2019 Studies of animal spatial distributions typically use prior knowledge of animal habitat requirements and behavioral ecology to deduce the most likely explanations of observed habitat use. Animal-borne accelerometers can be used to distinguish behaviors, which allows us to incorporate in situ behavior into our understanding of spatial distributions. Past research has focused on using supervised machine learning, which requires a priori specification of behavior to identify signals, whereas unsupervised approaches allow the model to identify as many signal types as permitted by the data. The following framework couples direct observation to behavioral clusters identified from unsupervised machine learning on a large accelerometry dataset. A behavioral profile was constructed to describe the proportion of behaviors observed per cluster, and the framework was applied to an acceleration dataset collected from wild pigs (Sus scrofa). Although most clusters represented combinations of behaviors, a leave-p-out validation procedure indicated this classification system accurately predicted new data. artificial neural networks behavioral classification wild pigs machine learning k-means clustering remote sensing self-organizing maps
200	Computational Intelligence and Data Mining Techniques Using the Fire Data Set Storer, Jeremy J. 04 May 2016 No description available. Computer Science Fire Dataset Machine Learning Computational Intelligence Data Mining Neural Networks Particle Swarm Optimization k-Means Clustering Spectral Clustering

Search results