251 |
Contextualizing Observational Data For Modeling Human Performance
Trinh, Viet 01 January 2009
This research focuses on contextualizing observed human behavior in order to automate tactical human performance modeling through learning from observation. Contextualizing human behavior is intended to minimize the involvement of the knowledge engineers required to build intelligent Context-based Reasoning (CxBR) agents. More specifically, the goal is to automatically discover the context in which a human actor is situated while performing a mission, so as to facilitate the learning of such CxBR models. This research stems from the contextualization problem left open in Fernlund's research on using the Genetic Context Learner (GenCL) to model CxBR agents from observed human performance [Fernlund, 2004]. To accomplish context discovery, this research proposes two contextualization algorithms: Contextualized Fuzzy ART (CFA) and Context Partitioning and Clustering (COPAC). The former is a more naive approach built on the well-known Fuzzy ART strategy, while the latter is a robust algorithm developed on the principles of CxBR. Using Fernlund's original five drivers, the CFA and COPAC algorithms were tested and evaluated on their ability to contextualize each driver's individualized set of behaviors into well-formed and meaningful context bases, as well as to generate high-fidelity agents through integration with Fernlund's GenCL algorithm. The resulting set of agents was able to capture and generalize each driver's individualized behaviors.
|
252 |
Implementation of Hierarchical and K-Means Clustering Techniques on the Trend and Seasonality Components of Temperature Profile Data
Ogedegbe, Emmanuel 01 December 2023
In this study, time series decomposition techniques are applied to climate data in conjunction with two well-known clustering algorithms, K-means clustering and hierarchical clustering. Their implementation is examined and their results are compared. The main objective is to identify similar climate trends and to group geographical areas with similar environmental conditions. Climate data from specific locations are collected and analyzed as part of the project. Each time series is then split into trend, seasonality, and residual components. To categorize growing regions according to their climatic tendencies, the decomposed time series are then submitted to K-means clustering and to hierarchical clustering with dynamic time warping. The resulting clusters are evaluated to understand how the climates of different regions compare with one another, and how regions cluster based on the general trend of the temperature profile over the full growing season as opposed to the seasonality component for the various locations.
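The decompose-then-compare pipeline described in this abstract can be illustrated with a minimal Python sketch. It assumes a classical additive decomposition with a centered moving average and a dynamic time warping (DTW) distance based on absolute differences; the abstract does not say which decomposition or DTW variant the thesis uses, and the names `decompose` and `dtw_distance` and the toy series are hypothetical.

```python
def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual.

    Assumption: classical moving-average decomposition. Trend and residual
    are None at the edges where the averaging window does not fit.
    """
    n, half = len(series), period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        if period % 2 == 0:
            # Even period: half-weight the two end points (2 x period average).
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[i] = total / period
    detrended = [x - t if t is not None else None for x, t in zip(series, trend)]
    # Seasonal effect: mean detrended value at each position in the cycle.
    means = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended) if d is not None and i % period == k]
        means.append(sum(vals) / len(vals))
    offset = sum(means) / period  # force the seasonal effects to sum to zero
    seasonal = [means[i % period] - offset for i in range(n)]
    residual = [x - t - s if t is not None else None
                for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


# Hypothetical toy series: a linear trend plus a repeating 4-step pattern.
pattern = [1.0, -1.0, 2.0, -2.0]
series = [0.5 * i + pattern[i % 4] for i in range(24)]
trend, seasonal, residual = decompose(series, period=4)
```

Pairwise `dtw_distance` values between the trend (or seasonal) components of different locations could then serve as the dissimilarity matrix for a hierarchical clustering.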
|
253 |
Machine Learning implementation for Stress-Detection
Madjar, Nicole; Lindblom, Filip January 2020
This project applies machine learning methods to a selection of data points to determine whether current practice in stress detection and in the selection of interventions could be improved for the company Linkura AB. Linkura AB is a medical technology company based in Linköping that, among other things, measures stress among the employees of other companies and provides health coaching to identify measures for improvement. In this report we experiment with different unsupervised learning methods to identify visible patterns and behaviour in the data points, and we analyze the results in relation to the quantity of data received. The methods used during the project are the K-means algorithm and a dynamic hierarchical clustering algorithm. The correlation between the parameters of the data points is analyzed to optimize resource consumption, and experiments with different numbers of parameters are tested and discussed with an expert in stress coaching. The results show that both algorithms can create clusters for the risk groups; however, the dynamic clustering method clearly indicates the optimal number of clusters to use. After consulting mentors and health coaches about the produced clusters, we concluded that the dynamic hierarchical clustering algorithm gives more accurate clusters to represent risk groups. The conclusion of this project is that the machine learning algorithms used can categorize data points with stress-related correlations, which is useful when deciding on measures. Further research should be done with a larger data set for better results, and this project can form the basis for such implementations.
|
254 |
Identification of spatiotemporal nutrient patterns and associated ecohydrological trends in the Tampa Bay coastal region
Wimberly, Brent 01 May 2012
Improvements in environmental monitoring and assessment were achieved to advance our understanding of sea-land interactions and nutrient cycling in a coastal bay. The comprehensive assessment techniques for monitoring the water quality of a coastal bay can be diversified through an extensive investigation of the spatiotemporal nutrient patterns and the associated eco-hydrological trends in a coastal urban region. This work investigates those patterns and trends through a two-part inquiry into the watershed and its adjacent coastal bay. The findings show that the onset of drought lags the crest of the evapotranspiration (ET) and precipitation curve during each year of drought. During the transition year, ET and precipitation appear to start shifting back into a temporal pattern analogous to that of the 2005 wet year. NDVI shows a flat receding tail for the September crest in 2005 due to the hurricane impact, signifying that the hurricane event in October dampened the severity of the winter dry season, which alludes to relative system memory. The k-means model with 8 clusters is the optimal choice. In it, cluster 2, at Lower Tampa Bay, had the minimum values of total nitrogen (TN) concentrations, chlorophyll a (Chl-a) concentrations, and ocean color values in every season, as well as the minimum concentration of total phosphorus (TP) in three consecutive seasons in 2008. Cluster 5, located in Middle Tampa Bay, displayed elevated TN concentrations, ocean color values, and Chl-a concentrations, suggesting that high colored dissolved organic matter values are linked with some nutrient sources. The data from the gravity modeling analysis indicate that the Alafia River Basin is the major contributor of nutrients in terms of both TP and TN values in all seasons.
Such ecohydrological evaluation can be applied to support the land use and land cover (LULC) management of climatically vulnerable regions, and to further enrich the comprehensive assessment techniques for estimating and examining the multi-temporal impacts and dynamic influence of urban land use and land cover.
|
255 |
K-Centers Dynamic Clustering Algorithms and Applications
Xie, Qing Yan January 2013
No description available.
|
256 |
Development of novel unsupervised and supervised informatics methods for drug discovery applications
Mohiddin, Syed B. 22 February 2006
No description available.
|
257 |
Paletto: An Interactive Colour Palette Generator: Facilitating Designers’ Colour Selection Processes
Salman, Rema January 2022
Digital growth and the adoption of internet-based solutions, particularly artificial intelligence and machine learning, have dramatically changed the way design is done today. This rapid change in technology has challenged the level of automation, which influences human-automation interaction in the available colour-design tools (academic and commercial). Colour design and selection are known to be among the most critical steps of any art or design journey, yet the currently available tools each commit to one end of the automation spectrum when it comes to contextual search for colour palettes, colour extraction, and colour compatibility. On the one hand, fully automated approaches can exclude the designers’ intervention; on the other hand, fully manual approaches can be affected by human errors and weaknesses. Both approaches tend to cause problems when used in colour design tools, such as restricting the designers’ freedom, overwhelming designers with the information overload and option-widget clutter in the interfaces of such tools, or limiting designers to the functionality that the tool offers for its purpose, so that it supports only certain parts of the designers’ colour selection process rather than the whole process. The thesis investigates possible solutions for balancing automated and manual methods for generating colour palettes and for supporting designers’ non-standardised colour-selection processes, while tailoring the solution to intellectually stimulate and engage designers who work in different design fields. The solution is compared with Adobe’s Explore Page, one of the most well-known and established colour design tools on today’s market and one of the applications that offers a contextual search feature.
To fulfil the purpose of this research, a web-based application named Paletto was prototyped. It meets the requirements of enabling rapid generation and exploration of colour palette variations, supporting end-users in contextually searching for palettes, and allowing users to apply constraints (via a preference selection list) for a holistic palette adjustment. The proposed application was then evaluated with 20 individuals from the target audience, using both qualitative and quantitative approaches, to prove the concept according to participants’ acceptance, estimate Paletto’s effectiveness in their workflow and design process, examine their engagement and experience when completing the exploratory tasks, and gather additional insights about the conceptual design and implementation of the application. Paletto generally received positive responses regarding (1) the accuracy and relevance of its search results, (2) the selection feature and its adaptability and flexibility for human intervention, and (3) the system’s feedback in terms of information accessibility (e.g., search word and number of pages in the pagination). However, the palette generation feature received partially negative responses, with participants showing annoyance and confusion and finding it complicated. At the same time, several participants appreciated the diversity of the generated palettes and the conceptual design of Paletto in general.
Paletto was found to effectively facilitate the colour-selection process and ease designers’ workloads in several areas: fulfilling the end-user goal of producing quality palettes for use in design projects; resource efficiency (e.g., saving money, effort, and time) in gathering inspirational images; automatic colour extraction and palette generation; providing freedom and decision-making support for exploring colour combinations and variations via iterative preference selection; supporting colour-pattern identification in the selections; and providing varied and relevant results when searching for inspirational images, with accurate colour extractions that represent the searched images. Moreover, Paletto proved to offer greater user engagement and a better user experience than Adobe’s Explore Page. This was due to the felt involvement and the continuous interactivity offered by Paletto’s search and preference-selection features, which allowed iterative palette generation and modification. In conclusion, the evaluations indicated some pain points and gaps in the current design, which are discussed in this thesis and recommended for investigation in future work.
|
258 |
Daily pattern recognition of dynamic origin-destination matrices using clustering and kernel principal component analysis / Daglig mönsterigenkänning av dynamiska Origin-Destination-matriser med hjälp av clustering och kernel principal component analysis
Dong, Zhiwu January 2021
The Origin-Destination (OD) matrix plays an important role in traffic management and urban planning. However, OD estimation demands large-scale data collection, which in the past has mostly been done through surveys with numerous limitations. With the development of communication technology and artificial intelligence, the transportation industry faces new opportunities and challenges. Sensors bring big data characterized by the 4Vs (Volume, Variety, Velocity, Value) to the transportation domain. This allows traffic practitioners to receive data covering large areas and long time periods, even several years of data. At the same time, the introduction of artificial intelligence provides new opportunities and challenges in processing massive data. Advances in computer science have also brought revolutionary progress to the field of transportation. Together, these advances and technologies enable large-scale data collection that can be used for extracting and estimating dynamic OD matrices for small time intervals and long time periods.
Using Stockholm as the case study, this thesis estimates dynamic OD matrices from data collected at the tolls located around Stockholm municipality. These dynamic OD matrices are used to analyze the day-to-day characteristics of the traffic flow that passes through Stockholm. In other words, the typical day-types of traffic through the city center are identified and studied in this work. The study analyzes data collected by 58 sensors around Stockholm, containing nearly 100 million vehicle observations (12 GB).
Furthermore, we study the effects of dimensionality reduction on revealing the most common day-types by clustering. The dimensionality reduction techniques considered are Principal Component Analysis (PCA) and its variant Kernel PCA (KPCA). The results reveal that dimensionality reduction significantly reduces computational costs while still yielding reasonable day-types. Day-type clusters reveal expected as well as unexpected patterns and thus could have potential in traffic management, urban planning, and designing the strategy for congestion tax.
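The dimensionality-reduction step in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses plain linear PCA (not Kernel PCA), extracts only the first principal component by power iteration, and treats each row as one day's flattened OD matrix. The function name `first_principal_component` and the toy numbers are hypothetical and not taken from the thesis.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    Assumptions: at least two rows, non-constant data, and a start vector
    that is not orthogonal to the leading eigenvector of the covariance.
    """
    n, dim = len(rows), len(rows[0])
    means = [sum(r[d] for r in rows) / n for d in range(dim)]
    centered = [[r[d] - means[d] for d in range(dim)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(dim)]
           for a in range(dim)]
    v = [1.0] * dim
    for _ in range(iterations):
        # Power iteration: multiply by the covariance matrix, then normalize.
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project every centered row onto the component (one score per day).
    scores = [sum(c[d] * v[d] for d in range(dim)) for c in centered]
    return v, scores


# Hypothetical toy data: four "days", each a flattened 1 x 2 OD matrix.
days = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(days)
```

Clustering the low-dimensional `scores` instead of the full flattened matrices is what reduces the computational cost; a kernel variant would replace the covariance matrix with a centered kernel matrix.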
|
259 |
Epigenetic Responses of Arabidopsis to Abiotic Stress
Laliberte, Suzanne Rae 17 March 2023
Weed resistance to control measures, particularly herbicides, is a growing problem in agriculture. In the case of herbicides, resistance is sometimes connected to genetic changes that directly affect the target site of the herbicide. Other cases are less straightforward, with resistance arising without such a clear-cut mechanism. Understanding the genetic and gene regulatory mechanisms that may lead to the rapid evolution of resistance in weedy species is critical to securing our food supply. To study this phenomenon, we exposed young Arabidopsis plants to sublethal levels of one of four weed management stressors: glyphosate herbicide, trifloxysulfuron herbicide, mechanical clipping, or shading. To evaluate responses to these stressors, we collected data on gene expression and on regulation via epigenetic modification (methylation) and small RNA (sRNA). For all of the treatments except shade, the stress was limited in duration, and the plants were allowed to recover until flowering, to identify changes that persist to reproduction. At flowering, DNA for methylation bisulfite sequencing, RNA, and sRNA were extracted from newly formed rosette leaf tissue. Analyzing the individual datasets revealed many differential responses when compared to the untreated control for gene expression, methylation, and sRNA expression. All three measures showed increases in differential abundance that were unique to each stressor, with very little overlap between stressors. Herbicide treatments tended to exhibit the largest number of significant differential responses, with glyphosate treatment most often associated with the greatest differences and contributing to overlap. To evaluate how large datasets from methylation, gene expression, and sRNA analyses could be connected and mined to link regulatory information with changes in gene expression, the information from each dataset and for each gene was united in a single large matrix and mined with classification algorithms.
Although our models were able to differentiate patterns in a set of simulated data, the raw datasets were too noisy for the models to consistently identify differentially expressed genes. However, by focusing on responses at a local level, we identified several genes with differential expression, differential sRNA, and differential methylation. While further studies will be needed to determine whether these epigenetic changes truly influence gene expression at these sites, the changes detected at the treatment level could prime the plants for future incidents of stress, including herbicides. / Doctor of Philosophy / Growing resistance to herbicides, particularly glyphosate, is one of the many problems facing agriculture. The rapid rise of resistance across herbicide classes has caused some to wonder if there is a mechanism of adaptation that does not involve mutations. Epigenetics is the study of changes in the phenotype that cannot be attributed to changes in the genotype. Typically, studies revolve around two features of the chromosomes: cytosine methylation and histone modifications. The former can influence how proteins interact with DNA, and the latter can influence protein access to DNA. Both can affect each other in self-reinforcing loops. They can affect gene expression, and DNA methylation can be directed by small RNA (sRNA), which can also influence gene expression through other pathways. To study these processes and their role in abiotic stress response, we aimed to analyze sRNA, RNA, and DNA from Arabidopsis thaliana plants under stress. The stresses applied were sublethal doses of the herbicides, glyphosate and trifloxysulfuron, as well as mechanical clipping and shade to represent other weed management stressors. The focus of the project was to analyze these responses individually and together to find epigenetic responses to stresses routinely encountered by weeds. 
We tested RNA for gene expression changes under our stress conditions and identified many, including some pertaining to DNA methylation regulation. The herbicide treatments were associated with upregulated defense genes and downregulated growth genes. Shade treated plants had many downregulated defense and other stress response genes. We also detected differential methylation and sRNA responses when compared to the control plants. Changes to methylation and sRNA only accounted for about 20% of the variation in gene expression. While attempting to link the epigenetic process of methylation to gene expression, we connected all the data sets and developed computer programs to try to make correlations. While these methods worked on a simulated dataset, we did not detect broad patterns of changes to epigenetic pathways that correlated strongly with gene expression in our experiment's data. There are many factors that can influence gene expression that could create noise that would hinder the algorithms' abilities to detect differentially expressed genes. This does not, however, rule out the possibility of epigenetic influence on gene expression in local contexts. Through scoring the traits of individual genes, we found several that interest us for future studies.
|
260 |
Comparison of initialization methods of K-means clustering for small data
Tabibzadeh, Liam January 2022
Clustering observations into groups is a fundamental challenge in both academia and industry. Many clustering algorithms exist, and the most widely used of them, K-means, notably suffers from sensitivity to the initial allocation of cluster centers. Many heuristics and algorithms have been developed to find the best initial allocation, and this experimental study compares initialization methods by measuring how well they perform on small simulated datasets according to various performance criteria. The results show that using the output clusters of a hierarchical clustering is the best initialization method, while the most popular methods, random partitioning and K-means++, perform poorly. Although the experimental setup may favour some initialization methods over others, applied researchers are advised to run a hierarchical clustering to initialize the K-means algorithm.
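The recommended initialization can be sketched as follows in Python. This is a minimal illustration, not the thesis's implementation: it assumes a naive agglomerative clustering with centroid linkage (the abstract does not name a linkage), Euclidean distance, and Lloyd's algorithm for K-means. The function names and the toy points are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def hierarchical_clusters(points, k):
    """Merge the two closest clusters until k remain (centroid linkage, assumed)."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm starting from the given centers."""
    centers = list(centers)
    groups = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


# Hypothetical toy data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = [centroid(c) for c in hierarchical_clusters(points, k=2)]
centers, groups = kmeans(points, initial)
```

Because the starting centers already summarize a full partition of the data, K-means here only refines that partition, which is one plausible reason this initialization behaves well on small datasets.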
|