Global ETD Search

831	Shoppin’ in the Rain : An Evaluation of the Usefulness of Weather-Based Features for an ML Ranking Model in the Setting of Children’s Clothing Online Retailing / Handla i regnet : En utvärdering av användbarheten av väderbaserade variabler för en ML-rankningsmodell inom onlineförsäljning av barnkläder
Lorentz, Isac. January 2023.
Online shopping offers numerous benefits, but large product catalogs make it difficult for shoppers to be aware of the existence and characteristics of every item for sale. To simplify the decision-making process, online retailers use ranking models to recommend products relevant to each individual user. Contextual user data, such as location, time, or local weather conditions, can serve as valuable features for ranking models, enabling personalized real-time recommendations. Little research has been published on the usefulness of weather-based features for ranking models in online clothing retailing, which makes additional research into this topic worthwhile. Using Swedish sales and customer data from Babyshop, an online retailer of children’s fashion, this study examined possible correlations between local weather data and sales. This was done by comparing differences in daily weather with differences in daily shares of sold items per clothing category for two cities: Stockholm and Göteborg. With Malmö added as a third city, historical observational weather data from one location in each of Stockholm, Göteborg, and Malmö was then featurized and used, along with the customers’ postal towns, sales features, and sales trend features, to train a LightGBM learning-to-rank model (gradient-boosted decision trees) with weather features and to evaluate its ranking relevancy. The ranking relevancy was compared against a LightGBM baseline that omitted the weather features and against a naive baseline: a popularity-based ranker.
Several possible correlations were found between clothing categories (such as shorts, rainwear, shell jackets, and winter wear) and weather variables (such as feels-like temperature, solar energy, wind speed, precipitation, snow, and snow depth). Ranking relevancy was evaluated using the mean reciprocal rank and the mean average precision @ 10, both on a small dataset consisting only of customer data from the postal towns Stockholm, Göteborg, and Malmö and on a larger dataset in which customers in postal towns from larger geographical areas had their home locations approximated as Stockholm, Göteborg, or Malmö. The LightGBM rankers beat the naive baseline in three out of four configurations, and the ranker with weather features outperformed the LightGBM baseline by 1.1 to 2.2 percent across all configurations. The findings can potentially help online clothing retailers create more relevant product recommendations.
Keywords: statistical analysis; regression analysis; recommender systems; ensemble learning; electronic commerce; LightGBM; learning to rank; feature selection; weather-based features; fashion. Subject: Computer and Information Sciences.
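The two evaluation metrics named in this abstract, mean reciprocal rank and mean average precision @ 10, can be illustrated with a minimal Python sketch. It is not the thesis's evaluation code: the function names, the toy rankings, and the choice to normalize AP@k by min(number of relevant items, k) are assumptions made for illustration, and other normalizations of AP@k are in use.

```python
from typing import Hashable, List, Sequence, Set


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1 / position of the first relevant item, or 0.0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision_at_k(
    ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 10
) -> float:
    """AP@k, normalized by min(len(relevant), k) (an assumed convention)."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position  # precision at this cut-off
    return precision_sum / min(len(relevant), k)


def mean(values: List[float]) -> float:
    """Arithmetic mean, 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical data: one ranked list of clothing categories per customer
# session, and the set of categories that customer actually bought.
rankings = [
    ["shorts", "rainwear", "shell jacket"],
    ["shell jacket", "shorts", "rainwear"],
]
purchased = [{"rainwear"}, {"shell jacket", "rainwear"}]

mrr = mean([reciprocal_rank(r, p) for r, p in zip(rankings, purchased)])
map_at_10 = mean([average_precision_at_k(r, p, k=10) for r, p in zip(rankings, purchased)])
print(mrr, map_at_10)
```

Both metrics reward placing purchased items near the top of the list. MRR looks only at the first hit, while MAP@10 also credits further hits within the top ten.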
832	Finding Causal Relationships Among Metrics In A Cloud-Native Environment / Att hitta orsakssamband bland Mätvärden i ett moln-native Miljö
Rishi Nandan, Suresh. January 2023.
Automatic Root Cause Analysis (RCA) systems aim to streamline the process of identifying the underlying cause of software failures in complex cloud-native environments. These systems employ graph-like structures to represent causal relationships between different components of a software application. These relationships are typically learned from performance and resource utilization metrics of the microservices in the system. To accomplish this, numerous RCA systems use statistical algorithms, specifically those falling under the category of causal discovery. These algorithms have demonstrated their utility not only in RCA systems but also in a wide range of other domains and applications. Nonetheless, there is a research gap concerning the feasibility and efficacy of multivariate time series causal discovery algorithms for deriving causal graphs within a microservice framework. By harnessing metric time series data from Prometheus and applying these algorithms, we aim to shed light on their performance in a cloud-native environment. Furthermore, we introduce an adaptation in the form of an ensemble causal discovery algorithm. Our experiments with this ensemble approach, conducted on datasets with known causal relationships, demonstrate its potential to enhance the precision of detected causal connections. Our ultimate objective was to ascertain reliable causal relationships within Ericsson’s cloud-native system ’X’, where the ground truth is unavailable. The ensemble causal discovery approach overcomes the limitations of employing individual causal discovery algorithms, significantly increasing confidence in the discovered causal relationships.
As a practical illustration of the utility of the ensemble causal discovery techniques, we explored the domain of anomaly detection. By leveraging the causal graphs from our study, we successfully applied this technique to anomaly detection within the Ericsson system.
Keywords: Causality; Causal Discovery; Bayesian Network; Conditional Independence; Partial Correlation; Ensemble Causal Discovery; Anomaly Detection; Causal Graphs. Subject: Computer and Information Sciences.
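The abstract does not say how the ensemble combines the individual causal discovery algorithms. One simple possibility, shown here purely as an assumed illustration, is edge voting: each algorithm proposes a set of directed edges, and only edges proposed by enough algorithms are kept. The function name, the vote threshold, and the metric names below are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def ensemble_causal_edges(graphs: Iterable[Set[Edge]], min_votes: int) -> Set[Edge]:
    """Keep the directed edges reported by at least `min_votes` input graphs."""
    # Each graph votes at most once per edge.
    votes = Counter(edge for graph in graphs for edge in set(graph))
    return {edge for edge, count in votes.items() if count >= min_votes}


# Hypothetical outputs of three causal discovery algorithms run on the same
# metric time series; the metric names are made up for illustration.
graph_a = {("cpu_usage", "latency"), ("latency", "error_rate")}
graph_b = {("cpu_usage", "latency"), ("memory_usage", "latency")}
graph_c = {("cpu_usage", "latency"), ("latency", "error_rate"), ("error_rate", "cpu_usage")}

consensus = ensemble_causal_edges([graph_a, graph_b, graph_c], min_votes=2)
print(sorted(consensus))
```

Raising `min_votes` trades recall for precision: fewer edges survive, but each surviving edge is supported by more algorithms, which matches the abstract's emphasis on confidence when no ground truth is available.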
833	Masks
Nicholas, Jeffrey Francis. 22 April 2016.
No description available.
Keywords: Music; Jeffrey Nicholas; Masks; contemporary music; Pierrot ensemble; BGSU; multiphonics; new music; Mikel Kuehn; Christopher Dietz; thesis; Gerard Grisey; Michel van der Aa; Sebastian Currier; Kaija Saariaho; ensemble 319; Joshua Marquez; EMU; Pinckney, MI
834	Vivre ensemble au quotidien : expérience urbaine des autochtones et des non-autochtones à l’ère du vivir bien à El Alto et La Paz en Bolivie
Paquet, Marie-Ève. 26 August 2020.
Title from the title screen (viewed 21 August 2020).
In Bolivia, the election of Indigenous president Evo Morales in 2005 and the constitutional reform of 2009, which incorporated the ancestral concept of vivir bien (living well), have been the subject of much attention in recent years. While most works focus primarily on the theoretical contribution of the living well concept, this thesis seeks to enrich the understanding of its practical and local dimensions. This thesis examines the urban experience of Indigenous people, mainly Aymaras, and non-Indigenous people in their pursuit of living well in La Paz and El Alto, Bolivia. The analysis focuses on the negotiation of identities by highlighting the different dimensions (political, economic, social, cultural, and artistic) of the everyday lives of Indigenous and non-Indigenous people alike. More specifically, this thesis explores the affirmation strategies put forward to feel good, including the creation of networks, the preservation of ritual practices, and participation in various cultural and artistic events, including the entrada folclórica universitaria, a folk festival in which students take part.
Keywords: Aymaras; living well (vivir bien); identity; authenticity; urban anthropology; fiesta; dance; culture; La Paz; El Alto; Bolivia.
Subjects: Living together -- Bolivia -- La Paz. Living together -- Bolivia -- El Alto. Aymara Indians -- Bolivia. Multiculturalism -- Bolivia. Urban anthropology -- Bolivia.
835	Exploring contributions to opera by The Black Tie Ensemble : a historical case study / Antoinette Johanna Olivier
Olivier, Antoinette Johanna. January 2014.
This dissertation explores the contribution to opera in South Africa by The Black Tie Ensemble. The research follows a qualitative design: it is a historical case study conducted from an interpretivist philosophical perspective. Data were collected through interviews with prominent role-players in The Black Tie Ensemble and from various newspaper and magazine articles. Specific themes crystallized from the data: performance and training opportunities flourished during the twelve years of this unique programme’s existence; the development of singers and sponsorship of the arts contributed significantly to the success or failure of this phenomenon; and outreach programmes introduced the genre to the broader community. Recommendations from this study could guide the planning of sponsorships for similar programmes in the future and indicate the need for more training facilities throughout the country where young singers can also gain performance experience in a theatre. Such training and experience could ensure a future career in singing and hence job creation.
MA (Musicology), North-West University, Potchefstroom Campus, 2015
Keywords: The Black Tie Ensemble; South African opera; Outreach programme; Opera development; Historical case study; Sponsorship; Job creation; Performance
836	Expert team theory and goal oriented rehearsal strategies for a new music ensemble : a case study / Pieter Andreas Oosthuizen
Oosthuizen, Pieter Andreas. January 2014.
The purpose of this intrinsic case study was to show how Expert Team Theory can explain the application of goal-orientated rehearsal strategies, designed for this study, to an ad hoc ensemble at the School of Music of the North-West University, Potchefstroom, South Africa. The case study was considered the most suitable research method for investigating how goal-orientated rehearsal strategies influence dynamics during rehearsals of a new music ensemble, and how the members experience their interaction, because this approach allowed me to investigate these strategies in a real-world environment. This study was born out of an interest in rehearsal strategies and in different ways to structure music rehearsals. The characteristics of a new music ensemble determined the use of Expert Team Theory as the theoretical basis for the design of the goal-orientated rehearsal strategies. These characteristics correspond well with those of an expert team as “a set of interdependent team members, each of whom possesses unique and expert-level knowledge, skills, and experience related to task performance, and who adapt, coordinate, and cooperate as a team, thereby producing sustainable and repeatable team functioning at superior or at least near-optimal levels of performance” (Salas et al., 2006:439-440). Based on interviews with the participants and observations of video recordings of the rehearsals, the results show that interpreting the data through the theoretical lens of Expert Team Theory enabled me to explain the rehearsal process as a dynamic confluence of experiences created through the interaction of the ensemble members, who grew through increasing cooperation and coordination to resemble an expert team.
Their sense of collectiveness and their trust, coupled with strong leadership, allowed the strategy of prebrief-performance-debrief to succeed. The ensemble developed progressively clearer shared mental models and understandings of roles and responsibilities. A clear, valued and shared vision helped them to manage and optimize performance outcomes. The findings are also interrogated in terms of cooperative learning to further explain the web-like way in which different themes developed. This led to a discussion of the limitations of this study and suggestions for further research.
MA (Performance), North-West University, Potchefstroom Campus, 2015
Keywords: New music; Ensemble; Rehearsal strategies; Expert teams; Team adaptability and decision-making; Shared cognition; Team leadership; Collective efficacy; Cooperative learning
839	SYMPHONY FOR WIND ORCHESTRA BY LUIS SERRANO ALARCÓN: BACKGROUND, ANALYSIS, AND CONDUCTOR’S GUIDE
Goodwin, Donald F. 01 January 2016.
Born in 1972, Luis Serrano Alarcón has, in a very short period of time, established himself as one of Spain’s most prominent composers. His works are constantly being performed, not only in his home country but throughout the world. While some of his compositions retain the rhythmic, harmonic, and melodic style typical of Spanish music, many of the works sound as if they were born of the Viennese symphonic tradition, both of the time of Haydn and Beethoven and of the time of Arnold Schoenberg. As a young boy Alarcón took up piano lessons with a local teacher by the name of Javier Barranco. Through him, Alarcón learned “the music for piano of the great masters of Classicism, Romanticism, and Spanish Nationalism.” In addition he began to study with two other teachers: Jose Cervera Collado and Jose Maria Cervara Lloret. With Collado, Alarcón studied conducting, and with Lloret he studied harmony. As a result of all of this training, Alarcón was drawn toward the symphonic music of the Classical and Romantic periods, especially gravitating toward the music of Beethoven and Brahms. Alarcón’s compositional style has maintained a chameleon-like flexibility: he is able to change styles from one composition to the next with litheness and grace, showing a strong grasp of American jazz as well as the flamenco music of his native country in Duende, capturing the sounds of tango from Argentina in Concertango, and, of course, in the many examples of his paso dobles. Unlike many of his contemporaries, though, Alarcón’s unique voice seems to emerge through any style he is embracing or any combination of instruments in his orchestration. In terms of style, Symphony for Wind Orchestra (2012) is an entirely different type of composition.
It is immediately apparent from the opening tutti strikes that (like Mozart and many other traditional composers before and after) Alarcón is embracing a traditional symphonic style in this composition by using one of the most common symphonic topoi. Symphony for Wind Orchestra is an amazing study of the Classical symphony from its earliest beginnings in Mannheim, to its codification at the hands of Haydn, Mozart, and Beethoven, and to its explosion in size and scope in the late nineteenth and early twentieth centuries with composers like Brahms, Bruckner, and Mahler. Perhaps more important, though, is his choice of harmonic language and compositional approach. The work is decidedly based upon thematic material that is reminiscent of the Second Viennese School: atonal at times, semi-tonal at others, but consistently manipulated through the operations (transposition, inversion, retrograde, verticalization, and serialization) that were made popular by Arnold Schoenberg, his students, and those who followed them. The genesis of this composition was a consortium of band directors from the Southeastern Conference Band Association, led initially by Tom Verrier, who is Senior Band Conductor and Director of Wind Ensembles at Vanderbilt University. Dr. John Cody Birdwell was a part of the consortium from its onset, but didn’t initially plan on conducting the premiere at his school (the University of Kentucky). Birdwell stated, “...the opportunity to premiere the work sort of ‘landed in our lap.’ I had heard some of Alarcón’s other compositions in recent years, and I knew that this piece was going to be fantastic, so we moved forward without any hesitation.” With so much positive feedback regarding the work, this document is certainly justified.
The goals of this study are to provide some background for the work and its composer, to analyze the work while providing examples of all of its main themes and important figures, and, where appropriate, to show how they relate to each other. This document also provides a performance guide for conductors, which should facilitate and contribute to many more performances of this significant work in the future. Along with the harmonic and thematic analysis of the work, this document includes interviews with the composer; the conductor of the premiere of the work (Dr. John Cody Birdwell); one of the early and staunch supporters of Alarcón’s works (Dr. Tim Reynish); and Javier Enguidanos Morató, another Spanish conductor who recently performed the work.
Keywords: Band; Symphony for Wind Orchestra; Luis Serrano Alarcon; Donald Goodwin; Wind Ensemble; Composition. Subjects: Composition; Music Education; Music Theory; Other Music.
840	Σχεδιασμός, υλοποίηση και εφαρμογή μεθόδων υπολογιστικής νοημοσύνης για την πρόβλεψη παθογόνων μονονουκλεοτιδικών πολυμορφισμών Ραπακούλια, Τρισεύγενη 11 October 2013 (has links) Η πιο απλή μορφή γενετικής διαφοροποίησης στον άνθρωπο είναι οι μονονουκλεοτιδικοί πολυμορφισμοί (Single Nucleotide Polymorphisms - SNPs). Ο αριθμός αυτού του είδους πολυμορφισμών που έχουν βρεθεί στο ανθρώπινο γονιδίωμα και επηρεάζουν την παραγόμενη πρωτεΐνη αυξάνεται συνεχώς, αλλά η αντιστοίχηση τους σε πιθανές ασθένειες με πειραματικές μεθόδους είναι ασύμφορη από θέμα χρόνου και κόστους. Για αυτό τον λόγο έχουν αναπτυχθεί διάφορες υπολογιστικές μέθοδοι με σκοπό να ταξινομήσουν τους μονονουκλεοτιδικούς πολυμορφισμούς σε παθογόνους και μη. Οι περισσότερες από αυτές τις μεθόδους χρησιμοποιούν ταξινομητές, οι οποίοι παίρνοντας σαν είσοδο ένα σύνολο δομικών, λειτουργικών, ακολουθιακών και εξελικτικών χαρακτηριστικών, επιχειρούν να προβλέψουν αν ένας μονονουκλεοτιδικός πολυμορφισμός είναι παθογόνος ή μη. Για την εκπαίδευση αυτών των ταξινομητών, χρησιμοποιούνται δύο σύνολα μονονουκλεοτιδικών πολυμορφισμών. Το πρώτο αποτελείται από μονονουκλεοτιδικούς πολυμορφισμούς που έχει βρεθεί πειραματικά ότι οδηγούν σε παθογένεια και το δεύτερο από μονονουκλεοτιδικούς πολυμορφισμούς που έχει αποδειχθεί πειραματικά ότι είναι αδρανείς. Οι μέθοδοι αυτές διαφέρουν στα χαρακτηριστικά των μεταλλάξεων που λαμβάνουν υπόψη στην πρόβλεψη τους, καθώς επίσης και στην εκπαίδευση και τη φύση των τεχνικών ταξινόμησης, που χρησιμοποιούν για τη λήψη των αποφάσεων. Το βασικότερο προβλήματα τους ωστόσο έγκειται στο γεγονός ότι καθορίζουν τα χαρακτηριστικά, που θα χρησιμοποιήσουν σαν είσοδο στους ταξινομητές τους με τρόπο εμπειρικό και μάλιστα διαφορετικές μέθοδοι προτείνουν και χρησιμοποιούν διαφορετικά χαρακτηριστικά, χωρίς να τεκμηριώνουν επαρκώς τις αιτίες αυτής της διαφοροποίησης. 
Δύο ακόμα προβλήματα που δεν έχουν καταφέρει να αντιμετωπίσουν οι υπάρχουσες μεθοδολογίες είναι το πρόβλημα της ανισορροπίας των δύο κλάσεων ταξινόμησης και των ελλιπών τιμών σε πολλά από τα χαρακτηριστικά εισόδου των ταξινομητών, ώστε να επιτυγχάνουν πιο ακριβή και αξιόπιστα αποτελέσματα. Από τα παραπάνω είναι ξεκάθαρο πως υπάρχει μεγάλο περιθώριο βελτίωσης των υπάρχουσων μεθοδολογιών για το συγκεκριμένο πρόβλημα ταξινόμησης. Στην παρούσα διπλωματική εργασία προτείνουμε μια νέα υβριδική μεθοδολογία υπολογιστικής νοημοσύνης, που ξεπερνά πολλά από τα προβλήματα των υπάρχοντων μεθοδολογιών και βελτιώνει με τον τρόπο αυτό την απόδοσή τους. Δύο είναι τα βασικά βήματα που ακολουθήσαμε για την επίτευξη του στόχου αυτού. Πρώτον, συγκεντρώσαμε από τις διαθέσιμες δημόσιες βάσεις δεδομένων, τους μονονουκλεοτιδικούς πολυμορφισμούς που χρησιμοποιήθηκαν για την εκπαίδευση και τον έλεγχο των μοντέλων μηχανικής μάθησης. Συγκεκριμένα, συλλέχθησαν και φιλτραρίστηκαν τα θετικά και αρνητικά σύνολα εκπαίδευσης και ελέγχου, που αποτελούνται από μονονουκλεοτιδικούς πολυμορφισμούς που είτε οδηγούν σε παθογένεια, είτε είναι ουδέτεροι. Για κάθε πολυμορφισμό των δύο συνόλων υπολογίσαμε χρησιμοποιώντας υπάρχοντα διαθέσιμα εργαλεία όσο το δυνατό περισσότερα δομικά, λειτουργικά, ακολουθιακά και εξελικτικά χαρακτηριστικά. Για εκείνα τα χαρακτηριστικά, για τα οποία δεν υπήρχε κάποιο διαθέσιμο εργαλείο υπολογισμού τους, υλοποιήσαμε τον κατάλληλο κώδικα για τον υπολογισμό τους. Το δεύτερο βήμα της διπλωματικής αφορούσε το σχεδιασμό και την υλοποίηση της κατάλληλης υβριδικής μεθόδου για την επίλυση του προβλήματος που μελετάμε. Χρησιμοποιήσαμε μια νέα μέθοδο ταξινόμησης την EnsembleGASVR. Πρόκειται για μια ensemble μεθοδολογία, που συνδυάζει σε ένα ενιαίο πλαίσιο ταξινόμησης οκτώ διαφορετικούς ταξινομητές. Κάθε ένας από αυτούς τους ταξινομητές βασίζεται στον υβριδικό συνδυασμό των Γενετικών Αλγορίθμων και των μοντέλων Παλινδρόμησης Διανυσμάτων Υποστήριξης (nu-Support Vector Regression). 
Specifically, an Adaptive Genetic Algorithm is used to determine the optimal subset of features as well as the optimal values of the classifiers' parameters. As the method for classifying mutations as neutral or pathogenic we propose the nu-SVR classifier, since it shows high performance and good generalization, does not get trapped in local optima, and at the same time achieves a balance between the accuracy and the complexity of the model. Moreover, to overcome the problems of missing values and of imbalance between the two classification classes, and also to improve the overall performance of our methodology, we extended the hybrid algorithm to work as an ensemble technique that combines eight individual classification models. The experimental results of the proposed methodology were extremely promising, as the EnsembleGASVR methodology significantly outperforms other widely known methods for classifying pathogenic mutations. / Single Nucleotide Polymorphisms (SNPs) are the most common form of genetic variation in humans. The number of SNPs that have been found in the human genome and that affect protein functionality is constantly increasing. Finding matches between SNPs and diseases using experimental techniques is prohibitively expensive in terms of time and cost. For this reason, several computational methods have been developed. These methods classify polymorphisms as pathogenic or non-pathogenic. Most of them use classifiers, which take as input a set of structural, functional, sequential and evolutionary features and predict whether a single nucleotide polymorphism is pathogenic or neutral. These classifiers are trained on two sets of SNPs. The first consists of SNPs that have been experimentally proven to be pathogenic, whereas the second consists of SNPs that have been experimentally characterized as benign. 
These methods differ in the classification methods they deploy and in the features they use as inputs. However, the main problem is that the set of features used for training is determined empirically. Specifically, different methods suggest different feature sets without adequately documenting the reasons for this differentiation. In addition, the existing methodologies do not effectively tackle the class imbalance between the positive and negative training sets or the problem of missing values in the datasets. In this thesis a new hybrid computational intelligence methodology is proposed that overcomes many of the problems of existing methodologies. The proposed method achieves high classification performance and systematizes the selection of relevant features. In the first phase of this study, the polymorphisms were gathered from the available public databases and used for training and testing the machine learning models. Specifically, the positive and negative training and test sets were collected and filtered. They consist of single nucleotide polymorphisms that either lead to pathogenesis or are neutral. For each polymorphism of the two sets, a wide range of structural, functional, sequential and evolutionary features was calculated using existing available tools. For those features for which no tool was available, suitable code was developed to compute them. In the second step, a new embedded hybrid classification method called EnsembleGASVR was designed and implemented. The method uses an ensemble methodology based on a hybrid combination of Genetic Algorithms and nu-Support Vector Regression (nu-SVR) models. An Adaptive Genetic Algorithm is used to determine the optimal subset of features and the optimal values of the classifiers' parameters. 
We propose the nu-SVR classifier because it exhibits high performance and good generalization ability, is not trapped in local optima, and achieves a balance between the accuracy and the complexity of the model. To overcome the problems of missing values and class imbalance, we extended the above algorithm to function as a collective ensemble technique combining eight individual classification models. Overall, the method achieves 87.45% accuracy, 71.78% sensitivity and 93.16% specificity. These preliminary results are very promising and show that the EnsembleGASVR methodology significantly outperforms other well-known classification methods for pathogenic mutations. Machine learning Genetic algorithms 616.042 Pathogenic mutations Single Nucleotide Polymorphisms (SNPs) Ensemble methods Support vector regression
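
The abstract above describes using a genetic algorithm to choose the feature subset fed to each classifier. A minimal sketch of that general idea follows, under loud assumptions: it is a plain (not adaptive) genetic algorithm, it searches over feature masks only and not over classifier parameters, and `toy_fitness` is a hypothetical stand-in for whatever validation score a trained nu-SVR model would provide. None of the names or settings below come from the thesis.

```python
import random


def toy_fitness(mask):
    """Hypothetical placeholder score, NOT the thesis's fitness function.

    Rewards selecting features 0, 2 and 5 (assumed informative for this
    illustration only) and slightly penalises larger subsets.
    """
    informative = {0, 2, 5}
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in informative)
    return hits - 0.1 * sum(mask)


def tournament(population, fitness, rng):
    """Draw two individuals at random and return the fitter one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def evolve_feature_mask(fitness, n_features, pop_size=20, generations=30,
                        crossover_rate=0.9, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximises `fitness`."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: the current best always survives
        while len(next_gen) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: each gene flips with a small probability.
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


if __name__ == "__main__":
    mask = evolve_feature_mask(toy_fitness, n_features=8)
    print(mask, toy_fitness(mask))
```

In a realistic setting the fitness function would train and validate a model on the masked feature set, which is far more expensive than this toy score; the elitism step simply guarantees that the best mask found so far is never lost between generations.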
