211

Σχεδιασμός παθητικών αρμονικών φίλτρων / Design of passive harmonic filters

Ασιλιάν, Ορέστης 15 April 2013 (has links)
The rapid development of power electronics in recent years and their use in a variety of industrial applications has led to the problem of higher harmonics. Higher harmonics cause many problems in electric power systems and their components. For this reason, passive filters are installed to reduce or even eliminate the effects of harmonics. The design of passive harmonic filters can be based on various criteria. This thesis examines the design of passive harmonic filters using genetic algorithms, with the aim of maximizing the power factor. The power factor is maximized by increasing the useful power (the power of the fundamental harmonic), decreasing the apparent power, and eliminating the power of the higher harmonics (losses). More specifically: The first chapter introduces the problem of higher harmonics and their effects on various parts of electric power system equipment, and briefly states the purpose of harmonic filters and of this thesis. The second chapter covers passive harmonic filters in detail, including their types, sizing, cost, design, and protection; it also reviews the harmonic limits set by IEEE Standard 519-1992. The third chapter is written to serve as a handbook of genetic algorithms: it begins with historical background and general concepts, extends to a wide range of applications, explains the capabilities of Matlab's GA toolbox, and outlines the steps to follow in order to use genetic algorithms as an optimization tool. The fourth chapter gives a detailed description of the system and its data, derives the mathematical model used, and specifies the quantities to be calculated; it then presents the results obtained from the genetic algorithms, for both the harmonic voltage source case and the harmonic current source case, so that the power factor is maximized. Finally, the fifth chapter draws useful conclusions from the design of passive harmonic filters using genetic algorithms to maximize the power factor, and closes with some thoughts on future work on passive harmonic filter design.
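To make the optimization concrete, here is a minimal sketch of a genetic algorithm maximizing the true power factor of a toy single-tuned shunt filter. Everything in it — the system model, the harmonic spectrum, the parameter bounds — is an illustrative assumption, not data from the thesis (which uses Matlab's GA toolbox and a full system model):

```python
import numpy as np

rng = np.random.default_rng(0)

F1 = 50.0                          # fundamental frequency in Hz (assumed)
W1 = 2 * np.pi * F1
I_H = {5: 0.20, 7: 0.14, 11: 0.09} # harmonic load currents, p.u. (assumed)
I1, DPF = 1.0, 0.85                # fundamental current, displacement factor
LS = 1e-3                          # source inductance in H (assumed)

def power_factor(params):
    """True power factor seen by the source when a shunt single-tuned
    filter (R, L, C in series) diverts part of the harmonic currents."""
    R, L, C = params
    thd2 = 0.0
    for h, ih in I_H.items():
        w = h * W1
        z_f = complex(R, w * L - 1.0 / (w * C))   # filter branch impedance
        z_s = complex(0.0, w * LS)                # source branch impedance
        share = abs(z_f / (z_f + z_s))            # harmonic share reaching source
        thd2 += (ih * share) ** 2
    # true PF = displacement factor reduced by the residual distortion
    return DPF / np.sqrt(1.0 + thd2 / I1**2)

def ga(pop_size=40, gens=60):
    lo = np.array([0.1, 1e-4, 1e-6])              # bounds: R (ohm), L (H), C (F)
    hi = np.array([5.0, 5e-2, 5e-4])
    pop = rng.uniform(lo, hi, size=(pop_size, 3))
    for _ in range(gens):
        fit = np.array([power_factor(p) for p in pop])
        parents = pop[np.argsort(fit)[::-1][: pop_size // 2]]  # truncation selection
        kids = []
        while len(parents) + len(kids) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            mix = rng.uniform(size=3)
            child = mix * a + (1 - mix) * b                    # blend crossover
            child += rng.normal(0.0, 0.02, 3) * (hi - lo)      # Gaussian mutation
            kids.append(np.clip(child, lo, hi))
        pop = np.vstack([parents, kids])
    best = max(pop, key=power_factor)
    return best, power_factor(best)

(R, L, C), pf = ga()
print(f"R={R:.3f} ohm  L={L:.2e} H  C={C:.2e} F  ->  power factor {pf:.4f}")
```

The GA tends to place the filter's series resonance near the dominant harmonic, which raises the true power factor exactly as the abstract describes.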
212

Exploiting Non-Sequence Data in Dynamic Model Learning

Huang, Tzu-Kuo 01 October 2013 (has links)
Virtually all methods of learning dynamic models from data start from the same basic assumption: that the learning algorithm will be provided with one or more sequences of data generated by the dynamic model. However, in quite a few modern time series modeling tasks, the collection of reliable time series data turns out to be a major challenge, due either to the slow progression of the dynamic process of interest or to the inaccessibility of repeated measurements of the same dynamic process over time. In most of these situations, however, we observe that it is easier to collect a large amount of non-sequence samples: random snapshots of the dynamic process of interest without time information. This thesis aims to exploit such non-sequence data in learning several widely used dynamic models, including fully observable linear and nonlinear models as well as Hidden Markov Models (HMMs). For fully observable models, we point out several issues of model identifiability when learning from non-sequence data, and develop EM-type learning algorithms based on maximizing an approximate likelihood. We also consider the setting where a small amount of sequence data is available in addition to non-sequence data, and propose a novel penalized least-squares approach that uses non-sequence data to regularize the model. For HMMs, we draw inspiration from recent advances in spectral learning of latent variable models and propose spectral algorithms that provably recover the model parameters, under reasonable assumptions on the generative process of non-sequence data and on the true model. To the best of our knowledge, this is the first formal guarantee on learning dynamic models from non-sequence data. We also consider the case where little sequence data is available, and propose learning algorithms that, as in the fully observable case, use non-sequence data to provide regularization, but do so in combination with spectral methods. Experiments on synthetic data and several real data sets, including gene expression and cell image time series, demonstrate the effectiveness of our proposed methods. In the last part of the thesis we return to the usual setting of learning from sequence data, and consider learning bi-clustered vector autoregressive models, whose transition matrix is both sparse, revealing significant interactions among variables, and bi-clustered, identifying groups of variables that have similar interactions with other variables. Such structures may aid other learning tasks in the same domain that have abundant non-sequence data, by providing better regularization in our proposed non-sequence methods.
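As a rough illustration of the penalized least-squares idea, the sketch below fits a linear model x_{t+1} = A x_t + noise to a short trajectory while penalizing violation of the stationarity equation Sigma = A Sigma A' + Q implied by non-sequence snapshots. The dimensions, noise model, and penalty form are illustrative assumptions, not the thesis's actual algorithm:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
d, T, N = 3, 8, 500            # state dim, short sequence length, snapshot count
q = 0.1                        # process-noise std (assumed known here)

A_true = 0.5 * np.linalg.qr(rng.normal(size=(d, d)))[0]  # a stable system

def simulate(n):
    x, out = rng.normal(size=d), []
    for _ in range(n):
        out.append(x)
        x = A_true @ x + q * rng.normal(size=d)
    return np.array(out)

X = simulate(T)                     # the scarce sequence data
snaps = simulate(20 * N)[::20]      # non-sequence snapshots (thinned long run)
S = np.cov(snaps.T)                 # empirical stationary covariance

def loss(a_flat, lam):
    A = a_flat.reshape(d, d)
    seq = np.sum((X[1:] - X[:-1] @ A.T) ** 2)          # least squares on pairs
    # stationarity residual: Sigma = A Sigma A' + q^2 I should (nearly) hold
    stat = np.sum((S - A @ S @ A.T - q**2 * np.eye(d)) ** 2)
    return seq + lam * N * stat

for lam in (0.0, 1.0):
    res = minimize(loss, np.zeros(d * d), args=(lam,), method="L-BFGS-B")
    err = np.linalg.norm(res.x.reshape(d, d) - A_true)
    print(f"lambda={lam:.1f}  ||A_hat - A_true||_F = {err:.3f}")
```

With only a handful of transition pairs, the regularized fit (lambda > 0) typically lands closer to the true transition matrix; note that non-sequence data alone would leave A identifiable only up to such symmetries, which is the identifiability issue mentioned above.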
213

Analyse et contrôle optimal d'un bioréacteur de digestion anaérobie / Optimal control and analysis of anaerobic digestion bioreactor

Ghouali, Amel 14 December 2015 (has links)
This thesis focuses on the optimal control of an anaerobic digester for maximizing its biogas production. In particular, using a simple model of the anaerobic digestion process, we derive a control law to maximize the biogas production over a given period of time, using the dilution rate D(.) as the control variable. Depending on initial conditions and constraints on the actuator, the search for a solution to the optimal control problem reveals very different levels of difficulty. In the first part, we consider that there are no severe constraints on the actuator; in particular, the interval in which the input flow rate lives includes the value which allows the biogas production to be maximized at equilibrium. For this case, named WDAC (Well Dimensioned Actuator Case), we solve the optimal control problem using classical tools of differential equation analysis, via a trajectory-comparison argument. Numerical simulations illustrate the robustness of the control law with respect to several parameters, notably the initial conditions. The exact solutions are compared with those obtained by a direct numerical approach using "BOCOP", an open-source toolbox for solving optimal control problems. The performance of the optimal controller is also compared with that of a heuristic control law proposed in the literature; in particular, we show that both laws drive the system to the same optimal point. When the exact analytical solution to the optimal control problem cannot be found, we suggest that such numerical tools can be used to build intuition about optimal solutions. In the second part, the problem of maximizing biogas production is treated when the actuator is under- or over-dimensioned, that is, when the dilution rate that maximizes biogas production at equilibrium lies outside the minimum or maximum value the actuator can deliver. These are the UDAC (Under Dimensioned Actuator Case) and ODAC (Over Dimensioned Actuator Case), which we solve using Pontryagin's Maximum Principle.
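A minimal numerical sketch of the WDAC situation, using a hypothetical one-step chemostat model with Monod growth and made-up parameters (the thesis's model and control synthesis are more general): it locates the steady-state dilution rate D* that maximizes biogas flow, then compares a constant-D* policy against a simple adaptive heuristic of the kind discussed above.

```python
import numpy as np

# Toy one-step chemostat model of anaerobic digestion (assumed parameters):
#   ds/dt = D*(s_in - s) - k*mu(s)*x,   dx/dt = (mu(s) - D)*x
# with Monod growth mu(s) and biogas flow proportional to mu(s)*x.
S_IN, K, KS, MU_MAX = 10.0, 2.0, 1.5, 0.8

def mu(s):
    return MU_MAX * s / (KS + s)

def biogas(policy, s0=8.0, x0=1.0, T=50.0, dt=0.01):
    """Integrate the model under dilution policy(t, s, x); return total biogas."""
    s, x, total = s0, x0, 0.0
    for i in range(int(T / dt)):
        D = policy(i * dt, s, x)
        ds = D * (S_IN - s) - K * mu(s) * x
        dx = (mu(s) - D) * x
        total += mu(s) * x * dt          # biogas flow ~ mu(s)*x
        s, x = s + ds * dt, x + dx * dt
    return total

# Steady-state analysis: at equilibrium D = mu(s*) and x* = (s_in - s*)/k,
# so gas flow = mu(s*)*(s_in - s*)/k; pick the s* maximizing it on a grid.
s_grid = np.linspace(0.01, S_IN - 0.01, 2000)
flow = mu(s_grid) * (S_IN - s_grid) / K
s_star = s_grid[np.argmax(flow)]
D_star = mu(s_star)
print(f"best steady-state dilution D* = {D_star:.3f}")

# Constant optimal-equilibrium policy vs. a simple adaptive heuristic
# that nudges the substrate concentration toward s*.
const = biogas(lambda t, s, x: D_star)
heur  = biogas(lambda t, s, x: mu(s) * (1.05 if s < s_star else 0.95))
print(f"total biogas, constant D*: {const:.2f}")
print(f"total biogas, heuristic:   {heur:.2f}")
```

Both policies settle at the same optimal operating point, mirroring the comparison reported in the abstract.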
214

Heuristic methods for solving two discrete optimization problems

Cabezas García, José Xavier January 2018 (has links)
In this thesis we study two discrete optimization problems: Traffic Light Synchronization and Location with Customers Orderings. A widely used approach to synchronizing traffic lights on transport networks is to maximize the time window during which a car can start at one end of a street and reach the other end without stopping at a red light (bandwidth maximization). The mixed integer linear model found in the literature, named MAXBAND, can be solved by optimization solvers only for small instances. In this manuscript we review in detail all the constraints of the original linear model, including those that describe all the cyclic routes in the graph, and we generalize some bounds on integer variables which so far had been presented only for problems that do not consider cycles. Furthermore, we summarize the first systematic algorithm to solve a simpler version of the problem on a single street. We also propose a solution algorithm that uses Tabu Search and Variable Neighbourhood Search, and we carry out a computational study. In addition, we propose a linear formulation for the shortest path problem with traffic light constraints (SPTL). On the other hand, the simple plant location problem with order (SPLPO) is a variant of the simple plant location problem (SPLP) in which the customers have preferences over the facilities that will serve them. In particular, customers define their preferences by ranking each of the potential facilities. Even though the SPLP has been widely studied in the literature, the SPLPO has received much less attention, and the size of the instances that can be solved is very limited. In this manuscript, we propose a heuristic that uses a Lagrangean relaxation output as the starting point of a semi-Lagrangean relaxation algorithm to find good feasible solutions (often the optimal solution). We also carry out a computational study to illustrate the good performance of our method. Lastly, we introduce the partial and stochastic versions of the SPLPO, apply the Lagrangean algorithm proposed for the deterministic case, and present examples and results.
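For intuition on the Lagrangean starting point mentioned above, here is a generic textbook sketch of Lagrangean relaxation with subgradient ascent for the plain SPLP on random toy data; the semi-Lagrangean refinement and the customer-preference (SPLPO) constraints of the thesis are not shown:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 8, 25                              # candidate facilities, customers
f = rng.uniform(5, 15, size=m)            # opening costs (toy data)
c = rng.uniform(1, 10, size=(m, n))       # assignment costs (toy data)

def lagrangean_lb(lam):
    """Lower bound from relaxing sum_i x_ij = 1 with multipliers lam."""
    red = np.minimum(c - lam, 0.0)        # profitable assignments per facility
    open_val = f + red.sum(axis=1)        # net value of opening each facility
    y = open_val < 0                      # open a facility iff it pays off
    x = (red < 0) & y[:, None]
    return lam.sum() + open_val[y].sum(), x, y

def feasible_cost(y):
    """Upper bound: assign every customer to its cheapest open facility."""
    if not y.any():
        y = y.copy()
        y[np.argmin(f)] = True
    return f[y].sum() + c[y].min(axis=0).sum()

lam = np.full(n, c.mean())                # initial multipliers
best_lb, best_ub = -np.inf, np.inf
for it in range(200):                     # subgradient ascent on the dual
    lb, x, y = lagrangean_lb(lam)
    best_lb, best_ub = max(best_lb, lb), min(best_ub, feasible_cost(y))
    g = 1.0 - x.sum(axis=0)               # violation of each assignment constraint
    if (g == 0).all():
        break
    step = 0.5 * (best_ub - lb) / (g @ g) # Polyak-style step size
    lam = lam + step * g

print(f"lower bound {best_lb:.2f}  <=  best feasible {best_ub:.2f}")
```

The multipliers and the feasible solutions produced this way are exactly the kind of warm start a semi-Lagrangean phase can then tighten.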
215

OPTIMAL GROUP SIZE IN HUMANS: AN EXPERIMENTAL TEST OF THE SIMPLE PER CAPITA MAXIMIZATION MODEL

Klotz, Jared Lee 01 December 2016 (has links)
The current study used two experiments to assess Smith's (1981) simple per capita maximization model, which provides a quantitative framework for predicting optimal group sizes in social foraging contexts. Participants engaged in a social foraging task in which they chose to forage for points exchangeable for lottery prizes either alone or in a group that had agreed to pool and share all resources equally. In Experiment 1, groups ("settlements") of 10 or 12 participants made repeated group membership choices. Settlements were exposed to three conditions in which the optimal group size was 2, 5, or 2 for the 10-person settlement, and 3, 4, or 6 for the 12-person settlement. A linear regression of the data from Experiment 1 revealed a strong relationship between the observed group sizes and the group sizes predicted by the simple per capita maximization model. Experiment 2 was a systematic replication of Experiment 1 in which single participants foraged for shared resources with groups of automated players in a computerized simulation. The automated players' group choices mirrored the group choices of participants in Experiment 1, excluding the data of the best-performing participant; thus, each participant acted essentially in the stead of the best-performing participant for each condition. Two logistic regressions failed to replicate the results of Experiment 1, providing only mixed support for the use of the simple per capita maximization model in predicting group sizes in social foraging contexts.
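The prediction step of the simple per capita maximization model can be illustrated in a few lines: given a group return function G(n), the model predicts the group size that maximizes the per-member payoff G(n)/n. The gain function and its parameters below are made up for illustration:

```python
import numpy as np

# A hypothetical group return function G(n): cooperation helps at first,
# then crowding sets in. All parameters are invented for this example.
def group_return(n, a=10.0, b=1.5):
    return a * n * np.log1p(b * n) - 1.5 * n**2

sizes = np.arange(1, 13)
per_capita = group_return(sizes) / sizes

# Smith's simple per capita maximization model predicts the group size
# that maximizes the per-member payoff.
n_star = sizes[np.argmax(per_capita)]
print("predicted optimal group size:", n_star)
for n, g in zip(sizes, per_capita):
    print(f"n={n:2d}  per-capita return = {g:6.2f}")
```

With these toy parameters the per capita curve peaks at an interior group size (here n = 6), which is the quantity the experiments compare observed group sizes against.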
216

Maximização da utilidade esperada, planejamento tributário e governança corporativa / Maximization of the agent's expected utility, tax avoidance, and corporate governance

Alexandre José Negrini de Mattos 28 June 2017 (has links)
This study examined whether agents' decision-making considers the costs and benefits of tax avoidance, and whether good corporate governance practices reduce managers' engagement in tax avoidance. Additionally, it investigated the relationship between the expected utility (expected value) of tax avoidance and the indebtedness of companies. To measure whether the practice of tax avoidance is related to the maximization of the agent's expected utility (maximization of the benefits generated), a model was developed based on the proposal of Allingham and Sandmo (1972), according to which the practice of tax avoidance follows from an economic analysis of its costs and benefits. The premises used were a 13-year period of administrative and judicial litigation, monetary correction of the tax debt, the cost of debt, and charges of 100% of the tax value (fines, interest, and legal fees). The results were extended to several scenarios: time horizons of 8, 13, and 18 years; charges of 50%, 100%, and 150%; and a dependent variable calculated from the amounts recorded as contingent liabilities (footnotes), probable tax provisions (financial statements), and the sum of both. Furthermore, the analyses were performed in levels (nominal values scaled by total assets) and in logarithms. The sample comprised Brazilian publicly traded companies in the IBrX100 index and covers the period from 2008 to 2015. The empirical analyses confirm that in most cases the expected utility of the agent (expected value) is positive, indicating that the decision to engage in tax avoidance results from the maximization of the agent's expected utility, which may explain the large provisions and contingent liabilities recorded in companies' financial statements and footnotes. In addition, strict corporate governance rules were found to correlate negatively with the agent's expected utility, and can thus be considered a disincentive to tax avoidance. The indebtedness variable also showed a negative correlation with the expected utility (expected value) of tax avoidance. A model for evaluating the expected utility or expected value of tax avoidance can contribute to a better understanding of this phenomenon and to the future design of public policies.
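A stylized version of the expected-value calculation described above, in the spirit of Allingham and Sandmo (1972). Every number below (rates, probability, horizon) is an illustrative assumption, and the exact formula in the thesis may differ:

```python
# Stylized expected value of deferring a tax payment through avoidance.
tax = 100.0          # tax saved today by the avoidance scheme (arbitrary units)
years = 13           # assumed length of the administrative/judicial dispute
kd = 0.12            # firm's annual cost of debt (assumed)
corr = 0.10          # annual rate at which the tax debt is corrected (assumed)
charges = 1.0        # fines + interest + legal fees = 100% of the debt
p_lose = 0.4         # probability the firm ultimately loses (assumed)

# Benefit: the unpaid tax works as cheap financing for `years` years.
benefit = tax * ((1 + kd) ** years - 1)

# Extra cost if the firm loses: corrected debt plus 100% charges, paid at
# the end of the dispute, versus simply paying the tax today.
cost_if_lose = tax * ((1 + corr) ** years * (1 + charges) - 1)

expected_value = benefit - p_lose * cost_if_lose
print(f"benefit of deferral: {benefit:8.1f}")
print(f"extra cost if lost:  {cost_if_lose:8.1f}")
print(f"expected value:      {expected_value:8.1f}")
```

Under these assumptions the expected value comes out positive, which is the mechanism the thesis offers for the large provisions and contingent liabilities observed in practice.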
217

Influence Dynamics on Social Networks

Venkataramanan, Srinivasan January 2014 (has links) (PDF)
With online social networks such as Facebook and Twitter becoming globally popular, there is renewed interest in understanding the structural and dynamical properties of social networks. In this thesis we study several stochastic models arising in the context of the spread of influence or information in social networks. Our objective is to provide compact and accurate quantitative descriptions of the spread processes, to understand the effects of various system parameters, and to design policies for the control of such diffusions. One of the well-established models for influence spread in social networks is the threshold model. An individual's threshold indicates the minimum level of "influence" that must be exerted, by other members of the population engaged in some activity, before the individual will join the activity. We begin with the well-known Linear Threshold (LT) model introduced by Kempe et al. [1]. We analytically characterize the expected influence of a given initial set under the LT model, and provide an equivalent interpretation in terms of acyclic path probabilities in a Markov chain. We derive explicit optimal initial sets for some simple networks and also study the effectiveness of the Pagerank [2] algorithm for the problem of influence maximization. Using insights from our analytical characterization, we then propose a computationally efficient G1-sieving algorithm for influence maximization and show that it performs on par with the greedy algorithm, through experiments on a coauthorship dataset. The Markov chain characterization gives only limited insight into the dynamics of influence spread and the effects of the various parameters. We next provide such insights in a restricted setting, namely a homogeneous version of the LT model with a general threshold distribution, by taking the fluid limit of a probabilistically scaled version of the spread Markov process. We observe that the threshold distribution features in the fluid limit via its hazard function. We study the effect of various threshold distributions and show that the influence evolution can exhibit qualitatively different behaviors, depending on the threshold distribution, even in a homogeneous setting. We show that under the exponential threshold distribution, the LT model becomes equivalent to the SIR (Susceptible-Infected-Recovered) epidemic model [3]. We also show how our approach is easily amenable to networks with heterogeneous community structures. Hundreds of millions of people today interact with social networks via their mobile devices. If the peer-to-peer radios on such devices are used, then influence and information can spread opportunistically when pairs of such devices come into proximity. In this context, we develop a framework for content delivery in mobile opportunistic networks with joint evolution of content popularity and availability. We model the evolution of influence and content spread using a multi-layer controlled epidemic model and, using the monotonicity properties of the ODEs, prove that a time-threshold policy for copying to relay nodes is delay-cost optimal. Information spread seldom occurs in isolation on online social networks. Several contents might spread simultaneously, competing for the common resource of user attention. Hence, we turn our attention to the study of competition between content creators for a common population, across multiple social networks, as a non-cooperative game.
We characterize the best response function and observe that it has a threshold structure. We obtain the Nash equilibria and study the effect of cost parameters on the equilibrium budget allocation of the content creators. Another key aspect of capturing competition between contents is to understand how a single end-user receives and processes content. Most social networks' interfaces involve a timeline, a reverse-chronological list of contents displayed to the user, similar to an email inbox. We study competition between content creators for visibility on a social network user's timeline. We study a non-cooperative game among content creators over timelines of fixed size, and show that the equilibrium rate of operation in a symmetric setting exhibits non-monotonic behavior as the number of players increases. We then consider timelines of infinite size, along with a behavioral model of the user's scanning behavior, while also accounting for variability in quality (influence weight) among content creators. We obtain integral equations that capture the evolution of the average influence of competing contents on a social network user's timeline, and study various content-competition formulations involving quality and quantity.
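For concreteness, the sketch below simulates the Linear Threshold model on a random weighted digraph and runs the greedy influence-maximization baseline mentioned above. The graph, weights, seed-set size, and run counts are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
# Random weighted digraph; incoming weights are normalized to sum to at
# most 1, as the Linear Threshold model of Kempe et al. requires.
W = rng.uniform(size=(n, n)) * (rng.uniform(size=(n, n)) < 0.1)
np.fill_diagonal(W, 0.0)
col = W.sum(axis=0)
W = W / np.maximum(col, 1.0)

def lt_spread(seeds, runs=200):
    """Monte Carlo estimate of the expected influence of a seed set."""
    total = 0
    for _ in range(runs):
        theta = rng.uniform(size=n)               # random node thresholds
        active = np.zeros(n, dtype=bool)
        active[list(seeds)] = True
        while True:
            pressure = W.T @ active               # incoming active weight
            newly = (~active) & (pressure >= theta)
            if not newly.any():
                break
            active |= newly
        total += active.sum()
    return total / runs

# Greedy influence maximization: repeatedly add the node with the largest
# estimated marginal gain (the greedy baseline named in the abstract).
seeds = set()
for _ in range(3):
    gains = {v: lt_spread(seeds | {v}) for v in range(n) if v not in seeds}
    best = max(gains, key=gains.get)
    seeds.add(best)
    print(f"added node {best}, estimated spread = {gains[best]:.1f}")
```

Averaging over random thresholds is what turns the deterministic cascade into the expected-influence objective that both the greedy algorithm and the sieving algorithm above optimize.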
218

Online, Submodular, and Polynomial Optimization with Discrete Structures / オンライン最適化,劣モジュラ関数最大化,および多項式関数最適化に対する離散構造に基づいたアルゴリズムの研究

Sakaue, Shinsaku 23 March 2020 (has links)
Kyoto University / 0048 / New doctoral program / Doctor of Informatics / Degree No. Kō 22588 / Informatics Doctorate No. 725 / 新制||情||124 (University Library) / Department of Communications and Computer Engineering, Graduate School of Informatics, Kyoto University / (Examination committee) Prof. Shin-ichi Minato (chair), Prof. Atsushi Igarashi, Prof. Akihiro Yamamoto / Qualified under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
219

Using Primary Dynamic Factor Analysis on repeated cross-sectional surveys with binary responses / Primär Dynamisk Faktoranalys för upprepade tvärsnittsundersökningar med binära svar

Edenheim, Arvid January 2020 (has links)
With the growing popularity of business analytics, companies experience an increasing need for reliable data. Although the availability of behavioural data showing what consumers do has increased, access to data showing consumer mentality, what consumers actually think, remains heavily dependent on tracking surveys. This thesis investigates the performance of a Dynamic Factor Model using respondent-level data gathered through repeated cross-sectional surveys. Through Monte Carlo simulations, the model was shown to improve the accuracy of brand tracking estimates by double-digit percentages, or equivalently to reduce the required amount of data by more than a factor of 2 while maintaining the same level of accuracy. Furthermore, the study showed clear indications that even greater performance benefits are possible.
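The kind of gain reported above can be reproduced qualitatively with a toy simulation: a latent AR(1) "brand strength" drives binary answers from fresh respondents each wave, and a one-dimensional Kalman filter on the logit scale (a crude stand-in for the thesis's dynamic factor model, with assumed parameters throughout) is compared against raw per-wave proportions:

```python
import numpy as np

rng = np.random.default_rng(4)
T, n = 52, 150                 # weekly waves, respondents per wave (assumed)
phi, q = 0.98, 0.05            # AR(1) persistence and innovation std (assumed)

# Latent brand strength follows an AR(1); each wave draws fresh respondents
# answering a yes/no question with P(yes) = sigmoid(alpha_t).
alpha = np.zeros(T)
for t in range(1, T):
    alpha[t] = phi * alpha[t - 1] + q * rng.normal()
p_true = 1 / (1 + np.exp(-alpha))
yes = rng.binomial(n, p_true)

# Raw tracking estimate: the per-wave sample proportion.
p_raw = yes / n

# One-dimensional Kalman filter on the logit scale.
z = np.log((yes + 0.5) / (n - yes + 0.5))      # empirical logits
r = 1 / (p_raw * (1 - p_raw) * n + 1e-9)       # approx. logit sampling variance
m, v, p_kf = 0.0, 1.0, np.zeros(T)
for t in range(T):
    m, v = phi * m, phi**2 * v + q**2          # predict
    k = v / (v + r[t])                         # Kalman gain
    m, v = m + k * (z[t] - m), (1 - k) * v     # update
    p_kf[t] = 1 / (1 + np.exp(-m))

rmse = lambda p: np.sqrt(np.mean((p - p_true) ** 2))
print(f"RMSE raw proportions: {rmse(p_raw):.4f}")
print(f"RMSE filtered:        {rmse(p_kf):.4f}")
```

Because the latent series is highly persistent, pooling information across waves cuts the tracking error substantially, which is the mechanism behind the double-digit accuracy improvements cited above.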
220

Improving the speed and quality of an Adverse Event cluster analysis with Stepwise Expectation Maximization and Community Detection

Erlanson, Nils January 2020 (has links)
Adverse drug reactions are unwanted effects alongside the intended benefit of a drug and may be responsible for 3-7% of hospitalizations. Finding such reactions is partly done by analysing individual case safety reports (ICSRs) of adverse events. The reports consist of categorical terms that describe the event. Data-driven identification of suspected adverse drug reactions using this data typically considers single adverse event terms, one at a time. This single-term approach narrows the identification of reports, and information in the reports is ignored during the search. If one instead assumes that each report is connected to a topic, then clustering the reports connected to a topic would identify more reports, and the topics themselves would provide more context. This thesis takes place at Uppsala Monitoring Centre, which has implemented a probabilistic model of how an ICSR, and its topic, is assumed to be generated. The parameters of the model are estimated with expectation maximization (EM), which also assigns the reports to clusters. The clusters are improved with consensus clustering, which identifies groups of reports that tend to be grouped together across several runs of EM. Additionally, in order not to cluster outlying reports, all clusters below a certain size are excluded. The objective of the thesis is to improve the algorithm in terms of computational efficiency and quality, as measured by stability and clinical coherence. The convergence of EM is improved using stepwise EM, which resulted in a speedup factor of at least 1.4 and a decrease in computational complexity. With all the speed improvements, the speedup factor of the entire algorithm can reach 2, but it is constrained by the size of the data. To improve the clusters' quality, the community detection algorithm Leiden is used. It improves stability, with the added benefit of increasing the number of clustered reports. The clinical coherence score performs worse with Leiden. There are nevertheless good reasons to further investigate the benefits of Leiden, as community detection appeared to identify clusters at a finer resolution that still looked clinically coherent in a post hoc analysis.
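As an illustration of the stepwise-EM idea behind the speedup, here is a sketch for a Bernoulli mixture over synthetic "reports" (binary term vectors): running sufficient statistics are interpolated toward each mini-batch's statistics with a decaying step size, rather than recomputed over the full data each iteration. The model and all parameters are illustrative, not Uppsala Monitoring Centre's actual model:

```python
import numpy as np

rng = np.random.default_rng(5)
N, V, K = 2000, 30, 3          # reports, vocabulary of event terms, clusters

# Synthetic reports: binary term vectors drawn from K hidden topics.
true_prof = rng.uniform(0.02, 0.4, size=(K, V))
zs = rng.integers(K, size=N)
X = (rng.uniform(size=(N, V)) < true_prof[zs]).astype(float)

# Stepwise EM (Cappe & Moulines style) for a Bernoulli mixture.
pi = np.full(K, 1.0 / K)
prof = rng.uniform(0.1, 0.3, size=(K, V))
s_pi, s_prof = pi * N, prof * (N / K)          # initial sufficient statistics
batch, decay = 100, 0.7
for k, start in enumerate(list(range(0, N, batch)) * 5):   # 5 epochs
    xb = X[start:start + batch]
    # E-step on the mini-batch only: posterior responsibilities.
    logp = (xb @ np.log(prof.T) + (1 - xb) @ np.log(1 - prof.T)
            + np.log(pi))
    logp -= logp.max(axis=1, keepdims=True)
    resp = np.exp(logp)
    resp /= resp.sum(axis=1, keepdims=True)
    # Stepwise M-step: blend old statistics with the mini-batch estimate.
    eta = (k + 2) ** -decay
    s_pi = (1 - eta) * s_pi + eta * resp.sum(axis=0) * (N / batch)
    s_prof = (1 - eta) * s_prof + eta * (resp.T @ xb) * (N / batch)
    pi = s_pi / s_pi.sum()
    prof = np.clip(s_prof / s_pi[:, None], 1e-4, 1 - 1e-4)

print("estimated mixing weights:", np.round(pi, 3))
```

Because parameters are refreshed after every mini-batch instead of every full pass, the likelihood typically climbs in far fewer passes over the data, which is the source of the reported speedup.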
