Dependency Syntax in the Automatic Detection of Irony and Stance

Cignarella, Alessandra Teresa 29 November 2021
[EN] This thesis belongs to the broad field of Natural Language Processing (NLP). More specifically, it is a work of Computational Linguistics (CL) that studies in depth the contribution of syntax to sentiment analysis and, therefore, deals with texts extracted from social media or, more generally, online content. Furthermore, given the recent interest of the scientific community in the Universal Dependencies (UD) project, which proposes a morphosyntactic annotation format aimed at creating a "universal" representation of morphological and syntactic phenomena across many languages, this work adopts that format with a view to a multilingual study (Italian, English, French and Spanish). The thesis provides an exhaustive presentation of the UD morphosyntactic annotation format, highlighting the most relevant issues in applying it to user-generated content (UGC). Two tasks are presented and used as case studies to test the research hypotheses: the first is automatic irony detection and the second is stance detection. In both cases, historical notes provide context for the reader, the problems addressed are introduced, and the activities proposed in the computational linguistics community are described. Particular attention is paid to the resources currently available as well as to those developed specifically for the study of these phenomena. Finally, through the description of a series of experiments, carried out both within evaluation campaigns and in independent studies, I try to describe the contribution that syntax can make to solving these tasks. 
This thesis is a revised collection of the work of my three-year PhD and belongs to the growing body of studies devoted to making Artificial Intelligence results more explainable, going beyond the pursuit of the highest scores on a task and instead making the reasons behind the results understandable to domain experts. The novel contribution of this work lies mainly in the exploitation of features based on morphology and dependency syntax, which were used to create vector representations of social media texts in various languages and for two different tasks. These features were then paired with a variety of machine learning classifiers, with some neural networks and with the language model BERT. Results suggest that fine-grained dependency-based syntactic information is highly informative for irony detection and less informative for stance detection. Nonetheless, dependency syntax might still prove useful for stance detection if irony detection is first applied as a preprocessing step. I also believe that the dependency syntax approach I propose could shed some light on the explainability of a difficult pragmatic phenomenon such as irony. / [ES] This thesis falls within the broad panorama of studies related to Natural Language Processing (NLP). Specifically, it is a work of Computational Linguistics (CL) whose main objective is to study in depth the contribution of syntax to the field of sentiment analysis, applied in particular to texts extracted from social networks or, more generally, from online content. 
Furthermore, given the recent interest of the scientific community in the Universal Dependencies (UD) project, which proposes a morphosyntactic annotation format intended to create a "universal" representation of morphology and syntax applicable to different languages, this work uses that format in order to carry out a study from a multilingual perspective (Italian, English, French and Spanish). The work presents an exhaustive description of the UD morphosyntactic annotation format, highlighting in particular the most relevant issues concerning its application to UGC generated on social networks. The ultimate goal is to analyze and verify whether these morphosyntactic annotations yield useful information for irony detection and stance detection models. Two tasks are presented and used as case studies to test the research hypotheses: the first concerns automatic irony detection and the second concerns stance detection. In both cases, the background and related work are provided, together with historical notes that can serve as context for the reader; the problems encountered are set out; and the various activities proposed in the computational linguistics community to solve these problems are described. Special attention is paid to the resources currently available, as well as to those developed specifically for the study of the aforementioned phenomena. Finally, through the description of a series of experiments, carried out both in evaluation campaigns and in independent studies, the contribution that syntax can make to solving these tasks is described. 
This thesis is the result of all the research I carried out during my doctorate, a revised collection of my doctoral work over the last three and a half years, and it belongs to the growing trend of studies devoted to making Artificial Intelligence results more explainable, going beyond the achievement of higher scores on tasks and instead making the reasons behind the results, and the processes themselves, more understandable to domain experts. The main and most novel contribution of this work is the exploitation of features based on morphology and dependency syntax, which were used to create vector representations of texts from social networks in several languages and for two different tasks. These features were then combined with a variety of machine learning classifiers, with some neural networks and also with the language model BERT. The results suggest that the dependency-based syntactic information used is highly informative for irony detection and less informative for stance detection. Nevertheless, dependency-based syntax could prove useful for stance detection if irony detection is first treated as a preprocessing step for stance detection. I also believe that the approach proposed in this thesis, based almost entirely on dependency syntax, could help to better explain a pragmatic phenomenon ... / [CA] This thesis falls within the broad panorama of studies related to Natural Language Processing (NLP). 
Specifically, it is a work of Computational Linguistics (CL) whose main objective is to study in depth the contribution of syntax to the field of sentiment analysis, applied in particular to the study of texts extracted from social networks or, more generally, from online content. In addition, given the recent interest of the scientific community in the Universal Dependencies (UD) project, which proposes a morphosyntactic annotation format intended to create a "universal" representation of morphology and syntax applicable to different languages, this work uses that format in order to carry out a study from a multilingual perspective (Italian, English, French and Spanish). The work presents an exhaustive description of the UD morphosyntactic annotation format, placing particular emphasis on the most relevant issues concerning its application to UGC generated on social networks. The ultimate goal is to analyze and verify whether these morphosyntactic annotations yield useful information for irony detection and stance detection systems. Two tasks are presented and used as case studies to test the research hypotheses: the first concerns automatic irony detection and the second concerns stance detection. In both cases, the background and related work that can serve as context for the reader are provided, the problems encountered are set out, and the various activities proposed in the computational linguistics community to solve these problems are described. Special reference is made to the resources currently available, as well as to those developed specifically for the study of the aforementioned phenomena. 
Finally, through the description of a series of experiments, carried out both in evaluation campaigns and in independent studies, the contribution that syntax can make to solving these tasks is described. This thesis is the result of all the research I carried out during my doctorate over the last three and a half years, and it belongs to the growing trend of studies devoted to making Artificial Intelligence results more explainable, going beyond the achievement of higher scores on tasks and instead making the reasons behind the results, and the processes themselves, more understandable to domain experts. The main and most novel contribution of this work is the exploitation of features based on morphology and dependency syntax, which are used to create vector representations of texts from social networks in several languages and for two different tasks. These features were then combined with a variety of machine learning classifiers, with some neural networks and also with the language model BERT. The results suggest that the dependency-based syntactic information used is highly informative for irony detection and less informative for stance detection. Nevertheless, dependency-based syntax could be useful for stance detection if irony detection is first treated as a preprocessing step for stance detection. I also believe that the approach proposed in this thesis, based almost entirely on dependency syntax, could help to better explain a pragmatic phenomenon as difficult to detect and interpret as irony. / Cignarella, AT. (2021). Dependency Syntax in the Automatic Detection of Irony and Stance [Doctoral thesis]. Universitat Politècnica de València. 
https://doi.org/10.4995/Thesis/10251/177639 / TESIS
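
The abstract above describes turning UD morphosyntactic annotation into vector representations of texts. As a rough illustration only, and not the feature set used in the thesis, the following Python sketch counts dependency relation labels in a CoNLL-U fragment (the tab-separated, ten-column format used by UD treebanks) and lays the counts out as a fixed-length vector. The example sentence, the function names `deprel_counts` and `to_vector`, and the choice of a plain bag of relation labels are all assumptions made for this sketch.

```python
from collections import Counter
from typing import List

# A tiny CoNLL-U fragment; the sentence is invented for illustration.
# Columns: ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
CONLLU = "\n".join([
    "# text = I love Mondays",
    "1\tI\tI\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tlove\tlove\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tMondays\tMonday\tNOUN\t_\t_\t2\tobj\t_\t_",
])


def deprel_counts(conllu: str) -> Counter:
    """Count dependency relation labels (column 8) over all word lines."""
    counts: Counter = Counter()
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        counts[cols[7]] += 1
    return counts


def to_vector(counts: Counter, vocabulary: List[str]) -> List[int]:
    """Project the counts onto a fixed, ordered list of relation labels."""
    return [counts[label] for label in vocabulary]


if __name__ == "__main__":
    print(to_vector(deprel_counts(CONLLU), ["nsubj", "obj", "root", "advmod"]))
```

A vector of this kind could be passed to any standard classifier; richer features (for example label n-grams along dependency paths) would follow the same pattern.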

Cross-Lingual and Genre-Supervised Parsing and Tagging for Low-Resource Spoken Data

Fosteri, Iliana January 2023
Dealing with low-resource languages is challenging because there is not enough data to train machine-learning models to make predictions for these languages. One way to address this problem is to use data from higher-resource languages, which enables transfer learning from those languages to the low-resource targets. The present study focuses on dependency parsing and part-of-speech tagging of low-resource languages in the spoken genre, i.e., languages whose treebank data is transcribed speech. These are Beja, Chukchi, Komi-Zyrian, Frisian-Dutch, and Cantonese. Our approach investigates different types of transfer languages using MACHAMP, a state-of-the-art parser and tagger that relies on contextualized word embeddings, in particular mBERT and XLM-R. The main idea is to explore how genre, language similarity, neither of the two, or the combination of both affects model performance in these downstream tasks for our selected target treebanks. Our findings suggest that, in order to capture speech-specific dependency relations, at least some genre-matching source data must be included, while source data matched on language similarity is a better choice when the task at hand is part-of-speech tagging. We also explore the impact of multi-task learning in one of our proposed methods, but we observe only minor differences in model performance.

Analyzing and Reducing Compilation Times for C++ Programs

Mivelli, Dennis January 2022
Software companies often choose to develop in C++ because of the high performance the language offers. That runtime performance, made possible by static compilation and powerful optimization options, is paid for with compilation time. Although the trade-off is to some extent inevitable, building very large C++ programs from scratch can take several hours if extra care is not taken during development. This thesis analyzes compilation times for C++ programs and shows how they can be reduced with the help of design patterns, implementation hiding, and framework-related fixes. The results show that compilation times can be decreased significantly with no drawbacks to the maintainability of a program. An in-depth analysis of compilation times and dependencies has been conducted for two large software modules from a representative company. Each module takes over an hour of CPU time to compile. The time consumed by different compiler activities, such as parsing, preprocessing, and runtime optimization tasks, has been measured for the modules. The compilation times for unit tests and mocks that use the GoogleTest framework have been analyzed. A simple method that may reduce compilation times by up to 50% for programs that use GoogleTest is presented. A dependency metric has been created, based on the number of include statements found recursively throughout a program. The dependency metric was found to be connected to compilation time for the two analyzed modules. Other factors that can influence compilation times are also shown, such as runtime optimization options and the use of templates. Experiments are presented that show how a typical usage of templates can drastically increase compilation times. In addition, a solution that allows templates to be used while avoiding code bloat across translation units is reviewed. The solution effectively rivals non-template code in terms of compilation time. 
The Pointer to Implementation (PImpl) and Dependency Injection design patterns have been used to refactor a small program. Both design patterns performed well, reducing the total compilation time and total compiler memory usage by 70%. A program that detects dependency cycles has been created, but no cycles were found in either of the two modules from the representative company.
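
The abstract mentions a dependency metric based on include statements found recursively and a program that detects dependency cycles, without giving their definitions. The Python sketch below is one plausible reading, not the thesis's tooling: it assumes the include graph has already been extracted into a dictionary mapping each file to the headers it includes directly, counts include statements while expanding each file once (as include guards would), and searches for a cycle with a depth-first traversal. The names `recursive_include_count` and `find_cycle` and the example file names are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical include graph: file -> headers it includes directly.
IncludeGraph = Dict[str, List[str]]


def recursive_include_count(graph: IncludeGraph, root: str) -> int:
    """Count include statements reachable from root, expanding each file once."""
    seen: Set[str] = set()
    stack = [root]
    count = 0
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already expanded, as an include guard would ensure
        seen.add(current)
        includes = graph.get(current, [])
        count += len(includes)
        stack.extend(includes)
    return count


def find_cycle(graph: IncludeGraph) -> Optional[List[str]]:
    """Return one include cycle as a list of files, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:  # back edge: nxt is on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in list(graph):
        if colour.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


if __name__ == "__main__":
    example = {"main.cpp": ["a.h", "b.h"], "a.h": ["c.h"], "b.h": ["c.h"], "c.h": []}
    print(recursive_include_count(example, "main.cpp"), find_cycle(example))
```

A real tool would first have to build the graph by scanning source files for `#include` lines and resolving them against the compiler's include paths, which this sketch leaves out.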

Correlation between strategic objectives and operational plans of the University of South Africa with specific reference to the Directorate: Student Admissions and Registrations

Harding, Richard Cornelius 04 1900
The major focus and question emanating from the research is: to what extent do the operational action plans, policies, functions, procedures and activities as well as their implementation within the Directorate: Student Admissions and Registrations correlate with the strategic objectives of the University of South Africa (Unisa)? In alignment with the above, the major challenge of the study was to identify adequate and appropriate approaches to ensure appropriate correlation levels between strategic objectives and their successful implementation relevant to the Directorate: Student Admissions and Registrations. The challenge of every Departmental Head is to turn theory into practice; to make something happen and to translate strategic plans into real business results. This will be accomplished only when there is synergy or connectivity between strategic and operational planning towards effective implementation. Various literature reviews and research topics on strategic management focus either on strategic planning or strategic implementation as separate identities. Few publications address the challenge of connecting the pursuit of strategic objectives with operational plans. Even fewer literature reviews indicate the relationship or correlation levels between strategic objectives and operational plans of an organisation; the desirable or appropriate level thereof, to ensure the effective pursuit of strategic objectives. The outcomes of this study could contribute to the identification of an appropriate approach and measurement criteria to ensure connectivity/alignment between specific strategic objectives and operational plans relevant to the Directorate: Student Admissions and Registrations. By doing this, the strategic objectives are effectively and efficiently promoted to those responsible for carrying out the execution plan. 
The researcher has adopted a comprehensively-integrated-aligned-strategic-process-management-approach as part of the standardised operational plans of the Directorate: Student Admissions and Registrations so as to ensure more effective and efficient (appropriate) correlation levels in respect of specific strategic objectives relevant to the Directorate: Student Admissions and Registrations due to a lack of correlation in some instances. The above approach represents a total view of an organisation's strategic management and control systems and consists of the strategic planning, operational plans and results-management plans. The mentioned approach will also consist of a measurement criterion which identifies critical enablers, dependencies and drivers to ensure vertical and horizontal alignment in respect of original planning (the what and why) with the implementation plans (when, how and by whom). The integrated-aligned-strategic-management-process-approach enforces the timely availability of major enablers, dependencies and drivers necessary to support the execution of activities related to specific strategic objectives. It also identifies the possible lack thereof prior to the implementation of strategic plans. Specific alternatives or workarounds can be identified to ensure continuity in respect of the implementation processes related to specific strategic objectives. In this way, the above approach will enhance the effective and efficient management and coordination of an organisation to drive intended strategic outcomes within a specific process, taking into account project management-driven principles within a specific sequence of activities (grouping together what belongs together). The latter will involve all roleplayers in the work situation accountable for the implementation process (creating ownership). 
By doing this, duplication and overlapping of activities will be eliminated and connectivity/alignment between specific strategic objectives and their implementation will be enforced. The focus falls on the entire key/core process and cycle, producing outcomes of success in respect of the implementation of objectives (the right people will be doing the right things at the right time). / Public Administration & Management / M.A. (Public Administration)

內外部技術網路與組織知識流通之研究-以TFT LCD產業為例 / Knowledge flow process of TFT LCD industry in Taiwan from the perspectives of organizational internal networking and external networking

楊晴媚, Yang, Ching-Mei Unknown Date
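Driven by six major domestic manufacturers, Taiwan's TFT LCD industry has moved from its start-up phase into a phase of rapid growth and has become a focus of attention on the world stage. This study takes Taiwan's TFT LCD industry as its subject. On the one hand, it makes a modest contribution to documenting the industry; on the other, by examining the characteristics of technological knowledge and the internal and external technology networks, it seeks an in-depth understanding of organizational knowledge flow and capability building in the industry. The study relies mainly on case interviews, covering six major TFT LCD manufacturers. It examines how "technological knowledge characteristics", "external technology networks" and "internal technology networks" affect the "organizational knowledge flow" of firms in Taiwan's TFT LCD industry. The findings are as follows.
I. Influence of technological knowledge characteristics on external technology networks
1. When the path dependence of an organization's technological knowledge differs, the closeness of its relationship with its sources of technological knowledge also differs. The higher the path dependence, the looser the relationship with the knowledge source; the lower the path dependence, the closer the relationship.
2. When the path dependence of an organization's technological knowledge differs, the organization adopts different modes of absorbing technological knowledge from suppliers. The higher the path dependence of a Taiwanese TFT LCD firm's technological knowledge, the more it adopts a joint-development mode to absorb knowledge from its technology-transfer partner. The lower the path dependence, the more it adopts a hand-over mode to absorb knowledge from equipment suppliers, who provide documentation and station personnel at the plant to assist.
II. Influence of external technology networks on organizational knowledge flow
1. When the closeness of the relationship between an organization and its source of technological knowledge differs, the organization manages knowledge absorption through different forms of interaction. When a Taiwanese TFT LCD firm has a close relationship with its knowledge source, it manages knowledge absorption through continuous interaction; when the relationship is loose, it does not.
III. Influence of internal technology networks on organizational knowledge flow
1. Sharing knowledge and experience helps organizational knowledge to be created and accumulated during knowledge conversion. Sharing through apprenticeship and on-the-job training helps knowledge accumulate and spread during socialization. Sharing through written documentation helps knowledge accumulate and spread during externalization. Sharing through face-to-face communication, including meetings and the design of physical space, helps knowledge to be created during combination. Sharing through the internal network helps knowledge accumulate and spread during combination.
2. Having knowledge-creating team members helps organizational knowledge to flow. During socialization, knowledge operators in Taiwan's TFT LCD industry promote the absorption of organizational knowledge by travelling to Japan to learn on site. During externalization, knowledge operators promote the accumulation of organizational knowledge by documenting the content of their training. Knowledge managers in the industry are T-shaped people who convey and communicate the organizational vision through meetings, which helps knowledge creation.
3. Linkage and transfer between the R&D department and the production department help to solve problems that arise during knowledge creation. All of the Taiwanese TFT LCD manufacturers rely on interaction between R&D and production personnel to resolve problems.
IV. Innovation characteristics of Taiwan's TFT LCD industry
1. During the rapid development of the industry over the past two years, firms have taken the following measures to avoid technology risk and market risk: they maintain close cooperation with their sole Japanese technology-transfer partner through joint development, exclusive licensing and joint ventures to reduce technology risk, and they sell products to the parent company or manufacture under contract for the technology-transfer partner to reduce market risk.
2. When a parent company makes a cross-industry investment in Taiwan's TFT LCD industry, it supports organizational knowledge flow through the group, including the transfer of senior executives and production staff and the provision of technological knowledge and management experience.
3. In the transfer of product development capability, Taiwanese TFT LCD manufacturers have successfully moved to the second level, the transfer of capability for adaptation and parts localization, and have built their own core capabilities in physical systems, managerial systems, skills and knowledge, and values.
4. During the rise of Taiwan's TFT LCD industry, the Industrial Technology Research Institute (ITRI) not only provided technological knowledge to manufacturers but also acted as a supplier of talent, supporting organizational knowledge flow in the industry.
5. During the rise of Taiwan's TFT LCD industry, innovative CEOs supported the flow of knowledge within their organizations by tolerating intelligent failure and by shaping an organizational environment of care and learning.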

Towards better understanding and improving optimization in recurrent neural networks

Kanuparthi, Bhargav 07 1900
Recurrent neural networks (RNNs) are notorious for their exploding and vanishing gradient problem (EVGP). The problem becomes more evident in tasks where the information needed to solve them correctly exists over long time scales, because it prevents important gradient components from being back-propagated adequately over a large number of steps. The papers in this work formalize gradient propagation in parametric and semi-parametric RNNs to gain a better understanding of the source of this problem. The first paper introduces a simple stochastic algorithm (h-detach) that is specific to LSTM optimization and targeted at the EVGP. Using it, we show significant improvements over the vanilla LSTM in terms of convergence speed, robustness to seed and learning rate, and generalization on various benchmark datasets. The next paper focuses on semi-parametric RNNs and self-attentive networks. Self-attention provides a way for a system to dynamically access past states (stored in memory), which helps to mitigate vanishing gradients. Although useful, it is difficult to scale, because the size of the computational graph grows quadratically with the number of time steps involved. In the paper we describe a relevancy screening mechanism, inspired by the cognitive process of memory consolidation, that allows for a scalable use of sparse self-attention with recurrence while ensuring good gradient propagation. / Recurrent neural networks (RNNs) are notorious for their exploding and vanishing gradient problem (EVGP). The problem becomes more evident in tasks where the information needed to solve them correctly exists over long time scales, because it prevents important gradient components from propagating correctly over a large number of steps. 
The papers in this work formalize gradient propagation in parametric and semi-parametric RNNs to better understand the source of this problem. The first paper presents a simple stochastic algorithm (h-detach) that is specific to LSTM optimization and aims to address the EVGP. Using it, we show significant improvements over the vanilla LSTM in terms of convergence speed, robustness to seed and learning rate, and generalization on various benchmark datasets. The next paper focuses on semi-parametric RNNs and self-attentive networks. Self-attention provides a way for a system to dynamically access past states (stored in memory), which helps to mitigate vanishing gradients. Although useful, it is difficult to scale, because the size of the computational graph grows quadratically with the number of time steps involved. In the paper, we describe a relevancy screening mechanism, inspired by the cognitive process of memory consolidation, that allows a scalable use of sparse self-attention with recurrence while ensuring good gradient propagation.

On challenges in training recurrent neural networks

Anbil Parthipan, Sarath Chandar 11 1900
In a prediction problem with multiple discrete steps, the prediction at each time step can depend on the input at any moment in the distant past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency. In practice, because the magnitude of the gradients can grow or shrink exponentially with the length of the sequence, RNNs can only model short-term dependencies. This thesis explores this problem in recurrent neural networks and proposes new solutions to it. Chapter 3 explores the idea of using an external memory to store the hidden states of a Long Short Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate at which gradients decay in an LSTM. These discrete operations also allow the network to create dynamic connections across long time intervals. Chapter 4 attempts to characterize this gradient decay in a recurrent neural network and proposes a new recurrent architecture that, by design, reduces the problem. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation function and use additive cell updates instead of multiplicative ones. Chapter 5 discusses the challenges of using recurrent neural networks in a continual learning setting, where new tasks appear over time. The dependencies in continual learning are not only contained within a task but are also present across tasks. This chapter discusses two fundamental problems in continual learning: (i) catastrophic forgetting of old tasks and (ii) saturation of the network's capacity. 
In addition, a solution is proposed that addresses both problems when training a recurrent neural network. / In a multi-step prediction problem, the prediction at each time step can depend on the input at any of the previous time steps far in the past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency. In practice, they can only model short-term dependencies due to the problem of vanishing and exploding gradients. This thesis explores the problem of vanishing gradients in recurrent neural networks and proposes novel solutions to it. Chapter 3 explores the idea of using external memory to store the hidden states of a Long Short Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate at which gradients vanish in an LSTM. These discrete operations also enable the network to create dynamic skip connections across time. Chapter 4 attempts to characterize all the sources of vanishing gradients in a recurrent neural network and proposes a new recurrent architecture which has significantly better gradient flow than state-of-the-art recurrent architectures. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation functions and use additive cell updates instead of multiplicative cell updates. Chapter 5 discusses the challenges of using recurrent neural networks in the context of lifelong learning. In the lifelong learning setting, the network is expected to learn a series of tasks over its lifetime. The dependencies in lifelong learning are not just within a task, but also across the tasks. This chapter discusses the two fundamental problems in lifelong learning: (i) catastrophic forgetting of old tasks, and (ii) network capacity saturation. 
Further, it proposes a solution to solve both these problems while training a recurrent neural network.
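The exponential growth or decay of gradients mentioned in this abstract can be illustrated with a toy calculation. The sketch below is not the NRU or any architecture from the thesis; it assumes a scalar linear recurrence h_t = w * h_{t-1} + x_t, for which the gradient of the final state with respect to the initial state is simply w raised to the number of steps. The function name `gradient_through_time` is hypothetical.

```python
def gradient_through_time(recurrent_weight: float, steps: int) -> float:
    """Return d h_T / d h_0 for the scalar recurrence h_t = w * h_{t-1} + x_t.

    Toy assumption: a single scalar weight and no nonlinearity, so the
    chain rule reduces to multiplying by w once per time step.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= recurrent_weight
    return grad


if __name__ == "__main__":
    for w in (0.9, 1.0, 1.1):
        grads = [gradient_through_time(w, t) for t in (10, 50, 100)]
        print(f"w={w}: {grads}")
```

With w below 1 the gradient shrinks towards zero as the sequence grows, with w above 1 it blows up, and with w exactly 1 it stays constant. The last case loosely hints at why an additive state update, which avoids repeated multiplication, can preserve gradient flow.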

Inter-teamsamordning i skagila projekt : En fallstudie på Avanza Bank för att möta beroenden i projektprocessen / Inter-team Coordination in Scagile Projects : A case study at Avanza Bank to address dependencies in the project process

Agorelius, Malin, Ekström, Emma January 2021 (has links)
The use of agile methodologies has increased rapidly over the last decades. This has led to agile methods being scaled up, since larger organizations also want to gain the benefits of the agile way of working. However, using agile at scale (the authors’ concept "scagile", introduced in the section "Begreppet ’skagil’") brings new organizational challenges. One challenge mentioned both in the literature and in the empirical findings at the case company, Avanza, is inter-team coordination in scagile environments. Avanza has identified problems with dependencies between teams in scaled agile software development projects. To address this issue, this study was initiated with the purpose of investigating, based on Avanza’s current project design, how cross-team work can be coordinated to handle dependencies in the project process. To this end, a case study was conducted at Avanza, comprising interviews with twelve respondents and a review of internal documents.
The empirical findings confirmed the original issues related to inter-team coordination and also provided information about the company’s current project design. The findings showed that the project organization operates as a hybrid organization with strong agile influences. However, Avanza perceived its project design as fully agile. Further, the dependencies in the projects were considered to cause agile waste, which has a negative influence on both productivity and efficiency in the software development process. Four main areas of agile waste were identified: waiting, motion, defects and extra processes. By clustering similar waste, three main problem areas were identified: ‘a certain absence of a proactive approach and planning’, ‘a certain absence of forums for handling inter-team dependencies’, and ‘differences between teams regarding the implementation of agile methodologies and project prioritization’. To address these issues, six measures were defined: implementing a more proactive project management approach, embracing the hybrid culture, creating role-specific teams, arranging forums for team synchronization, codifying and developing the current coordination mechanisms, and agreeing on a shared approach to project methodologies. The findings of this study are to some extent generalizable and could be adopted by other companies or project organizations that struggle with the same problem areas and have a project design similar to Avanza’s. However, some effort is required to first determine the current project design of the company in question and to identify project-related waste. Further, Avanza operates in the fintech industry, where the project organization revolves around software development. It can therefore be assumed that the findings are more likely to fit other organizations that work with software development.

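Realisierung einer Schedulingumgebung für gemischt-parallele Anwendungen und Optimierung von layer-basierten Schedulingalgorithmen / Realization of a Scheduling Environment for Mixed-Parallel Applications and Optimization of Layer-Based Scheduling Algorithms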

Kunis, Raphael 20 January 2011 (has links)
One challenge in parallel processing is achieving scalability of large parallel applications across different parallel systems. The central problem is that an application may run very well on one parallel system, yet porting it to another system usually yields poor results. Using the programming model of parallel tasks with dependencies can significantly improve scalability for many parallel algorithms. Programming with parallel tasks leads to task graphs with dependencies that represent a parallel application, which is also called a mixed-parallel application. Efficient execution of a mixed-parallel application rests on a suitable schedule that specifies an efficient mapping of the parallel tasks onto the processors of the parallel system. Scheduling algorithms are used to compute such a schedule. A central difficulty in determining a schedule for mixed-parallel applications is that scheduling is already NP-hard for single-processor tasks with dependencies on a parallel system with two processors. Consequently, only approximation algorithms and heuristics exist for computing a schedule. Layer-based scheduling algorithms are one option. They first form layers of independent parallel tasks and then compute the schedule for each layer separately. A weakness of these algorithms is the step that combines the individual schedules into the global schedule. The Move-blocks algorithm presented here offers an elegant way to improve this combination step by merging the schedules of consecutive layers.
Although many scheduling algorithms for mixed-parallel applications exist, there has so far been no comprehensive support for scheduling in programming tools. In particular, there is no scheduling environment that brings together a large number of scheduling algorithms. The second focus of this dissertation is the presentation of SEParAT, a flexible, component-based and extensible scheduling environment. SEParAT supports a range of usage scenarios that go well beyond pure scheduling, such as comparing scheduling algorithms and extending existing or implementing new scheduling algorithms. In addition to these usage scenarios, the dissertation describes in detail both the internal processing of a scheduling run and the component-based software architecture.
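The first step of a layer-based approach, grouping a task graph into layers of mutually independent tasks, can be sketched as follows. This is only an illustration of the layering idea under simple assumptions: it is not the Move-blocks algorithm, it is not taken from SEParAT, and it does not compute a schedule within a layer. The function name `build_layers` and the example task graph are hypothetical.

```python
from collections import defaultdict


def build_layers(tasks, dependencies):
    """Group tasks into layers; all predecessors of a task lie in earlier layers.

    tasks: list of task names
    dependencies: list of (before, after) pairs, i.e. edges of the task graph
    """
    successors = defaultdict(list)
    pending = {task: 0 for task in tasks}  # number of unfinished predecessors
    for before, after in dependencies:
        successors[before].append(after)
        pending[after] += 1

    layers = []
    ready = sorted(task for task, count in pending.items() if count == 0)
    while ready:
        layers.append(ready)
        next_ready = []
        for task in ready:
            for succ in successors[task]:
                pending[succ] -= 1
                if pending[succ] == 0:
                    next_ready.append(succ)
        ready = sorted(next_ready)

    # Tasks on a cycle never become ready, so they are missing from the layers.
    if sum(len(layer) for layer in layers) != len(pending):
        raise ValueError("task graph contains a cycle")
    return layers


if __name__ == "__main__":
    example_tasks = ["A", "B", "C", "D", "E"]
    example_edges = [("A", "C"), ("B", "C"), ("A", "D"), ("C", "E"), ("D", "E")]
    print(build_layers(example_tasks, example_edges))
```

Because a task only becomes ready once all of its predecessors have been placed in earlier layers, no two tasks in the same layer depend on each other, so each layer could be scheduled separately before the per-layer schedules are combined.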

A Bayesian Network methodology for railway risk, safety and decision support

Mahboob, Qamar 14 February 2014 (has links)
For railways, risk analysis is carried out to identify hazardous situations and their consequences. Until recently, classical methods such as Fault Tree Analysis (FTA) and Event Tree Analysis (ETA) were applied in modelling the linear and logically deterministic aspects of railway risks, safety and reliability. However, it has been shown that modern railway systems are rather complex, involving multiple dependencies between system variables and uncertainties about these dependencies. For train derailment accidents, for instance, high train speed is a common cause of failure; slip and failure of brake applications are disjoint events; failure dependency exists between the train protection and warning system and driver errors; driver errors are time dependent; and there is functional uncertainty in derailment conditions. Failing to incorporate these aspects of a complex system leads to wrong estimations of the risks and safety and, consequently, to wrong management decisions. Furthermore, a complex railway system integrates various technologies and is operated in an environment where the behaviour and failure modes of the system are difficult to model using probabilistic techniques. Modelling and quantifying railway risk and safety problems that involve dependencies and uncertainties such as those mentioned above are complex tasks. Importance measures are useful for ranking the components that are significant with respect to the risk, safety and reliability of a railway system. The computation of importance measures using FTA has limitations for complex railways. ALARP (As Low As Reasonably Practicable) risk acceptance criteria are widely accepted as ‘best practice’ in the railways. According to the ALARP approach, a tolerable region exists between the regions of intolerable and negligible risks. In the tolerable region, risk is undertaken only if a benefit is desired.
In this case, one needs additional criteria to identify the socio-economic benefits of adopting a safety measure for railway facilities. The Life Quality Index (LQI) is a rational way of establishing a relation between the financial resources used to improve the safety of an engineering system and the potential fatalities that can be avoided by the safety improvement. This thesis shows the application of the LQI approach to quantifying the social benefits of a number of safety management plans for a railway facility. We apply Bayesian Networks and influence diagrams, which are extensions of Bayesian Networks, to model and assess the life safety risks associated with railways. Bayesian Networks are directed acyclic probabilistic graphical models that handle the joint distribution of random variables in a compact and flexible way. In influence diagrams, problems of probabilistic inference and decision making based on utility functions can be combined and optimized, especially for systems with many dependencies and uncertainties. The optimal decision, which maximizes the total benefits to society, is obtained. In this thesis, the application of Bayesian Networks to the railway industry is investigated for the purpose of improving the modelling and analysis of risk, safety and reliability in railways.
One example application and two real-world applications are presented to show the usefulness and suitability of Bayesian Networks for quantitative risk assessment and risk-based decision support in railways.

Table of contents:
ACKNOWLEDGEMENTS
ABSTRACT
ZUSAMMENFASSUNG
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1: Introduction
  1.1 Need to model and quantify the causes and consequences of hazards on railways
  1.2 State-of-the-art techniques in the railway
  1.3 Goals and scope of work
  1.4 Existing work
  1.5 Outline of the thesis
CHAPTER 2: Methods for safety and risk analysis
  2.1 Introduction
    2.1.1 Simplified risk analysis
    2.1.2 Standard risk analysis
    2.1.3 Model-based risk analysis
  2.2 Risk Matrix
    2.2.1 Determine the possible consequences
    2.2.2 Likelihood of occurrence
    2.2.3 Risk scoring matrix
  2.3 Failure Modes & Effect Analysis (FMEA)
    2.3.1 Example application of FMEA
  2.4 Fault Tree Analysis (FTA)
  2.5 Reliability Block Diagram (RBD)
  2.6 Event Tree Analysis (ETA)
  2.7 Safety Risk Model (SRM)
  2.8 Markov Model (MM)
  2.9 Quantification of expected values
    2.9.1 Bayesian Analysis (BA)
    2.9.2 Hazard Function (HF)
    2.9.3 Monte Carlo (MC) Simulation
  2.10 Summary
CHAPTER 3: Introduction to Bayesian Networks
  3.1 Terminology in Bayesian Networks
  3.2 Construction of Bayesian Networks
  3.3 Conditional independence in Bayesian Networks
  3.4 Joint probability distribution in Bayesian Networks
  3.5 Probabilistic Inference in Bayesian Networks
  3.6 Probabilistic inference by enumeration
  3.7 Probabilistic inference by variable elimination
  3.8 Approximate inference for Bayesian Networks
  3.9 Dynamic Bayesian Networks
  3.10 Influence diagrams (IDs)
CHAPTER 4: Risk acceptance criteria and safety targets
  4.1 Introduction
  4.2 ALARP (As Low As Reasonably Practicable) criteria
  4.3 MEM (Minimum Endogenous Mortality) criterion
  4.4 MGS (Mindestens Gleiche Sicherheit) criteria
  4.5 Safety Integrity Levels (SILs)
  4.6 Importance Measures (IMs)
  4.7 Life Quality Index (LQI)
  4.8 Summary
CHAPTER 5: Application of Bayesian Networks to complex railways: A study on derailment accidents
  5.1 Introduction
  5.2 Fault Tree Analysis for train derailment due to SPAD
    5.2.1 Computation of importance measures using FTA
  5.3 Event Tree Analysis (ETA)
  5.4 Mapping Fault Tree and Event Tree based risk model to Bayesian Networks
    5.4.1 Computation of importance measures using Bayesian Networks
  5.5 Risk quantification
  5.6 Advanced aspects of example application
    5.6.1 Advanced aspect 1: Common cause failures
    5.6.2 Advanced aspect 2: Disjoint events
    5.6.3 Advanced aspect 3: Multistate system and components
    5.6.4 Advanced aspect 4: Failure dependency
    5.6.5 Advanced aspect 5: Time dependencies
    5.6.6 Advanced aspect 6: Functional uncertainty and factual knowledge
    5.6.7 Advanced aspect 7: Uncertainty in expert knowledge
    5.6.8 Advanced aspect 8: Simplifications and dependencies in Event Tree Analysis
  5.7 Implementation of the advanced aspects of the train derailment model using Bayesian Networks
  5.8 Results and discussions
  5.9 Summary
CHAPTER 6: Bayesian Networks for risk-informed safety requirements for platform screen doors in railways
  6.1 Introduction
  6.2 Components of the risk-informed safety requirement process for Platform Screen Door system in a mega city
    6.2.1 Define objective and methodology
    6.2.2 Familiarization of system and information gathering
    6.2.3 Hazard identification and hazard classification
    6.2.4 Hazard scenario analysis
    6.2.5 Probability of occurrence and failure data
    6.2.6 Quantification of the risks
      Tolerable risks
      Risk exposure
      Risk assessment
  6.3 Summary
CHAPTER 7: Influence diagrams based decision support for railway level crossings
  7.1 Introduction
  7.2 Level crossing accidents in railways
  7.3 A case study of railway level crossing
  7.4 Characteristics of the railway level crossing under investigation
  7.5 Life quality index applied to railway level crossing risk problem
  7.6 Summary
CHAPTER 8: Conclusions and outlook
  8.1 Summary and important contributions
  8.2 Originality of the work
  8.3 Outlook
BIBLIOGRAPHY
APPENDIX 1
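The abstract above describes Bayesian Networks as a compact way of handling the joint distribution of random variables. A minimal sketch of that idea, using inference by enumeration, is given below. Everything in it is an assumption made for illustration: the three-node network (high speed and brake failure as parents of derailment), the function names, and all probability values are invented and are not taken from the thesis or from any railway data.

```python
from itertools import product

# Hypothetical, purely illustrative probabilities (not from the thesis).
P_HIGH_SPEED = 0.10
P_BRAKE_FAILURE = 0.05
# P(derailment = True | high_speed, brake_failure)
P_DERAILMENT = {
    (True, True): 0.50,
    (True, False): 0.10,
    (False, True): 0.20,
    (False, False): 0.01,
}
NAMES = ("high_speed", "brake_failure", "derailment")


def joint(high_speed, brake_failure, derailment):
    """Joint probability as the product of each node's conditional probability."""
    p = P_HIGH_SPEED if high_speed else 1 - P_HIGH_SPEED
    p *= P_BRAKE_FAILURE if brake_failure else 1 - P_BRAKE_FAILURE
    p_derail = P_DERAILMENT[(high_speed, brake_failure)]
    p *= p_derail if derailment else 1 - p_derail
    return p


def probability(query, evidence=None):
    """Return P(query | evidence) by summing the joint over all assignments."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for values in product((True, False), repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[name] != value for name, value in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(**world)
        denominator += p
        if all(world[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print(probability({"derailment": True}))
    print(probability({"brake_failure": True}, {"derailment": True}))
```

Enumeration is only practical for very small networks, since the number of assignments doubles with every binary variable; it is used here because it makes the factorization of the joint distribution explicit.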

