101

Learning adaptive computation strategies for deep neural networks

Kamanda, Aton 07 1900 (has links)
Dual-process theory holds that human cognition operates in two distinct modes: one for fast, habitual, associative processing, commonly called "System 1", and a second with slower, deliberate, controlled processing, called "System 2". This distinction points to an important underlying feature of human cognition: the ability to switch adaptively between computational strategies depending on the situation. This ability has long been studied across several fields, and many hypothetical benefits appear to be linked to it. Deep neural networks, however, are typically built without the ability to manage their computational resources optimally. This limitation of current models is all the more concerning given that a growing body of recent work suggests a linear relationship between the computation used at evaluation time and model performance. To address this problem, this thesis proposes several approaches and studies their impact on models. First, we study a deep reinforcement learning agent that can allocate more computation to more difficult situations. Our approach lets the agent adapt its computational resources to the demands of the situation it faces, which not only improves computation time but also enhances transfer between related tasks and generalization. The central idea common to all our approaches draws on cost-of-effort theories from the cognitive control literature: by making the use of cognitive resources costly for the agent while letting it decide how to allocate them, the agent itself learns to deploy its computational capacity optimally.
We then study variations of the method on a reference deep learning task in order to analyze precisely how the model behaves and what the benefits of such an approach are. We also create our own task, "Stroop MNIST", inspired by the Stroop test used in psychology, to validate hypotheses about the behavior of neural networks employing our method. We then highlight the strong links between dual-process learning and knowledge distillation methods; a distinctive property of our approach is that it saves computational resources at inference time. Finally, after summarizing the thesis's contributions, we outline future work: we approach the problem with energy-based models, where learning an energy landscape during training lets the model, at inference time, spend an amount of computation that depends on the difficulty of the example at hand, rather than a fixed forward pass with a systematically identical computational cost. Although our experimental results here were unsuccessful, we analyze the promise such an approach holds and speculate on potential improvements. With these contributions, we hope to pave the way for algorithms that make better use of their computational resources, becoming more efficient in cost and performance, and to offer a deeper understanding of the links between certain machine learning methods and dual-process theory.
102

Reducing uncertainty in new product development

Higgins, Paul Anthony January 2008 (has links)
Research and Development engineering is the cornerstone of humanity's evolution. It is perceived to be a systematic creative process which ultimately improves the living standard of a society through the creation of new applications and products. The commercial paradigm that governs project selection, resource allocation and market penetration prevails when the focus shifts from pure research to applied research. Furthermore, the road to success through commercialisation is difficult for most inventors, especially in a vast and isolated country such as Australia, which is located a long way from wealthy and developed economies. While market-leading products are considered unique, the actual process to achieve them is essentially the same: progressing from an idea, through development, to an outcome (if successful). Unfortunately, statistics indicate that only 3% of 'ideas' are significantly successful, 4% are moderately successful, and the remainder 'evaporate' (Michael Quinn, Chairman, Innovation Capital Associates Pty Ltd). This study demonstrates and analyses two techniques developed by the author which reduce uncertainty in the engineering design and development phase of new product development and therefore increase the probability of a successful outcome. This study expands the existing knowledge of the engineering design and development stage of the new product development process and is grounded in the identification of practical methods which have been used successfully to develop new products by Australian Small to Medium Enterprise (SME) Excel Technology Group Pty Ltd (ETG). Process theory is the term most commonly used to describe scientific study of how a specified input state is transformed into an output state, thus detailing the process used to achieve an outcome. The thesis identifies relevant material and analyses recognised and established engineering processes utilised in developing new products.
The literature identified that case studies are a particularly useful method for supporting problem-solving processes in settings where there are no clear answers or where problems are unstructured, as in New Product Development (NPD). This study describes, defines, and demonstrates the process of new product development within the context of historical product development and a 'live' case study associated with an Australian Government START grant awarded to Excel Technology Group in 2004 to assist in the development of an image-based vehicle detection product. This study proposes two techniques which reduce uncertainty and thereby improve the probability of a successful outcome. The first technique provides a predicted project development path, or forward engineering plan, which transforms the initial 'fuzzy idea' into a potential and achievable outcome. This process qualifies the 'fuzzy idea' as a potential, rational or tangible outcome which is within the capability of the organisation. Additionally, this process proposes that a tangible or rational idea can be deconstructed in a reverse engineering process in order to create a forward engineering development plan. A detailed, structured forward engineering plan reduces the uncertainty associated with new product development unknowns and therefore contributes to a successful outcome. This is described as the RETRO technique. The study recognises, however, that this claim requires qualification and proposes a second technique. The second technique proposes that a two-dimensional spatial representation, which has productivity and consumed resources as its axes, provides an effective means to qualify progress and expediently identify variation from the predicted plan. This spatial representation technique allows a quick response which in itself has a prediction attribute associated with directing the project back onto its predicted path.
This process involves a coterminous comparison between the predicted development path and the evolving actual project development path. A consequence of this process is verification of progress, or the application of informed, timely and quantified corrective action. This process also identifies the degree of success achieved in the engineering design and development phase of new product development, where success is defined as achieving a predicted outcome. This spatial representation technique is referred to as NPD Mapping. The study demonstrates that these are useful techniques which aid SMEs in achieving successful new product outcomes because the techniques are easily administered, measure and represent relevant development-process elements and functions, and enable expedient, quantified responsive action when the evolving path varies from the predicted path. These techniques go beyond timeline representations such as Gantt charts and PERT analysis, and represent the base variables of consumed resources and productivity/technical achievement in a manner that facilitates higher-level interpretation of time, effort, degree of difficulty, and product complexity in order to support informed decision making. This study presents, describes, analyses and demonstrates an SME-focused engineering development technique, developed by the author, that produces a successful new product outcome: it begins with a 'fuzzy idea' in the mind of the inventor and concludes with a new product delivered on time and within budget. Further research on a wider range of SME organisations undertaking new product development is recommended.
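The NPD Mapping comparison described above can be sketched as a small computation: given a predicted path of (consumed resources, productivity) points, interpolate the planned productivity at the project's current resource consumption and flag when the actual point drifts beyond a tolerance. This is a hypothetical illustration of the idea; the function names, plan data, and tolerance are assumptions, not taken from the thesis.

```python
# Hypothetical sketch of NPD Mapping: compare an actual project data point
# against a predicted development path in the (consumed resources,
# productivity) plane, and flag deviations that warrant corrective action.

def predicted_productivity(path, resources):
    """Linearly interpolate the planned productivity at a resource level.

    `path` is a list of (consumed_resources, productivity) points sorted
    by resources, e.g. derived from a RETRO forward engineering plan.
    """
    r0, p0 = path[0]
    for r1, p1 in path[1:]:
        if resources <= r1:
            t = (resources - r0) / (r1 - r0)
            return p0 + t * (p1 - p0)
        r0, p0 = r1, p1
    return path[-1][1]  # beyond the plan: assume the final target holds

def deviation(path, actual_point, tolerance=0.1):
    """Return (signed gap, needs_correction) for one actual data point."""
    resources, productivity = actual_point
    gap = productivity - predicted_productivity(path, resources)
    return gap, abs(gap) > tolerance

# Normalised example plan: spend half the resources to reach 60% productivity.
plan = [(0.0, 0.0), (0.5, 0.6), (1.0, 1.0)]
gap, act = deviation(plan, (0.25, 0.10))  # actual progress lagging the plan
```

A negative gap means productivity is below plan for the resources consumed; the boolean supports the "expedient quantified responsive action" the study describes, without the visual chart.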
