Global ETD Search

251	Online Learning and Simulation Based Algorithms for Stochastic Optimization Lakshmanan, K January 2012 In many optimization problems, the relationship between the objective and the parameters is not known. The objective function itself may be stochastic, such as a long-run average over random cost samples. In such cases, the gradient of the objective cannot be computed directly. It is in this setting that stochastic approximation algorithms are used. These algorithms use estimates of the gradient and are stochastic in nature. Among gradient estimation techniques, Simultaneous Perturbation Stochastic Approximation (SPSA) and the Smoothed Functional (SF) scheme are widely used. In this thesis we propose a novel multi-timescale quasi-Newton based smoothed functional (QN-SF) algorithm for unconstrained as well as constrained optimization. The algorithm uses the smoothed functional scheme to estimate the gradient and the quasi-Newton method to solve the optimization problem. The algorithm is shown to converge with probability one. We also provide experimental results on the problem of optimal routing in a multi-stage network of queues. Policies like Join the Shortest Queue or Least Work Left assume knowledge of the queue length values, which can change rapidly or be hard to estimate. If the only information available is the expected end-to-end delay, as in our case, such policies cannot be used. The QN-SF based probabilistic routing algorithm uses only the total end-to-end delay for tuning the routing probabilities. We observe from the experiments that the QN-SF algorithm performs better than the gradient and Jacobi versions of Newton based smoothed functional algorithms. Next we consider constrained routing in a similar queueing network and extend the QN-SF algorithm to this case. We study the convergence behavior of the algorithm and observe that the constraints are satisfied at the point of convergence.
We provide experimental results for the constrained routing setup as well. Next we study reinforcement learning algorithms, which are useful for solving Markov decision processes (MDPs) when precise information on the transition probabilities is not available. When the state and action sets are very large, it is not possible to store all the state-action tuples. In such cases, function approximators like neural networks have been used. The popular Q-learning algorithm is known to diverge when used with linear function approximation because of the 'off-policy' problem. Hence, developing learning algorithms that remain stable when used with function approximation is an important problem. We present in this thesis a variant of Q-learning with linear function approximation that is based on two-timescale stochastic approximation. The Q-value parameters for a given policy in our algorithm are updated on the slower timescale while the policy parameters themselves are updated on the faster timescale. We perform a gradient search in the space of policy parameters. Since the objective function, and hence its gradient, is not analytically known, we employ efficient one-simulation SPSA gradient estimates that use Hadamard matrix based deterministic perturbations. Our algorithm has the advantage that, unlike Q-learning, it does not suffer from high oscillations due to the off-policy problem when using function approximators. Whereas it is difficult to prove convergence of regular Q-learning with linear function approximation because of the off-policy problem, we prove that our algorithm, which is on-policy, is convergent. Numerical results on a multi-stage stochastic shortest path problem show that our algorithm exhibits significantly better performance and is more robust than Q-learning. Future work would be to compare it with other policy-based reinforcement learning algorithms.
Finally, we develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost MDP framework, in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multi-stage queueing network with constraints on long-run average queue lengths. We observe that our algorithm performs well in this setting and converges to a feasible point.
Keywords: Stochastic Approximation Algorithms; Stochastic Optimization; Markov Decision Process; Reinforcement Learning Algorithm; Queueing Networks; Queuing Theory; Online Q-Learning Algorithm; Online Actor-Critic Algorithm; Markov Decision Processes; Q-learning Algorithm; Linear Function Approximation; Computer Science
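The abstract of entry 251 names the smoothed functional (SF) scheme as its gradient estimator but does not spell it out. The following is only a minimal sketch of a generic two-sided Gaussian SF estimate, not the thesis's QN-SF algorithm: it has no quasi-Newton step and no multiple timescales. The names `sf_gradient` and `quadratic`, the smoothing parameter `beta`, and the sample count are all assumptions made for illustration.

```python
import random


def sf_gradient(cost, theta, beta=0.1, n_samples=1000, rng=None):
    """Two-sided Gaussian smoothed functional estimate of the gradient of cost at theta."""
    rng = rng or random.Random()
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(n_samples):
        # Perturb all coordinates at once along a standard Gaussian direction.
        eta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = cost([t + beta * e for t, e in zip(theta, eta)])
        minus = cost([t - beta * e for t, e in zip(theta, eta)])
        # Difference quotient along eta; weighting by eta and averaging
        # over samples recovers an estimate of the gradient.
        scale = (plus - minus) / (2.0 * beta)
        for i in range(dim):
            grad[i] += eta[i] * scale / n_samples
    return grad


def quadratic(theta):
    """Hypothetical test cost with known gradient 2 * theta."""
    return sum(t * t for t in theta)


estimate = sf_gradient(quadratic, [1.0, -2.0], beta=0.1,
                       n_samples=20000, rng=random.Random(0))
print(estimate)  # close to the true gradient [2.0, -4.0], up to sampling noise
```

Only cost evaluations are needed, never an analytic gradient. In a routing setting like the one described, `cost` would presumably be replaced by a simulation that returns a measured end-to-end delay for the given routing probabilities.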
252	Sartre, critique des poètes / Sartre, The critic of poets Salem, Bilel 07 November 2014 My thesis addresses one aspect of Sartre's criticism: his criticism of poetry. It takes the form of a triptych, with each part devoted to the figure of one poet. In the first two parts, I consider two nineteenth-century poets, Baudelaire and Mallarmé. The two books on which this study is based are Sartre's Baudelaire and Mallarmé, La lucidité et sa face d’ombre. These two essays radically changed the way the two poets had been understood until then. Whereas the nineteenth century made them sacred monsters who brought novelty to the poetic genre, Sartre undermines certain received ideas. Baudelaire is the first he takes on, denouncing his disengagement. He criticizes Baudelaire's excessive dandyism, which in his view made him a sterile poet. The essay is also an occasion for Sartre to set out his theory of existentialism and to show that Engagement and Literature go hand in hand and illustrate human freedom. In the second part, which deals with Mallarmé, La lucidité et sa face d’ombre, poetic criticism merges with historical criticism. Sartre begins by painting a picture of nineteenth-century society, emphasizing the idleness of the century. Mallarmé, like Baudelaire, seems to illustrate a certain form of disengagement. Yet Sartre seems to omit an essential element: these poets of the second half of the nineteenth century belong to what are called “The Heirs of Atheism” (« Les Héritiers de l’athéisme »). Mallarmé reveals the absence of a God by entertaining the idea of suicide, which appears in his poems as the poet experiences his own death, as if to reaffirm the absence of God.
Consequently, there is a freedom inherent in these two poets, Baudelaire and Mallarmé, but it is quite different from Sartrean freedom, which is conceived as an absolute. Finally, the third part of the thesis is devoted to Genet. Here Sartre expresses his full admiration for this creative genius, who fully assumed his choices and never ceased to assert the singularity of his being. Genet's conception of existence is the very opposite of Baudelaire's attitude. For Genet, poetry imposed itself as a liberating act. Sartre does not hesitate to compare the poets, sometimes indirectly. In his eyes, Baudelaire in no way distinguished himself in evil, whereas Genet made of that evil a true splendor: he celebrated it and ended up embodying it. In addressing the singular destinies of three poets, Sartre also illustrates his own existential philosophy. He demonstrates the absence of an Unconscious that would explain all our actions and reaffirms the absolute freedom of Man.
Keywords: Atheism; Biographical criticism; Historical criticism; Literary criticism; Poetic criticism; Elect; Engagement; Existentialism; Unconscious; Freedom; Evil; Nothingness; Phenomenology; Poem; Praxis; Prose; Psychoanalysis; Existential psychoanalysis; Situations
253	[pt] ESTUDO DE TÉCNICAS DE APRENDIZADO POR REFORÇO APLICADAS AO CONTROLE DE PROCESSOS QUÍMICOS / [en] STUDY OF REINFORCEMENT LEARNING TECHNIQUES APPLIED TO THE CONTROL OF CHEMICAL PROCESSES 30 December 2021 Industry 4.0 has driven the development of new technologies to meet current market demands. One of these is the incorporation of computational intelligence techniques into the day-to-day operation of the chemical industry. In this context, this work evaluated the performance of controllers based on reinforcement learning in industrial chemical processes. The control strategy directly affects the safety and cost of the process: the better the strategy performs, the lower the production of effluents and the consumption of inputs and energy.
The reinforcement learning algorithms showed excellent results for the first case study, a CSTR reactor with Van de Vusse kinetics. However, implementing these algorithms in the Tennessee Eastman Process chemical plant showed that more studies are needed. The weak or nonexistent Markov property, the high dimensionality, and the peculiarities of the plant made it difficult for the developed controllers to obtain satisfactory results. For case study 1, the algorithms Q-Learning, Actor Critic TD, DQL, DDPG, SAC and TD3 were evaluated; for case study 2, the algorithms CMA-ES, TRPO, PPO, DDPG, SAC and TD3 were evaluated.
Keywords: REINFORCEMENT LEARNING; SAC; TD3; DDPG; DEEP Q-LEARNING; ACTOR CRITIC; VAN DE VUSSE REACTOR; CHEMICAL PROCESS CONTROL; DEEP REINFORCEMENT LEARNING; Q-LEARNING; TENNESSEE EASTMAN PROCESS
254	Prediction of Protein-Protein Interactions Using Deep Learning Techniques Soleymani, Farzan 24 April 2023 Proteins are considered the primary actors in living organisms. Proteins mainly perform their functions by interacting with other proteins. Protein-protein interactions (PPIs) underpin various biological activities such as metabolic cycles, signal transduction, and immune response. PPI identification has been addressed by various experimental methods such as yeast two-hybrid, mass spectrometry, and protein microarrays, to mention a few. However, due to the sheer number of proteins, experimental methods for finding interacting and non-interacting protein pairs are time-consuming and costly. Therefore, a sequence-based framework called ProtInteract is developed to predict protein-protein interactions. ProtInteract comprises two components. The first is a novel autoencoder architecture that encodes each protein's primary structure into a lower-dimensional vector while preserving its underlying sequential pattern, by extracting uncorrelated attributes and more expressive descriptors. This leads to faster training of the second component, a deep convolutional neural network (CNN) that receives the encoded proteins and predicts their interaction. The prediction task is formulated in three different scenarios. In each scenario, the deep CNN predicts the class of a given encoded protein pair. Each class indicates a different range of confidence scores corresponding to the probability that a predicted interaction occurs. The proposed framework features low computational complexity and a relatively fast response. The present study makes two significant contributions to the field of PPI prediction. Firstly, it addresses the computational challenges posed by the high dimensionality of protein datasets through the use of dimensionality reduction techniques, which extract highly informative sequence attributes.
Secondly, the proposed framework, ProtInteract, utilises this information to identify the interaction characteristics of a protein based on its amino acid configuration. ProtInteract encodes the protein's primary structure into a lower-dimensional vector space, thereby reducing the computational complexity of PPI prediction. Our results provide evidence of the proposed framework's accuracy and efficiency in predicting protein-protein interactions.
Keywords: Long-short Term Memory; Recurrent Neural Networks; Protein-Protein Interaction; Temporal Convolutional Network; Convolutional Neural Network; Autoencoder; Reinforcement learning; actor-critic; portfolio management; stock market prediction; coverage control; multi-agent system; SARSA; Q-learning; Graph convolutional neural network; GCN; state-action-reward-state-action
255	Barnkulturens implicita förväntningar : En receptionsstudie av Suzanne Ostens verk Flickan, mamman och demonerna / The implicit expecation of children's culture : A reception study of Flickan, mamman och Demonerna by Suzanne Osten Gotfredsen, Maria January 2018 This thesis is a reception study of Suzanne Osten's book, theatre play and children's film Flickan, mamman och demonerna. The study examines how professional criticism, published in the daily press and in journals, handles the genre conventions of criticism. The working theory is that if the critic departs from those genre conventions, their own or society's doxa about what we expect from children's art becomes more visible, which raises the inescapable follow-up question: does this expectation mirror how we perceive children and their capabilities? The theory builds on the essay Slitas itu by the professor and literary scholar Anders Johansson, published in Critica Obscura: Litteraturkritiska essäer, in which Johansson claims that society's doxa becomes visible when the critic does not follow the genre conventions of criticism, and that this leads to an erosion of the purpose of criticism. Johansson's theory is accompanied by, among others, Jacqueline Rose's The Case of Peter Pan, or The Impossibility of Children's Fiction and Perry Nodelman's article “The Other: Orientalism, Colonialism, and Children's Literature”, published in Children's Literature Association Quarterly 17, no, chosen because the thesis as a whole practices a postcolonial reading of the selected criticism. The investigation finds that where the critic departs from the genre conventions of criticism and brings their own opinions into the text without proper justification (e.g. where affective failures have occurred), a clear pattern emerges of our contemporary doxa about how we perceive children and how the art made for them should behave or be.
The pattern found is that children's art is expected to have didactic value, to mirror the child's “reality”, to have an indisputable and understandable point, and not to evoke too many negative feelings. The thesis weaves together, in an essayistic style, hard-to-grasp memories of the author's childhood and the academic form.
Keywords: postcolonial theory; Suzanne Osten; critique; critic; reception study; Anders Johansson; Perry Nodelman; Jacqueline Rose; The Verbal Icon; Wimsatt; literature; essay; sociology of literature; literary criticism; General Literature Studies
