Global ETD Search

291	床墊產業之消費者網路口碑研究 / A Study on Consumer’s e-Word-of-Mouth in Taiwanese Mattress Industry 林天瀚 Unknown Date (has links) With economic growth, a highly competitive living environment has left many people with poor sleep quality or insomnia; most of them suffer from difficulty falling asleep, light sleep, or early waking. People are therefore increasingly willing to invest in bedding and mattresses for the sake of a good night's sleep. Because mattresses are durable goods with a long product life cycle and high perceived risk, most consumers gather information before purchasing. With the spread of the Internet and the rise of Web 2.0, electronic word-of-mouth (eWOM) has grown explosively, and consumers tend to learn about products in advance by reading online comments, creating an eWOM effect that firms cannot ignore. By analyzing eWOM, firms can understand consumer and market demand and adjust their strategies accordingly. This study collected eWOM from the Internet and applied content analysis to understand the consumer purchase decision process and consumers' evaluations of domestic and foreign high-priced mattress brands. The results show that almost 40% of consumers buy a new mattress to improve their sleep quality. They mainly collect information by visiting physical channels and searching the Internet. When evaluating mattresses, they weigh quality, price, and firmness, and they care about the design and material of the mattress and its springs. Almost 55% of consumers prefer a firm mattress. Almost 99% of consumers buy through physical channels (brand mattress stores and general mattress dealers); the remaining roughly 1% buy on e-commerce platforms because of the 7-day return period. Almost 75% of consumers report being satisfied after purchase; the rest remain dissatisfied with the design and material of the mattress and springs, or with continuing discomfort such as back pain. Consumers gave more positive than negative comments on the selected mattress brands. Sealy had a strong reputation for brand image and quality; King Koil, The-Tai Mattress, Kingbed, Tempur and 10 Days were also well regarded for quality, but received some negative comments on brand image and price. Based on the eWOM analysis of each selected brand, this study offers suggestions for each brand's future marketing strategy. Mattress Industry e-Word-of-Mouth Consumer Behavior Consumer Purchase Decision Process Content Analysis
292	Generalized Sampling-Based Feedback Motion Planners Kumar, Sandip 2011 December 1900 (has links) The motion planning problem can be formulated as a Markov decision process (MDP) if the uncertainties in the robot motion and environment can be modeled probabilistically. The complexity of solving these MDPs grows exponentially with the dimension of the problem, so it is nearly impossible to solve the problem even without constraints. Using hierarchical methods, these MDPs can be transformed into a semi-Markov decision process (SMDP), which only needs to be solved at certain landmark states. In the deterministic robotic motion planning community, sampling-based algorithms such as probabilistic roadmaps (PRMs) and rapidly exploring random trees (RRTs) have been successful in solving very high-dimensional deterministic problems. However, they are not robust to uncertainties in the system dynamics; hence, one of the primary objectives of this work is to generalize PRM/RRT to solve motion planning under uncertainty. We first present generalizations of the randomized sampling-based algorithms PRM and RRT that incorporate process uncertainty and obstacle location uncertainty, termed "generalized PRM" (GPRM) and "generalized RRT" (GRRT). The controllers used at the lower level of these planners are feedback controllers, which ensure convergence of trajectories while mitigating the effects of process uncertainty. The results indicate that the algorithms solve the motion planning problem for a single agent in continuous state/control spaces in the presence of process uncertainty and of constraints such as obstacles and other state/input constraints. Second, a novel adaptive sampling technique, termed "adaptive GPRM" (AGPRM), is proposed for these generalized planners to increase their efficiency and overall success probability. It was implemented on high-dimensional n-link robot manipulators with up to 8 links, i.e., in a 16-dimensional state space. The results demonstrate the ability of the proposed algorithm to handle the motion planning problem for highly nonlinear systems in very high-dimensional state spaces. Finally, a solution methodology, termed "multi-agent AGPRM" (MAGPRM), is proposed to solve the multi-agent motion planning problem under uncertainty. The technique uses an existing solution technique for the multiple traveling salesman problem (MTSP) in conjunction with GPRM. For real-time implementation, an "inter-agent collision detection and avoidance" module was designed, which ensures that no two agents collide at any time step. The algorithm was tested on teams of homogeneous and heterogeneous agents in cluttered obstacle spaces, and it demonstrates the ability to handle such problems in continuous state/control spaces in the presence of process uncertainty. Motion planning Probabilistic RoadMaps (PRM) Semi-Markov Decision Process (SMDP) uncertainty, adaptive sampling multi-agent motion planning problem non-holonomic heterogeneous agents motion uncertainty process uncertainty collision detection collision avoidance
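As a point of reference for the abstract above, the following is a minimal sketch of the classical deterministic PRM that the thesis sets out to generalize; it is not the GPRM/GRRT planner itself, which adds feedback controllers and uncertainty handling. The unit-square workspace, circular obstacles, connection radius, and all function names are illustrative assumptions (`math.dist` requires Python 3.8 or later).

```python
import heapq
import math
import random


def collision_free(p, q, obstacles, steps=20):
    """Check the straight segment p-q against circular obstacles by sampling points on it."""
    for i in range(steps + 1):
        t = i / steps
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        for cx, cy, r in obstacles:
            if math.hypot(x - cx, y - cy) <= r:
                return False
    return True


def build_roadmap(start, goal, obstacles, n_samples=200, radius=0.25, seed=0):
    """Sample obstacle-free points in the unit square and connect nearby pairs.

    Node 0 is the start and node 1 is the goal (an assumed convention).
    """
    rng = random.Random(seed)
    nodes = [start, goal]
    while len(nodes) < n_samples + 2:
        p = (rng.random(), rng.random())
        if all(math.hypot(p[0] - cx, p[1] - cy) > r for cx, cy, r in obstacles):
            nodes.append(p)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and collision_free(nodes[i], nodes[j], obstacles):
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges


def shortest_path(edges, src=0, dst=1):
    """Dijkstra over the roadmap; returns (cost, node indices) or (inf, []) if unreachable."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            path = [u]
            while path[-1] != src:
                path.append(prev[path[-1]])
            return d, path[::-1]
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return math.inf, []
```

A call such as `build_roadmap((0.1, 0.1), (0.9, 0.9), [(0.5, 0.5, 0.2)])` followed by `shortest_path(edges)` returns a path only if the random samples happen to connect start and goal, which is the probabilistic nature of the method.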
293	Metodologia de disseny conceptual d'estacions depuradores d'aigües residuals que combina el procés de decisió jeràrquic amb l'anàlisi de decisions multicriteri Vidal Roberto, Núria 12 March 2004 (has links) This thesis proposes a methodology for the conceptual design of wastewater treatment plants (WWTPs) that combines a hierarchical decision process with multicriteria decision analysis. We begin with a brief introduction to the main fields involved in this work: the design of chemical processes in general, the design of wastewater treatment plants in particular, and multicriteria decision analysis as applied to environmental management. We then set out the objectives of the study and describe the methodology and the computer-based support tools used. To validate and test the design methodology, we present a case study in which a conceptual design is developed for a WWTP with the same requirements as the WWTP currently in operation in the town of Granollers. First, we present the initial information and then define the design objectives, together with the set of criteria used to evaluate the degree to which the objectives are met. The design objectives are of four types: environmental, technical, social, and economic; the set of criteria, 33 in all, is classified into the same four categories. Each criterion has a specific weight reflecting its relative importance in decision making. Next, we go through the whole decision process leading to the complete design of the WWTP. The decision process is divided into two parts that are distinct but interconnected: the water line and the sludge line. It involves a total of eighteen questions (twelve for the water line and six for the sludge line) with a maximum of four alternatives per question. To answer each question, the proposed alternatives are evaluated against a set of criteria chosen from the initial list. Applying the multicriteria decision method known as SMART (simple multiattribute rating technique), the results for the alternatives with respect to each criterion are combined, taking into account the importance of each criterion, to obtain a single value for each alternative. To quantify the criteria relating to the operation of the process and those relating to economic factors, we used the GPS-X and CapdetWorks programs, respectively. The criteria not quantified by means of these programs were resolved qualitatively, using design manuals and taking expert opinion into account. The alternative that obtains the highest score is the one recommended by the decision process. The case study is completed once a complete design of the WWTP is obtained. To integrate all of these elements and support the development of the decision process, we used the program DRAMA (Design Rationale Management). We then give a comparative analysis of the real WWTP in Granollers and the WWTP resulting from the case study. We describe and compare the flow diagrams of both plants, justify each of the decisions taken in the case study, and discuss the results, reflecting on the advantages of the proposed conceptual design methodology. Finally, we present the conclusions of the thesis. The main results of this thesis were published in 2002 in the international journal Industrial and Engineering Chemistry Research (N. Vidal, R. Bañares-Alcántara, I. Rodríguez-Roda and M. Poch: "Design of wastewater treatment plants using a conceptual design methodology", Industrial and Engineering Chemistry Research, 41 (20), pages 4993-5005). Continuing work in this line of research at the Laboratori d'Enginyeria Química i Ambiental at the University of Girona has led to the research work by Xavi Flores, "Procés de decisió jeràrquic combinat amb anàlisi multicriteri per al suport al disseny conceptual de sistemes de fangs actius d'una estació depuradora d'aigües residuals" (A hierarchical decision process combined with multicriteria analysis to support the conceptual design of activated sludge systems in a wastewater treatment plant), and to the presentation of preliminary results at the 9th IWA Conference on Design, Operation and Economics of Large Wastewater Treatment, which took place in September 2003 in Prague ("Combining hierarchical decision process with multi-criteria analysis for conceptual design of WWTP", X. Flores, N. Vidal, A. Bonmatí, J. B. Copp and I. Rodríguez-Roda). WWTP Treatment plants Conceptual design Multicriteria decision analysis Hierarchical decision process Wastewater 504 628
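The SMART step described in the abstract above (combining per-criterion results with criterion weights into a single value per alternative) can be illustrated with a short weighted-sum sketch. The four criterion categories come from the abstract; the weights, ratings, alternative names, and function names below are hypothetical and are not taken from the thesis, and ratings are assumed to be pre-normalized to the range 0 to 1.

```python
def smart_scores(weights, ratings):
    """Return a weighted-sum score per alternative.

    weights: {criterion: raw importance weight}
    ratings: {alternative: {criterion: rating in [0, 1]}}
    Weights are normalized to sum to 1 before being applied.
    """
    total = sum(weights.values())
    norm = {c: w / total for c, w in weights.items()}
    return {
        alt: sum(norm[c] * alt_ratings[c] for c in norm)
        for alt, alt_ratings in ratings.items()
    }


def recommend(weights, ratings):
    """Return the highest-scoring alternative together with all scores."""
    scores = smart_scores(weights, ratings)
    return max(scores, key=scores.get), scores


# Hypothetical data for one design question with two alternatives.
example_weights = {"environmental": 4, "technical": 3, "social": 1, "economic": 2}
example_ratings = {
    "alternative_A": {"environmental": 0.8, "technical": 0.6, "social": 0.5, "economic": 0.4},
    "alternative_B": {"environmental": 0.5, "technical": 0.9, "social": 0.5, "economic": 0.7},
}
```

With these made-up numbers, `recommend(example_weights, example_ratings)` selects `alternative_B` (score 0.66 against 0.63), showing how a lower environmental rating can be outweighed by technical and economic ratings once the weights are applied.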
294	Aspects of the interplay of cognition and emotion and the use of verbal vs. numerical information decision making Trujillo Valencia, Carlos Andrés 27 June 2007 (has links) This dissertation investigates two aspects of decision making. First, I study the way in which people understand and categorize numerical attributes of products. I develop and experimentally test a model of the mental process people use to transform a quantitative attribute into a verbal category. Under certain situational conditions, the model is able to predict people's verbal conceptualizations. Second, I explore the interconnections between cognitive and emotional information during decision making. I propose and experimentally test four models of the way cognitive and affective information is combined during the decision process to determine the value of an alternative. The models display high predictive power. Their performance depends on (1) the interaction of verbal and numerical information with the situational cognitive capacity of the individual and (2) the correlation between cognitive judgments and affective reactions. meta cognitive emotions self regulation decision process procedural preferences emotions and cognition fuzzy measures verbal and numerical information categorization consumer 159.9 334
295	The process and organisational consequences of new artefact adoption in surgery Johnstone, Patricia Lynne January 2001 (has links) Thesis (PhD)--Macquarie University, Macquarie Graduate School of Management, 2001. / Bibliography: leaves 288-310. / Introduction -- Introduction to research problem and methodology -- Study context -- Theoretical framework -- Review of the literature -- Study design and methods -- Study sites, surgical procedures, and labour input to surgical production -- New intra-operative artefacts: goals, choices and consequences -- Conclusion. / Surgical technologies since the late 1980s have undergone substantial innovations that have involved ...the adoption of new machines, instruments, and related surgical materials... referred to throughout this thesis as intra-operative artefacts... typically represents a commitment of substantial financial resources by the hospitals concerned. However, little is documented about the process by which decisions to adopt new intra-operative artefacts are made, and no previous research appears to have explored the work-related consequences of new intra-operative artefact adoption within operating theatre services. This thesis explores why new intra-operative artefacts are adopted, how the decisions are made, who participates in the decision process, and what the expected and actual organisational consequences of new intra-operative artefact adoption are. / Electronic reproduction. / xii, 347 leaves, bound : / Mode of access: World Wide Web. / Also available in print form Surgical technology -- Australia Hospitals -- Administration -- Australia Technological change Decision process Surgery Operating theatre work Organisational consequences
296	Distribution de Processus Décisionnels Markoviens pour une gestion prédictive d’une ressource partagée : application aux voies navigables des Hauts-de-France dans le contexte incertain du changement climatique / Distributing Markov Decision Processes for a predictive management of a shared resource : application to the Hauts-de-France waterways in the uncertain context of climate change Desquesnes, Guillaume, Louis, Florent 23 October 2018 (has links) This thesis aims to introduce and implement predictive management, under uncertainty, of the water resource in inland waterway networks. The objective is to provide a water management plan that optimizes the navigation conditions of the entire supervised network over a specified horizon. The expected solution must make the network resilient to the probable effects of climate change and to changes in waterway traffic. First, a generic model of a resource distributed over a network is proposed. This model, based on Markov decision processes, takes into account the numerous uncertainties affecting the networks considered. Its purpose is to cover all possible cases, foreseen or not, so that these networks can be managed resiliently. The second contribution is a distribution of the model over several agents so that it can scale. The network's control capacities are divided among the agents, so each agent has only local knowledge of the supervised network. As a result, the agents need to coordinate to manage the network efficiently. An iterative resolution, in which each agent exchanges temporary plans, is used to obtain a local management policy for each agent. Finally, experiments were carried out on realistic and real networks of the French waterways to assess the quality of the solutions produced. Several different climate scenarios were simulated to test the resilience of the resulting policies. Artificial Intelligence Multiagent system Markov Decision Process Distributed planning Coordination and cooperation Inland waterways networks Climate change 004
297	O processo de decisão de compra de varejista de papelaria: um estudo de caso sobre a sua decisão Chen, Hamilton 10 October 2007 (has links) Fierce market competition has forced retailers to evaluate products very carefully before acquiring them. In stationery stores, the same applies to the purchase of writing instruments. This dissertation analyzes the purchasing process for mechanical pencils. There is currently a vast range of mechanical pencil models with different colors, prints, scents, prices, and suppliers. Choosing models that will turn over quickly at the point of sale, minimizing costs and maximizing gains, becomes the great challenge for an organization that intends to remain competitive. To increase knowledge about this process, this dissertation set out to investigate the variables that influence the decision of the retail buyer, the stationery store owner. Two case studies were carried out with stationery stores. The study found that the mechanical pencil buyer's decision is not restricted to the product: several other variables can influence it, such as the sales representative, effective distribution, the warranty, and marketing (communication). Marketing management Retail Purchase Mechanical pencils Stationery stores Buyers decision process Business administration Purchasing - Decision process Writing - Materials and instruments Stationery stores - Administration
298	Nákupní chování a postoje zákazníků maloobchodní jednotky / Shopping behaviour and customers attitude related to selected retail outlet VÍTKOVÁ, Nikola January 2013 (has links) The theoretical section attempts to explain shoppers' behaviour by defining the terms of consumption and buying. The marketing research focused on how customers plan their purchases and on the factors that lead to unplanned purchases, and it attempted to look at different groups of customers.
299	Förnuft och Känsla : En studie kring investmentbolags och riskkapitalbolags beslutsfattande vid förvärv av portföljbolag. / Sense and Sensibility : A study about decision-making in private equity firms focusing on long-term and short-term acquisitions of portfolio companies. Breidmer, Julia, Carlsson, Lisa January 2018 (has links) The objective of this paper is to contribute knowledge about the decision-making process and the decision criteria used in private equity firms, focusing on long-term and short-term acquisitions of portfolio companies. The objective is met by mapping the decision-making process and identifying its criteria in private equity firms carrying out long-term versus short-term investments. Further, the relevant similarities and differences that can be distinguished between the two are identified and explained. Decision-making Decision process Decision criteria Acquisition Investment companies Venture capital companies Long term Short term Business Administration
300	Optimization Algorithms for Deterministic, Stochastic and Reinforcement Learning Settings Joseph, Ajin George January 2017 (has links) Optimization is a very important field with diverse applications in the physical, social and biological sciences and in various areas of engineering. It appears widely in machine learning, information retrieval, regression, estimation, operations research and a wide variety of computing domains. The subject is being studied deeply, both theoretically and experimentally, and several algorithms are available in the literature. These algorithms, which can be executed (sequentially or concurrently) on a computing machine, explore the space of input parameters to seek high-quality solutions to the optimization problem, with the search mostly guided by certain structural properties of the objective function. In certain situations, the setting might additionally demand the "absolute optimum" or solutions close to it, which makes the task even more challenging. In this thesis, we propose an optimization algorithm which is "gradient-free", i.e., it does not employ any knowledge of the gradient or higher-order derivatives of the objective function, but instead uses the objective function values themselves to steer the search. The proposed algorithm is particularly effective in a black-box setting, where a closed-form expression of the objective function is unavailable and the gradient or higher-order derivatives are hard to compute or estimate. Our algorithm is inspired by the well-known cross entropy (CE) method. The CE method is a model-based search method for solving continuous/discrete multi-extremal optimization problems in which the objective function has minimal structure. The proposed method searches the statistical manifold of the parameters that identify the probability distribution/model defined over the input space for the degenerate distribution concentrated on the global optima (assumed to be finite in number). 
In the early part of the thesis, we propose a novel stochastic approximation version of the CE method for the unconstrained optimization problem, where the objective function is real-valued and deterministic. The basis of the algorithm is a stochastic process of model parameters that depends probabilistically on the past history: we reuse all the samples obtained so far through discounted averaging. This approach reduces the overall computational and storage cost. Our algorithm is incremental in nature and possesses attractive features such as stability, computational and storage efficiency, and better accuracy. We further investigate, both theoretically and empirically, the asymptotic behaviour of the algorithm and find that it converges to the global optimum for a particular class of objective functions. Further, we extend the algorithm to solve the simulation/stochastic optimization problem. In stochastic optimization, the objective function is itself stochastic, and the underlying probability distribution is in most cases hard to comprehend and quantify. This gives rise to a more challenging optimization problem, where the difficulty is primarily that the objective function values for various input parameters cannot be computed with absolute certainty. In this case, one can only hope to obtain noise-corrupted objective function values for various input parameters. Settings of this kind arise where the objective function is evaluated using a continuously evolving dynamical system or through a simulation. We propose a multi-timescale stochastic approximation algorithm, in which we integrate an additional timescale to accommodate the noisy measurements and asymptotically eliminate the effects of the noise. 
We found that if the objective function and the noise involved in the measurements are well behaved and the timescales are compatible, then our algorithm can generate high quality solutions. In the later part of the thesis, we propose algorithms for reinforcement learning/Markov decision processes using the optimization techniques we developed in the early stage. MDP can be considered as a generalized framework for modelling planning under uncertainty. We provide a novel algorithm for the problem of prediction in reinforcement learning, i.e., estimating the value function of a given stationary policy of a model free MDP (with large state and action spaces) using the linear function approximation architecture. Here, the value function is defined as the long-run average of the discounted transition costs. The resource requirement of the proposed method in terms of computational and storage cost scales quadratically in the size of the feature set. The algorithm is an adaptation of the multi-timescale variant of the CE method proposed in the earlier part of the thesis for simulation optimization. We also provide both theoretical and empirical evidence to corroborate the credibility and effectiveness of the approach. In the final part of the thesis, we consider a modified version of the control problem in a model free MDP with large state and action spaces. The control problem most commonly addressed in the literature is to find an optimal policy which maximizes the value function, i.e., the long-run average of the discounted transition payoffs. The contemporary methods also presume access to a generative model/simulator of the MDP with the hidden premise that observations of the system behaviour in the form of sample trajectories can be obtained with ease from the model. In this thesis, we consider a modified version, where the cost function to be optimized is a real-valued performance function (possibly non-convex) of the value function. 
Additionally, one has to seek the optimal policy without presuming access to the generative model. In this thesis, we propose a stochastic approximation algorithm for this particular control problem. The only information we presuppose to be available to the algorithm is a sample trajectory generated using an a priori chosen behaviour policy. The algorithm is data (sample trajectory) efficient, stable, and robust, as well as computationally and storage efficient. We provide a proof of convergence of our algorithm to a high-performing policy relative to the behaviour policy. Optimization Algorithms Reinforcement Learning Machine Learning Markov Decision Process Stochastic Approximation Algorithm Stochastic Optimization Cross Entropy Method Stochastic Global Optimization Cross Entropy Optimization Method Quantile Estimation Continuous Optimization Computer Science
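For readers unfamiliar with the cross entropy (CE) method that the abstract above builds on, the following is a minimal sketch of the plain, batch CE method for one-dimensional, gradient-free minimization. It is not the stochastic approximation or multi-timescale variant proposed in the thesis; the Gaussian sampling model, the elite fraction, and every parameter value are assumptions made for illustration.

```python
import math
import random


def cross_entropy_minimize(f, mean=0.0, std=5.0, n_samples=100,
                           elite_frac=0.2, iterations=50, seed=0):
    """Plain cross entropy method with a 1-D Gaussian sampling model.

    Each iteration draws samples from N(mean, std^2), keeps the fraction
    with the lowest objective values (the "elite" set), and refits the
    mean and standard deviation to that set. Only function values are
    used; no gradients are required.
    """
    rng = random.Random(seed)
    n_elite = max(2, int(n_samples * elite_frac))
    for _ in range(iterations):
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        elite = sorted(samples, key=f)[:n_elite]
        mean = sum(elite) / n_elite
        var = sum((x - mean) ** 2 for x in elite) / n_elite
        std = math.sqrt(var) + 1e-12  # small floor keeps the model well defined
    return mean
```

For a simple objective such as `lambda x: (x - 3.0) ** 2`, the sampling distribution contracts around the minimizer, approaching the degenerate distribution concentrated on the optimum that the abstract refers to. On multi-extremal objectives this plain version can contract prematurely onto a local optimum.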
