Global ETD Search

61	Assessing and improving recommender systems to deal with user cold-start problem Paixão, Crícia Zilda Felício 06 March 2017

Recommender systems are part of our everyday life. Recommendation methods aim mainly to predict preferences for new items based on the user's past preferences. Research on this topic seeks, among other things, to address the user cold-start problem: the challenge of recommending items to users with few or no preference records. One way to address cold-start issues is to infer the missing data from side information. Side information of different types has been explored in the literature. Some studies use social information combined with users' preferences; others use click behavior on websites, location-based information, visual perception, contextual information, etc. The typical approach is to use side information to build one prediction model for each cold user. Due to the inherent complexity of this prediction process, the performance of most recommender systems drops considerably, particularly for full cold-start users (users with no preferences known to the system). We propose instead that cold users are best served by models already built in the system. In this thesis we propose four approaches to deal with the user cold-start problem using existing models available in recommender systems. We cover the following aspects:

o Embedding social information into traditional recommender systems: we investigate the role of several social metrics in pairwise preference recommendation and provide the first steps towards a general framework for incorporating social information into traditional approaches.
o Improving recommendation with visual perception similarities: we extract networks connecting users with similar visual perception and use them to come up with prediction models that maximize the information gained from cold users.
o Analyzing the benefits of a general framework for incorporating networked information into recommender systems: representing different types of side information as a user network, we investigate how to incorporate networked information into recommender systems so as to benefit cold-user recommendation.
o Analyzing the impact of prediction model selection for cold users: the last proposal considers that, without side information, the system can recommend to cold users by switching among the models already built in the system and learning which one is most suitable.

We evaluated the proposed approaches in terms of prediction quality and ranking quality on real-world datasets from different recommendation domains. The experiments showed that our approaches achieve better results than the state-of-the-art comparison methods.

Thesis (Doctorate)
Computing; Internet users; Visual perception; Social recommender systems; Recommender system; User preferences; Cold-start user problem; Multi-armed bandits
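
The fourth approach in entry 61 (switching among models that already exist in the system for a cold user) can be read as a bandit-style selection problem, which matches the "Multi-armed bandits" keyword. The sketch below is only an illustration of that idea, assuming an epsilon-greedy rule; the class name `ModelSelector`, the reward scale, and the choice of epsilon-greedy are assumptions made here and are not taken from the thesis.

```python
import random


class ModelSelector:
    """Hypothetical epsilon-greedy selector over existing prediction models."""

    def __init__(self, n_models, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_models      # how often each model was used
        self.means = [0.0] * n_models     # running mean reward per model
        self.rng = random.Random(seed)

    def choose(self):
        # Try every model once before comparing their mean rewards.
        for index, count in enumerate(self.counts):
            if count == 0:
                return index
        # Explore a random model with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Otherwise exploit the model with the best mean reward so far.
        return max(range(len(self.means)), key=self.means.__getitem__)

    def update(self, model, reward):
        # Incremental update of the running mean for the chosen model.
        self.counts[model] += 1
        self.means[model] += (reward - self.means[model]) / self.counts[model]


# Usage: feedback from a cold user (assumed to lie in [0, 1]) for three models.
selector = ModelSelector(n_models=3, epsilon=0.0)
for feedback in (0.2, 0.9, 0.5):
    chosen = selector.choose()
    selector.update(chosen, feedback)
best_model = selector.choose()
```

With `epsilon=0.0` the selector is purely greedy after the initial round, which makes the behaviour easy to check; a positive epsilon keeps some exploration going as more feedback arrives.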
62	Learning-based Attack and Defense on Recommender Systems Agnideven Palanisamy Sundar (11190282) 06 August 2021

The internet is home to massive volumes of valuable data that are constantly being created, making it difficult for users to find information relevant to them. In recent times, online users have been relying on the recommendations made by websites to narrow down their options. Online reviews have also become an increasingly important factor in a customer's final choice. Unfortunately, attackers have found ways to manipulate both reviews and recommendations to mislead users. A recommendation system is a special type of information filtering system adopted by online vendors to provide suggestions to their customers based on their requirements. Collaborative filtering is one of the most widely used recommendation approaches; unfortunately, it is prone to shilling (profile injection) attacks. Such attacks alter the recommendation process to promote or demote a particular product. In addition, many spammers write deceptive reviews to change the credibility of a product or service. This work aims to address these issues by treating the review manipulation and shilling attack scenarios independently. For shilling attacks, we build an efficient Reinforcement Learning-based shilling attack method. This method reduces the uncertainty associated with the item selection process and finds the best items to enhance attack reach while treating the recommender system as a black box. Such practical online attacks open new avenues for research in building more robust recommender systems. For review manipulation, we introduce a method that uses a deep structure embedding approach, which preserves highly nonlinear structural information and the dynamic aspects of user reviews, to identify and cluster spam users. In experiments with real datasets, our method captures about 92% of all spam reviewers using an unsupervised learning approach.

Computer System Security; Computer Communications Networks; Recommender Systems; Machine Learning; Deep Learning; Reinforcement Learning; Fraud Detection; Fake Reviews; Shilling Attacks; Graph Embedding; Multi-Armed Bandits
63	Machine Learning Algorithms for Influence Maximization on Social Networks Abhishek Kumar Umrawal (16787802) 08 August 2023

With an increasing number of users spending time on social media platforms and engaging with family, friends, and influencers within communities of interest (such as fashion, cooking, gaming, etc.), there are significant opportunities for marketing firms to leverage word-of-mouth advertising on these platforms. In particular, marketing firms can select sets of influencers within relevant communities to sponsor, namely by providing free product samples to those influencers so that they will discuss and promote the product on their social media accounts.

The question of which set of influencers to sponsor is known as influence maximization (IM), formally defined as follows: "if we can try to convince a subset of individuals in a social network to adopt a new product or innovation, and the goal is to trigger a large cascade of further adoptions, which set of individuals should we target?" Under standard diffusion models, this optimization problem is known to be NP-hard. The problem has been widely studied in the literature, and several approaches for solving it have been proposed. Some approaches provide near-optimal solutions but are costly in terms of runtime. Others are faster but are heuristics, i.e., they do not have approximation guarantees.

In this dissertation, we study the influence maximization problem extensively. We provide efficient algorithms for solving the original problem and its important generalizations. Furthermore, we provide theoretical guarantees and experimental evaluations to support the claims made in this dissertation.

We first study the original IM problem, referred to as the discrete influence maximization (DIM) problem, where the marketer can either provide a free sample to an influencer or not, i.e., they cannot give fractional discounts such as 10% off. As already mentioned, the existing solution methods (for instance, the simulation-based greedy algorithm) provide near-optimal solutions that are costly in terms of runtime, and the approaches that are faster do not have approximation guarantees. Motivated by this trade-off between accuracy and runtime, we propose a community-aware divide-and-conquer framework that provides a time-efficient solution to the DIM problem. The proposed framework outperforms the standard methods in terms of runtime and the heuristic methods in terms of influence.

We next study a natural extension of the DIM problem, referred to as the fractional influence maximization (FIM) problem, where the marketer may offer fractional discounts to the influencers (as opposed to either providing a free sample or not, as in the DIM problem). The FIM problem thus gives the marketer more flexibility in allocating the available budget among different influencers. The existing solution methods propose a continuous extension of the simulation-based greedy approximation algorithm for the DIM problem. This continuous extension greedily builds the solution for the given fractional budget by taking small steps through the interior of the feasible region. In contrast, we first characterize the solution to the FIM problem in terms of the solution to the DIM problem. We then use this characterization to propose an efficient greedy approximation algorithm that only iterates through the corners of the feasible region. This leads to large savings in runtime compared to the existing methods that iterate through the interior of the feasible region. Furthermore, we provide an approximation guarantee for the proposed greedy algorithm for the FIM problem.

Finally, we study another extension of the DIM problem, referred to as the online discrete influence maximization (ODIM) problem, where the marketer provides free samples not just once but repeatedly over a given time horizon, and the goal is to maximize the cumulative influence over time while receiving instantaneous feedback. The existing solution methods are based on semi-bandit instantaneous feedback, where knowledge of some intermediate aspects of how the influence propagates in the social network is assumed or observed, for instance, which specific individuals became influenced at the intermediate steps of the propagation. However, for social networks with user privacy, this information is not available. Hence, we consider the ODIM problem with full-bandit feedback, where no knowledge of the underlying social network or diffusion process is assumed. We note that the ODIM problem is an instance of the stochastic combinatorial multi-armed bandit (CMAB) problem with submodular rewards. To solve the ODIM problem, we provide an efficient algorithm that outperforms the existing methods in terms of influence as well as time and space complexity.

Furthermore, we point out the connections of influence maximization with the related problem of disease outbreak prevention and the more general problem of submodular maximization. The methods proposed in this dissertation can also be used to solve those problems.

Industrial engineering; Reinforcement learning; Operations research; Optimisation; Social networks; Viral marketing; Influence maximization; Submodular maximization; Discrete influence maximization; Community detection; Fractional influence maximization; Partial incentives; Online discrete influence maximization; Combinatorial multi-armed bandits
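
Entry 63 refers to the simulation-based greedy algorithm as the standard baseline for discrete influence maximization. A minimal sketch of that baseline is given below, assuming the independent cascade diffusion model with a single activation probability `p` on every edge; the function names, the adjacency-dict graph format, and the parameter defaults are illustrative assumptions, and this is not the community-aware framework proposed in the dissertation.

```python
import random


def simulate_ic(graph, seeds, p, rng):
    """Run one independent-cascade simulation; return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance to activate each neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, p, n_sims, rng):
    """Monte Carlo estimate of the expected number of activated nodes."""
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(n_sims)) / n_sims


def greedy_im(graph, k, p=0.1, n_sims=200, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(k):
        best_node, best_spread = None, -1.0
        for candidate in sorted(nodes - set(chosen)):
            spread = expected_spread(graph, chosen + [candidate], p, n_sims, rng)
            if spread > best_spread:
                best_node, best_spread = candidate, spread
        chosen.append(best_node)
    return chosen


# Usage: a hub (node 0) with five followers, plus a separate edge 6 -> 7.
toy_graph = {0: [1, 2, 3, 4, 5], 6: [7]}
seeds = greedy_im(toy_graph, k=2, p=1.0, n_sims=5)
```

Every greedy step re-estimates the spread of each remaining candidate by simulation, which is where the runtime cost mentioned in the abstract comes from: the number of simulations grows with the number of nodes, the budget `k`, and `n_sims`.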
64	Reference Tracking with Adversarial Adaptive Output-Feedback Model Predictive Control Bui, Linda January 2021

Model Predictive Control (MPC) is an optimization-based control strategy that handles system constraints explicitly, making it a popular feedback control method in real industrial processes. However, designing this control policy is an expensive operation, since an explicit model of the process is required whenever the controller is re-tuned. Another common practical challenge is that not all states are available, which calls for an observer to estimate the states and imposes additional challenges, such as satisfying the constraints and conditions that follow. This thesis attempts to address these challenges by extending the novel Adversarial Adaptive Model Predictive Control (AAMPC) algorithm with output feedback for linear plants without explicit identification. The AAMPC algorithm is an adaptive MPC framework in which results from an adversarial Multi-Armed Bandit (MAB) are applied to a basic model predictive control formulation. The algorithm of the project, Adversarial Adaptive Output-Feedback Model Predictive Control (AAOFMPC), is derived by extending the standard MPC formulation with output feedback, i.e., to an Output-Feedback Model Predictive Control (OFMPC) scheme, where a Kalman filter is implemented as the observer. Furthermore, the control performance of the extended algorithm is demonstrated on the problem of driving the state to a given reference, where the performance is evaluated in terms of regret, state estimation errors, and how well the states track their given reference. Experiments are conducted on two discrete-time Linear Time-Invariant (LTI) systems, a second-order system and a third-order system, that are perturbed with different noise sequences. It is shown that the AAOFMPC performance satisfies the given theoretical bounds and constraints despite larger perturbations. However, it is also shown that the algorithm is not very robust against noise, since offsets from the reference values are observed in the state trajectories. Furthermore, there are several tuning parameters of AAOFMPC that need further investigation for optimal performance.

Model Predictive Control; Adversarial Multi-Armed Bandits; Kalman Filter; Output-Feedback; Adaptive Control; Computer and Information Sciences
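
Entry 64 uses a Kalman filter as the observer that estimates the states from noisy outputs. As a rough illustration of what one observer step involves, the sketch below shows the standard predict and update equations for a scalar linear system x[k+1] = a*x[k] + b*u[k] + w, y[k] = c*x[k] + v. The scalar setting, the function name `kalman_step`, and all parameter names are assumptions made for this example; the thesis works with second- and third-order systems, which would need the matrix form.

```python
def kalman_step(x_est, p_est, u, y, a, b, c, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x_est, p_est: previous state estimate and its error variance
    u, y:         applied input and new measurement
    a, b, c:      system, input, and output coefficients
    q, r:         process and measurement noise variances
    """
    # Predict: propagate the estimate and its variance through the model.
    x_pred = a * x_est + b * u
    p_pred = a * p_est * a + q

    # Update: blend the prediction with the measurement via the Kalman gain.
    gain = p_pred * c / (c * p_pred * c + r)
    x_new = x_pred + gain * (y - c * x_pred)
    p_new = (1.0 - gain * c) * p_pred
    return x_new, p_new


# Usage: estimate a constant state from repeated measurements of the value 2.0.
x_hat, p_hat = 0.0, 1.0
for measurement in (2.0, 2.0, 2.0, 2.0):
    x_hat, p_hat = kalman_step(x_hat, p_hat, u=0.0, y=measurement,
                               a=1.0, b=0.0, c=1.0, q=0.0, r=1.0)
```

In an output-feedback scheme, the estimate returned by such a step would stand in for the unmeasured state when the controller computes the next input.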
65	Frontiers of Large Language Models: Empowering Decision Optimization, Scene Understanding, and Summarization Through Advanced Computational Approaches de Curtò i Díaz, Joaquim 23 January 2024

Thesis by compendium

The advent of Large Language Models (LLMs) marks a transformative phase in the field of Artificial Intelligence (AI), signifying the shift towards intelligent and autonomous systems capable of complex understanding and decision-making. This thesis examines the multifaceted capabilities of LLMs, exploring their potential applications in decision optimization, scene understanding, and advanced video summarization tasks in diverse contexts.

In the first segment of the thesis, the focus is on semantic scene understanding for Unmanned Aerial Vehicles (UAVs). The capability of instantaneously providing high-level data and visual cues positions UAVs as ideal platforms for performing complex tasks. The work combines the potential of LLMs, Visual Language Models (VLMs), and state-of-the-art object detection pipelines to offer nuanced and contextually accurate scene descriptions. A well-controlled, efficient practical implementation using microdrones in challenging settings is presented, and the study is supplemented with proposed standardized readability metrics to gauge the quality of LLM-enhanced descriptions. These advances could significantly impact sectors such as film, advertising, and theme parks, greatly enhancing user experiences.

The second segment addresses the increasingly crucial problem of decision-making under uncertainty. Using the Multi-Armed Bandit (MAB) problem as a foundation, the study explores the use of LLMs to inform and guide strategies in dynamic environments. It is postulated that the predictive power of LLMs can aid in choosing the correct balance between exploration and exploitation based on the current state of the system. Through rigorous testing, the proposed LLM-informed strategy demonstrates its adaptability and its competitive performance against conventional strategies.

Next, the research turns to goodness-of-fit assessments of Generative Adversarial Networks (GANs) using the Signature Transform. By providing an efficient measure of similarity between image distributions, the study sheds light on the intrinsic structure of the samples generated by GANs. A comprehensive analysis using statistical measures, such as the Kruskal-Wallis test, provides a broader understanding of GAN convergence and goodness of fit.

In the final section, the thesis introduces a novel benchmark for automatic video summarization, emphasizing the integration of LLMs and the Signature Transform. An innovative approach grounded in the harmonic components captured by the Signature Transform is put forth. The measures are extensively evaluated and are shown to offer compelling accuracy that correlates well with the human notion of a good summary.

This research work establishes LLMs as powerful tools for addressing complex tasks across diverse domains, redefining decision optimization, scene understanding, and video summarization tasks. It not only breaks new ground in the applications of LLMs but also sets the direction for future work in this exciting and rapidly evolving field.

De Curtò i Díaz, J. (2023). Frontiers of Large Language Models: Empowering Decision Optimization, Scene Understanding, and Summarization Through Advanced Computational Approaches [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/202200

Compendium; Autonomous Systems; Artificial Intelligence; Large Language Models; Visual Language Models; Unmanned Aerial Vehicles; Semantic Scene Understanding; Multi-Armed Bandits; Signature Transform; Generative Adversarial Networks (GANs)
66	Statistical Design of Sequential Decision Making Algorithms Chi-hua Wang (12469251) 27 April 2022

Sequential decision-making is a fundamental class of problems that motivates algorithm design in online machine learning and reinforcement learning. Arguably, the resulting online algorithms have supported modern online service industries through data-driven, real-time, automated decision making. The applications span different industries, including dynamic pricing (marketing), recommendation (advertising), and dosage finding (clinical trials). In this dissertation, we contribute fundamental statistical design advances for sequential decision-making algorithms, advancing the theory and application of online learning and sequential decision making under uncertainty, including online sparse learning, finite-armed bandits, and high-dimensional online decision making. Our work lies at the intersection of decision-making algorithm design, online statistical machine learning, and operations research, contributing new algorithms, theory, and insights to diverse fields including optimization, statistics, and machine learning.

In Part I, we contribute a theoretical framework of continuous risk monitoring for regularized online statistical learning. Such a framework is desirable for modern online service industries that need to monitor the performance of deployed models on online machine learning tasks. In the first project (Chapter 1), we develop continuous risk monitoring for the online Lasso procedure and provide an always-valid algorithm for high-dimensional dynamic pricing problems. In the second project (Chapter 2), we develop continuous risk monitoring for online matrix regression and provide new algorithms for rank-constrained online matrix completion problems. These theoretical advances are due to the interplay between non-asymptotic martingale concentration theory and regularized online statistical machine learning.

In Part II, we contribute a bootstrap-based methodology for finite-armed bandit problems, termed Residual Bootstrap exploration. Such a method opens up the possibility of designing model-agnostic bandit algorithms without problem-adaptive optimism engineering and instance-specific prior tuning. In the first project (Chapter 3), we develop residual bootstrap exploration for multi-armed bandit algorithms and show its easy generalizability to bandit problems with complex or ambiguous reward structure. In the second project (Chapter 4), we develop a theoretical framework for residual bootstrap exploration in linear bandits with a fixed action set. These methodological advances are due to our development of non-asymptotic theory for the bootstrap procedure.

In Part III, we contribute application-driven insights on the exploration-exploitation dilemma for high-dimensional online decision-making problems. Such insights help practitioners implement effective high-dimensional statistical methods to solve online decision-making problems. In the first project (Chapter 5), we develop a bandit sampling scheme for online batch high-dimensional decision making, a practical scenario in interactive marketing and sequential clinical trials. In the second project (Chapter 6), we develop a bandit sampling scheme for federated online high-dimensional decision-making that maintains data decentralization while performing collaborative decisions. These new insights are due to our new bandit sampling design, which addresses application-driven exploration-exploitation trade-offs effectively.

Statistics; Decision Making; Sequential decision making; LASSO regression models; Bandit Algorithms; high-dimensional statistics; bootstrap resampling method; exploration-exploitation trade-off; Dynamic pricing; multi armed bandit; Contextual bandits; regularization method; Online Machine Learning; statistical learning methods; martingale
