Global ETD Search

1	Distributed space-time block coding in cooperative relay networks with application in cognitive radio
Alotaibi, Faisal T. January 2012
Spatial diversity is an effective technique to combat the effects of severe fading in wireless environments. Recently, cooperative communications has emerged as an attractive paradigm that introduces a new form of spatial diversity, known as cooperative diversity, which can enhance system reliability without sacrificing scarce bandwidth or consuming more transmit power. It enables single-antenna terminals in a wireless relay network to share their antennas and form a virtual antenna array on the basis of their distributed locations. As such, the same diversity gains as in multi-input multi-output systems can be achieved without requiring multiple-antenna terminals. In this thesis, a new approach to cooperative communications via distributed extended orthogonal space-time block coding (D-EO-STBC) based on limited partial feedback is proposed for cooperative relay networks with three and four relay nodes and then generalized to an arbitrary number of relay nodes. This scheme can achieve full cooperative diversity and full transmission rate in addition to array gain, and it has properties that make it attractive for practical systems, such as orthogonality, flexibility, low computational complexity and decoding delay, and high robustness to node failure. Versions of the closed-loop D-EO-STBC scheme based on cooperative orthogonal frequency division multiplexing type transmission are also proposed for both flat and frequency-selective fading channels, which can overcome imperfect synchronization in the network. As such, the proposed technique can effectively cope with the effects of fading and timing errors. Moreover, to increase the end-to-end data rate, the scheme is extended to two-way relay networks through a three-time-slot framework.
In addition, to substantially reduce the feedback channel overhead, limited feedback approaches based on parameter quantization are proposed. In particular, an optimal one-bit partial feedback approach is proposed for the generalized D-O-STBC scheme to maximize the array gain. To further enhance the end-to-end bit error rate performance of the cooperative relay system, a relay selection scheme based on D-EO-STBC is then proposed. Finally, to highlight the utility of the proposed D-EO-STBC scheme, an application to cognitive radio is studied.
621.382
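As a rough illustration of why a single feedback bit can yield array gain, the sketch below considers just one pair of channels whose signals add at the receiver. It is not the thesis's D-EO-STBC scheme: the two-channel model, the function names `combined_gain` and `one_bit_feedback`, and the example channel values are all assumptions made for illustration. The idea is that |h1 + b·h2|² = |h1|² + |h2|² + 2b·Re(h1·conj(h2)), so feeding back the sign b of the cross term keeps it non-negative.

```python
def combined_gain(h1: complex, h2: complex, b: int = 1) -> float:
    """Power gain |h1 + b*h2|^2 when two channels add at the receiver.

    b is a hypothetical sign rotation (+1 or -1) applied to the second channel.
    """
    return abs(h1 + b * h2) ** 2


def one_bit_feedback(h1: complex, h2: complex) -> int:
    """Pick b in {+1, -1} so that the cross term 2*b*Re(h1*conj(h2)) is >= 0."""
    return 1 if (h1 * h2.conjugate()).real >= 0 else -1


# Illustrative channel values (assumed, not from the thesis): nearly opposite phases.
h1, h2 = 1 + 0j, -0.8 + 0.1j

b = one_bit_feedback(h1, h2)             # receiver computes one bit
open_loop = combined_gain(h1, h2)        # no feedback: channels nearly cancel
closed_loop = combined_gain(h1, h2, b)   # with feedback: channels add constructively
```

With the sign chosen this way, the combined gain is never below |h1|² + |h2|², which is the sense in which the feedback bit contributes array gain on top of diversity in this simplified model.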
2	Multi-armed bandits with unconventional feedback / Bandits multi-armés avec rétroaction partielle
Gajane, Pratik 14 November 2017
In this thesis, we study sequential decision-making problems in which, for each of its decisions, the learner receives information that it uses to guide its future decisions. To go beyond the conventional feedback that has been well studied for sequential decision-making problems such as multi-armed bandits, we consider forms of partial feedback motivated by practical applications. First, we consider the dueling bandits problem, in which the learner selects two actions at each time step and receives in return relative (i.e. preference) information between the instantaneous values of these two actions. In particular, we propose an optimal algorithm that allows the learner to achieve near-optimal cumulative regret (regret is the difference between the optimal cumulative reward and the learner's observed cumulative reward). Second, we consider the corrupt bandits problem, in which a stochastic corruption process perturbs the feedback. For this problem too, we design algorithms that achieve asymptotically optimal cumulative regret. In addition, we examine the relationship between these two problems within the framework of partial monitoring, a generic paradigm for sequential decision making with partial feedback. / The multi-armed bandit (MAB) problem is a mathematical formulation of the exploration-exploitation trade-off inherent to reinforcement learning, in which the learner chooses an action (symbolized by an arm) from a set of available actions in a sequence of trials in order to maximize their reward. In the classical MAB problem, the learner receives absolute bandit feedback, i.e. it receives as feedback the reward of the arm it selects. In many practical situations, however, different kinds of feedback are more readily available. In this thesis, we study two such kinds of feedback, namely relative feedback and corrupt feedback. The main practical motivation behind relative feedback arises from the task of online ranker evaluation. This task involves choosing the optimal ranker from a finite set of rankers using only pairwise comparisons, while minimizing the comparisons between sub-optimal rankers. This is formalized by the MAB problem with relative feedback, in which the learner selects two arms instead of one and receives preference feedback. We consider the adversarial formulation of this problem, which circumvents the stationarity assumption over the mean rewards for the arms. We provide a lower bound on the performance measure for any algorithm for this problem. We also provide an algorithm called "Relative Exponential-weight algorithm for Exploration and Exploitation" with performance guarantees. We present a thorough empirical study on several information retrieval datasets that confirms the validity of these theoretical results. The motivating theme behind corrupt feedback is that the feedback the learner receives is a corrupted form of the corresponding reward of the selected arm. In practice, such feedback is available in tasks such as online advertising and recommender systems. We consider two goals for the MAB problem with corrupt feedback: best arm identification and exploration-exploitation. For both goals, we provide lower bounds on the performance measures for any algorithm. We also provide various algorithms for these settings. The main contributions of this module are the algorithms "KLUCB-CF" and "Thompson Sampling-CF", which asymptotically attain the best possible performance. We present experimental results to demonstrate the performance of these algorithms.
We also show how this problem setting can be used for the practical application of enforcing differential privacy.
Multi-Armed Bandit, Partial Feedback, Dueling Bandits, Corrupt Bandits, Ranker Evaluation, Differential Privacy
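The notions of regret and corrupted feedback used in this abstract can be made concrete with a small sketch. It is illustrative only and does not reproduce the thesis's algorithms: the function names, the use of known arm means to compute expected (pseudo-)regret, and the two-parameter Bernoulli corruption model (`p00`, `p11` are the assumed probabilities that a reward of 0 or 1 is reported unchanged) are all assumptions.

```python
from typing import Sequence


def cumulative_regret(means: Sequence[float], chosen: Sequence[int]) -> float:
    """Expected regret: sum over rounds of (best arm mean - chosen arm mean).

    `means` are the (assumed known) mean rewards of the arms and `chosen`
    is the sequence of arm indices the learner pulled.
    """
    best = max(means)
    return sum(best - means[a] for a in chosen)


def corrupted_mean(mu: float, p00: float, p11: float) -> float:
    """Mean of the observed feedback when a Bernoulli(mu) reward is corrupted.

    A reward of 1 is reported as 1 with probability p11, and a reward of 0
    is flipped to 1 with probability 1 - p00 (hypothetical corruption model).
    """
    return mu * p11 + (1.0 - mu) * (1.0 - p00)


# Three arms; the learner pulls the best arm twice, then two sub-optimal arms.
regret = cumulative_regret([0.2, 0.5, 0.9], [2, 2, 0, 1])

# A symmetric corruption with p00 = p11 = 0.8 pulls the observed mean toward 0.5.
observed = corrupted_mean(0.9, 0.8, 0.8)
```

Under this assumed model the observed mean is an increasing function of the true mean whenever p00 + p11 > 1, so the ordering of the arms is preserved even though individual rewards are hidden, which hints at why such a setting is relevant to privacy-preserving feedback.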
