1

High performance Cholesky and symmetric indefinite factorizations with applications

Hogg, Jonathan David January 2010 (has links)
The process of factorizing a symmetric matrix using the Cholesky (LL^T) or indefinite (LDL^T) factorization of A allows the efficient solution of systems Ax = b when A is symmetric. This thesis describes the development of new serial and parallel techniques for this problem and demonstrates them in the setting of interior point methods. In serial, the effects of various scalings are reported, and a fast and robust mixed precision sparse solver is developed. In parallel, DAG-driven dense and sparse factorizations are developed for the positive definite case. These achieve performance comparable with other world-leading implementations using a novel algorithm in the same family as those given by Buttari et al. for the dense problem. Performance of these techniques in the context of an interior point method is assessed.
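To make the two factorizations concrete, here is a minimal sketch (not code from the thesis) that solves a symmetric positive definite system via a Cholesky factor A = LL^T and factors a symmetric indefinite matrix as LDL^T with SciPy; the matrices are made-up examples.

```python
import numpy as np
from scipy.linalg import cholesky, ldl, solve_triangular

# Symmetric positive definite example: solve A x = b via A = L L^T.
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([1.0, 2.0, 3.0])

L = cholesky(A, lower=True)                 # A = L @ L.T
y = solve_triangular(L, b, lower=True)      # forward substitution: L y = b
x = solve_triangular(L.T, y, lower=False)   # back substitution:    L^T x = y
print("Cholesky solve residual:", np.linalg.norm(A @ x - b))

# Symmetric indefinite example: B = L D L^T via scipy.linalg.ldl.
B = np.array([[1.0,  2.0, 0.0],
              [2.0, -3.0, 1.0],
              [0.0,  1.0, 2.0]])
Lb, D, perm = ldl(B, lower=True)            # B = Lb @ D @ Lb.T
print("LDL^T factorization error:", np.linalg.norm(Lb @ D @ Lb.T - B))
```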
2

Ρευστομηχανική και grid / Fluid mechanics and the grid

Κωνσταντινίδης, Νικόλαος 30 April 2014 (has links)
The need to solve large problems and the evolution of Internet technology have created a continuing demand for ever more computing resources. This demand led to the creation of structures of cooperating computing systems, with the ultimate aim of solving problems that require large computational power or the storage of large volumes of data. The existence of such structures, and of central processing units with more than one processor, gave rise to protocols for developing applications that run and solve a problem on more than one processor in order to reduce execution time. One example of such a protocol is message passing (MPI). The purpose of this diploma thesis is to modify an existing application that requires significant computing power so that it can exploit systems such as those described above. Through this process, the advantages and disadvantages of parallel programming are analysed.
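As a hedged illustration of the message-passing model (MPI) mentioned above, and not of the application modified in the thesis, the sketch below splits a simple summation across MPI processes with mpi4py and combines the partial results on rank 0; the workload, file name and process count are placeholders.

```python
# Run with e.g.:  mpiexec -n 4 python partial_sums.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

N = 10_000_000                       # total number of terms (placeholder workload)
chunk = N // size
start = rank * chunk
stop = N if rank == size - 1 else start + chunk

# Each process computes its share of the sum in parallel.
local = np.sum(1.0 / (np.arange(start, stop, dtype=np.float64) + 1.0) ** 2)

# Combine the partial results on rank 0.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("sum of 1/k^2 for k = 1..N ≈", total)
```

Splitting the index range across ranks is what shrinks the execution time as more processors are added, which is the effect the thesis sets out to exploit.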
3

Modificações na fatoração controlada de Cholesky para acelerar o precondicionamento de sistemas lineares no contexto de pontos interiores / Modifications on controlled Cholesky factorization to improve the preconditioning in interior point method

Silva, Lino Marcos da, 1978- 09 February 2014 (has links)
Advisor: Aurelio Ribeiro Leite de Oliveira / Doctoral thesis - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Abstract: The interior point method solves large linear programming problems in few iterations. However, each iteration requires the solution of one or more linear systems sharing the same coefficient matrix. This constitutes the most expensive step of the method, greatly increasing the processing time and the need for data storage. Accordingly, reducing the time spent solving the linear systems is a way of improving the method's performance. In general, large linear programming problems have sparse matrices. Since the linear systems to be solved are symmetric positive definite, iterative methods such as the preconditioned conjugate gradient method can be used to solve them, and incomplete Cholesky factors can be used as preconditioners. On the other hand, breakdown may occur during an incomplete factorization. When such a failure occurs, a correction is made by adding a positive number to the diagonal elements of the linear system matrix and the factorization of the new matrix is restarted, which increases the preconditioning time, either because the preconditioner must be rebuilt or because its quality deteriorates.
The controlled Cholesky factorization preconditioner performs well in the early iterations of interior point methods and has been important in implementations of hybrid preconditioning approaches. However, being an incomplete factorization, it is not free from faulty pivots. In this study we propose two modifications to the controlled Cholesky factorization in order to avoid, or reduce the number of, restarts of the factorization of diagonally modified matrices. Computational results show that the proposed techniques can significantly reduce the time for solving linear programming problems by the interior point method / Doctorate / Applied Mathematics / Doutor em Matemática Aplicada
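As a simplified, assumption-laden sketch of the ideas in this abstract (not the controlled Cholesky factorization itself), the code below builds a crude incomplete Cholesky preconditioner, restarts the factorization with a diagonal shift alpha*I whenever a nonpositive pivot appears, and uses the resulting factor inside SciPy's preconditioned conjugate gradient solver; the shift schedule and the toy matrix are illustrative choices.

```python
import numpy as np
from scipy.linalg import solve_triangular
from scipy.sparse.linalg import cg, LinearOperator

def ic0(A):
    """Crude IC(0): a Cholesky sweep restricted to the nonzero pattern of A.
    Raises ValueError on a nonpositive pivot (the 'breakdown' in the abstract)."""
    n = A.shape[0]
    L = np.zeros_like(A)
    pattern = A != 0.0
    for j in range(n):
        d = A[j, j] - L[j, :j] @ L[j, :j]
        if d <= 0.0:
            raise ValueError(f"nonpositive pivot in column {j}")
        L[j, j] = np.sqrt(d)
        for i in range(j + 1, n):
            if pattern[i, j]:
                L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
    return L

def ic0_with_restart(A, alpha0=1e-3, growth=10.0, max_tries=20):
    """On breakdown, add alpha*I to the diagonal and refactor, as described in
    the abstract (alpha0, growth and max_tries are illustrative choices)."""
    alpha = 0.0
    for _ in range(max_tries):
        try:
            return ic0(A + alpha * np.eye(A.shape[0]))
        except ValueError:
            alpha = alpha0 if alpha == 0.0 else alpha * growth
    raise RuntimeError("incomplete factorization failed after repeated shifts")

# Toy SPD system: 1D Laplacian (for this matrix the shift loop rarely triggers).
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

L = ic0_with_restart(A)
M = LinearOperator(A.shape, matvec=lambda r: solve_triangular(
    L.T, solve_triangular(L, r, lower=True), lower=False))
x, info = cg(A, b, M=M)
print("PCG converged:", info == 0, " residual:", np.linalg.norm(A @ x - b))
```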
4

Recursive Blocked Algorithms, Data Structures, and High-Performance Software for Solving Linear Systems and Matrix Equations

Jonsson, Isak January 2003 (has links)
This thesis deals with the development of efficient and reliable algorithms and library software for factorizing matrices and solving matrix equations on high-performance computer systems. The architectures of today's computers consist of multiple processors, each with multiple functional units. The memory systems are hierarchical with several levels, each having different speed and size. The practical peak performance of a system is reached only by considering all of these characteristics. One portable method for achieving good system utilization is to express a linear algebra problem in terms of level 3 BLAS (Basic Linear Algebra Subprogram) operations. The most important operation is GEMM (GEneral Matrix Multiply), which typically defines the practical peak performance of a computer system. There are efficient GEMM implementations available for almost any platform, so an algorithm using this operation is highly portable.

The dissertation focuses on how recursion can be applied to solve linear algebra problems. Recursive linear algebra algorithms have the potential to automatically match the size of subproblems to the different memory hierarchies, leading to much better utilization of the memory system. Furthermore, recursive algorithms expose level 3 BLAS operations and reveal task parallelism. The first paper handles the Cholesky factorization for matrices stored in packed format. Our algorithm uses a recursive packed matrix data layout that enables the use of high-performance matrix-matrix multiplication, in contrast to the standard packed format. The resulting library routine requires half the memory of full storage, yet its performance is better than that of full storage routines. Papers two and three introduce recursive blocked algorithms for solving triangular Sylvester-type matrix equations. For these problems, recursion together with superscalar kernels produces new algorithms that give 10-fold speedups compared to existing routines in the SLICOT and LAPACK libraries. We show that our recursive algorithms also have a significant impact on the execution time of solving unreduced problems and when used in condition estimation. By recursively splitting several problem dimensions simultaneously, parallel algorithms for shared memory systems are obtained. The fourth paper introduces a library, RECSY, consisting of a set of routines implemented in Fortran 90 using the ideas presented in papers two and three. Using performance monitoring tools, the last paper evaluates the possible gain of using different matrix blocking layouts and the impact of superscalar kernels in the RECSY library.
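A minimal sketch of the recursive blocking idea described above (using full storage rather than the packed format developed in the paper): the leading block is factorized recursively, the off-diagonal block is obtained by a triangular solve, and the trailing block is updated with a GEMM/SYRK-style matrix product before recursing; the block size and test matrix are arbitrary.

```python
import numpy as np
from scipy.linalg import solve_triangular

def recursive_cholesky(A, blocksize=64):
    """Return lower-triangular L with A = L @ L.T, recursing on the leading
    block so that most of the work lands in matrix-matrix products.
    A simplified sketch; the thesis works with packed storage."""
    n = A.shape[0]
    if n <= blocksize:                      # small base case: unblocked factorization
        return np.linalg.cholesky(A)
    k = n // 2
    A11, A21, A22 = A[:k, :k], A[k:, :k], A[k:, k:]
    L11 = recursive_cholesky(A11, blocksize)
    # L21 satisfies L21 @ L11.T = A21 (triangular solve with many right-hand sides).
    L21 = solve_triangular(L11, A21.T, lower=True).T
    # The trailing update A22 - L21 @ L21.T is the GEMM/SYRK-rich part.
    L22 = recursive_cholesky(A22 - L21 @ L21.T, blocksize)
    L = np.zeros_like(A)
    L[:k, :k], L[k:, :k], L[k:, k:] = L11, L21, L22
    return L

# Quick check on a random SPD matrix.
rng = np.random.default_rng(0)
B = rng.standard_normal((300, 300))
A = B @ B.T + 300 * np.eye(300)
L = recursive_cholesky(A)
print("max factorization error:", np.max(np.abs(L @ L.T - A)))
```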
5

Memory-aware Algorithms and Scheduling Techniques for Matrix Computations / Algorithmes orientés mémoire et techniques d'ordonnancement pour le calcul matriciel

Herrmann, Julien 25 November 2015 (has links)
In this thesis we looked, from both a theoretical and a practical point of view, at the design of algorithms and scheduling techniques suited to the complex architectures of modern supercomputers. We were particularly interested in the memory usage and communication management of high-performance computing (HPC) algorithms. We exploited the heterogeneity of modern supercomputers to improve the performance of matrix computations. We studied the possibility of intelligently alternating LU factorization steps (faster) with QR factorization steps (numerically more stable but more than twice as costly) to solve a dense linear system. We improved the performance of dynamic runtime systems by means of static precomputations that take into account the whole task graph of the Cholesky factorization as well as the heterogeneity of the architecture. We studied the complexity of scheduling task graphs with large input and output files on a heterogeneous architecture with two types of resources, each using its own memory. We designed many polynomial-time heuristics for general problems that we had previously proved NP-complete. Finally, we designed optimal algorithms for scheduling an automatic differentiation graph on a platform with two types of memory: one free but limited, the other costly but unlimited. / Throughout this thesis, we have designed memory-aware algorithms and scheduling techniques suited for modern memory architectures. We have shown special interest in improving the performance of matrix computations on multiple levels. At a high level, we have introduced new numerical algorithms for solving linear systems on large distributed platforms. Most of the time, these linear solvers rely on runtime systems to handle resource allocation and data management. We also focused on improving the dynamic schedulers embedded in these runtime systems by adding static information to their decision process. We proposed new memory-aware dynamic heuristics to schedule workflows that could be implemented in such runtime systems. Altogether, we have dealt with multiple state-of-the-art factorization algorithms used to solve linear systems, such as the LU, QR and Cholesky factorizations. We targeted different platforms ranging from multicore processors to distributed-memory clusters, and worked with several reference runtime systems tailored for these architectures, such as PaRSEC and StarPU. On the theoretical side, we took special care to model convoluted hierarchical memory architectures. We have classified the problems that arise when dealing with these storage platforms. We have designed many efficient polynomial-time heuristics for general problems that had been shown NP-complete beforehand.
6

A Runtime Framework for Regular and Irregular Message-Driven Parallel Applications on GPU Systems

Rengasamy, Vasudevan January 2014 (has links) (PDF)
The effective use of GPUs for accelerating applications depends on a number of factors, including effective asynchronous use of heterogeneous resources, reducing data transfer between CPU and GPU, increasing occupancy of GPU kernels, overlapping data transfers with computations, reducing GPU idling, and kernel optimizations. Overcoming these challenges requires considerable effort on the part of application developers. Most optimization strategies are proposed and tuned specifically for individual applications. Message-driven executions with over-decomposition of tasks constitute an important model for parallel programming and provide multiple benefits, including communication-computation overlap and reduced idling on resources. Charm++ is one such message-driven language which employs over-decomposition of tasks, computation-communication overlap and a measurement-based load balancer to achieve high CPU utilization. This research has developed an adaptive runtime framework for efficient execution of Charm++ message-driven parallel applications on GPU systems. In the first part of our research, we developed a runtime framework, G-Charm, focusing primarily on optimizing regular applications. At runtime, G-Charm automatically combines multiple small GPU tasks into a single larger kernel, which reduces the number of kernel invocations while improving CUDA occupancy. G-Charm also enables reuse of existing data in GPU global memory, performs GPU memory management, and dynamically schedules tasks across the CPU and GPU in order to reduce idle time. To combine the partial results obtained from computations performed on the CPU and GPU, G-Charm allows the user to specify an operator with which the partial results are combined at runtime. We also perform compile-time code generation to reduce programming overhead. For Cholesky factorization, a regular parallel application, G-Charm provides a 14% improvement over a highly tuned implementation. In the second part of our research, we extended our runtime to overcome the challenges presented by irregular applications, such as periodic generation of tasks, irregular memory access patterns and varying workloads during application execution. We developed models for deciding the number of tasks that can be combined into a kernel based on the rate of task generation and the GPU occupancy of the tasks. For irregular applications, data reuse results in uncoalesced GPU memory access. We evaluated the effect of altering the global memory access pattern on improving coalesced access. We have also developed adaptive methods for hybrid execution on CPU and GPU, wherein we consider the varying workloads while scheduling tasks across the CPU and GPU. We demonstrate that our dynamic strategies result in an 8-38% reduction in execution times for an N-body simulation application and a molecular dynamics application over the corresponding static strategies that are suitable for regular applications.
7

Modelling and experimental analysis of frequency dependent MIMO channels

García Ariza, Alexis Paolo 04 December 2009 (has links)
The integration of ultra-wideband, cognitive radio and MIMO technologies represents a powerful tool for improving the spectral efficiency of wireless communication systems. In this direction, new strategies for MIMO channel modelling and characterization are needed if one wishes to investigate how the centre frequency and the bandwidth affect the performance of MIMO systems. Previous research has paid less attention to how these parameters affect the characteristics of the MIMO channel. A characterization of the MIMO channel as a function of frequency is presented, addressing both experimental and theoretical points of view. The problems addressed cover five main areas: measurements, data post-processing, synthetic channel generation, multivariate statistics for the data, and channel modelling. A measurement system based on a vector network analyser was designed and validated, and measurements were carried out between 2 and 12 GHz under static conditions, in both line-of-sight and non-line-of-sight scenarios. A reliable procedure for post-processing, synthetic channel generation and experimental analysis based on frequency-domain measurements was proposed and validated. The experimental procedure focused on channel transfer matrices for frequency-non-selective cases; the complex covariance matrices (CCM) were estimated, the Cholesky factorization was applied to the CCM, and colouring matrices of the system were finally obtained. A correction procedure (CP) for synthetic channel generation is presented, applied to large-dimension MIMO cases and to cases where the CCM is indefinite. This CP makes the Cholesky factorization of such CCMs possible. The multivariate characteristics of the experimental data were investigated by performing a multivariate complex normality test. / García Ariza, AP. (2009). Modelling and experimental analysis of frequency dependent MIMO channels [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/6563
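As a small, hedged illustration of the colouring step described above (not the thesis's measurement-based procedure or its correction for indefinite CCMs), the sketch below forms a complex covariance matrix, takes its Cholesky factor as the colouring matrix, and generates synthetic correlated channel samples from white complex Gaussian noise; the 4x4 covariance matrix is fabricated purely so the example is self-contained.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fabricated 4x4 complex channel covariance matrix (Hermitian positive definite).
G = rng.standard_normal((4, 200)) + 1j * rng.standard_normal((4, 200))
CCM = G @ G.conj().T / G.shape[1]

# Colouring matrix from the Cholesky factorization of the CCM.
L = np.linalg.cholesky(CCM)

# Generate synthetic (coloured) channel samples h = L @ w from white noise w.
n_samples = 10000
w = (rng.standard_normal((4, n_samples)) +
     1j * rng.standard_normal((4, n_samples))) / np.sqrt(2)
h = L @ w

# The sample covariance of h should approach the target CCM.
sample_cov = h @ h.conj().T / n_samples
print("max covariance error:", np.max(np.abs(sample_cov - CCM)))
```

When an estimated CCM is indefinite, this plain Cholesky step fails, which is exactly the situation the correction procedure in the thesis addresses.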
8

Χωροχρονικές τεχνικές επεξεργασίας σήματος σε ασύρματα τηλεπικοινωνιακά δίκτυα / Space-Time signal processing techniques for wireless communication networks

Κεκάτος, Βασίλειος 25 October 2007 (has links)
Recent years have seen dramatic growth in products and services based on wireless communication networks, and important research challenges have emerged. Systems with multiple antennas at the transmitter and the receiver, known as MIMO (multiple-input multiple-output) systems, together with code division multiple access (CDMA) technology, are two of the main fronts along which wireless communications are evolving. In this PhD thesis we worked on the design and analysis of signal processing algorithms for these two systems, as described in detail below. Concerning MIMO systems, the pioneering work carried out at Bell Labs around 1996, where the BLAST (Bell Labs Layered Space-Time) architecture was developed, proved that using multiple antennas can lead to a significant increase in the capacity of wireless systems. To exploit this potential, sophisticated MIMO receivers must be designed, and a large number of channel equalization methods have been proposed to this end. However, most of these methods assume that the wireless channel is: 1) time invariant, 2) frequency flat (introducing no intersymbol interference), and, above all, 3) known at the receiver. Since these assumptions are difficult to satisfy in high-rate single-carrier systems, we turned our attention to adaptive equalization methods. Specifically, we developed three basic algorithms. The first is an adaptive decision feedback equalizer (DFE) for frequency-flat MIMO channels. The proposed MIMO DFE follows the BLAST architecture and is updated by the square-root recursive least squares (RLS) algorithm. The equalizer can track a time-varying channel and, to the best of our knowledge, has the lowest computational complexity among the BLAST receivers proposed to date. The second algorithm extends the first to frequency-selective channels. By suitable modelling of the equalization problem, we arrived at an efficient DFE for wideband MIMO channels. In this case the equalization process encounters numerical stability problems, which were successfully treated by the square-root RLS implementation employed. Moving towards further complexity reduction, we proposed an adaptive MIMO DFE updated by the least mean squares (LMS) algorithm, implemented entirely in the frequency domain. Using the fast Fourier transform (FFT) considerably reduces the required complexity; moreover, the move to the frequency domain approximately decouples the equalization problem at each frequency bin, allowing the filters to be updated independently per bin and accelerating the convergence of the algorithm. The proposed equalizer achieves a good trade-off between performance and complexity. In parallel with the above, we worked on wireless channel estimation for an asynchronous CDMA system. The basic scenario is that the base station has already acquired the active users and must estimate the uplink channel parameters of a new user entering the system.
The problem is described by a least squares cost function that is linear in the channel gains and nonlinear in the channel delays. We proved that the problem has an approximately separable form and proposed an iterative method for estimating the parameters. The proposed algorithm does not require any special spreading sequence and performs well even with a short training sequence. It is robust to multiple-access interference and more accurate than an existing method, at the cost of an insignificant increase in computational complexity.
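As a toy illustration of the adaptive-equalization loop this abstract builds on (a single-channel complex LMS linear equalizer, not the MIMO DFE or the square-root RLS and frequency-domain schemes developed in the thesis), the sketch below adapts a short filter to undo a made-up FIR channel; the channel taps, constellation, filter length and step size are all assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy setup: QPSK symbols through a short FIR channel with additive noise.
n_sym = 5000
symbols = (rng.choice([-1, 1], n_sym) + 1j * rng.choice([-1, 1], n_sym)) / np.sqrt(2)
channel = np.array([1.0, 0.4 + 0.3j, 0.2])
received = np.convolve(symbols, channel, mode="full")[:n_sym]
received += 0.01 * (rng.standard_normal(n_sym) + 1j * rng.standard_normal(n_sym))

# Complex LMS linear equalizer: w <- w + mu * x * conj(e), with e = d - w^H x.
n_taps, mu = 7, 0.01
w = np.zeros(n_taps, dtype=complex)
errors = np.empty(n_sym - n_taps)
for k in range(n_taps, n_sym):
    x = received[k - n_taps:k][::-1]     # regressor, most recent sample first
    y = np.vdot(w, x)                    # equalizer output  w^H x
    e = symbols[k - 1] - y               # training symbol as desired response
    w += mu * x * np.conj(e)
    errors[k - n_taps] = np.abs(e) ** 2
print("mean squared error, first vs last 500 symbols:",
      errors[:500].mean(), errors[-500:].mean())
```

The decreasing error illustrates the tracking behaviour that the thesis's RLS- and FFT-based schemes achieve at much lower complexity and in the MIMO setting.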
