Global ETD Search

21	Adaptive Waveletmethoden zur Approximation von Bildern / Adaptive wavelet methods for the approximation of images Tenorth, Stefanie 08 July 2011 No description available. 510 Mathematik Mathematics and Computer Science Wavelets dünn besetzte Darstellung von Daten Bildkompression adaptive Wavelet-Basen Tensor-Produkt-Wavelet-Transformation Easy-Path-Wavelet-Transformation N-Term-Approximation sparse data representation wavelet transform along pathways image data compression adaptive wavelet bases tensor product wavelet transform easy path wavelet transform N-term approximation 31.49
22	An optimised QPSK-based receiver structure for possibly sparse data transmission over narrowband and wideband communication systems Schoeman, Johan P. 24 August 2010 In this dissertation an in-depth study was conducted into the design, implementation and evaluation of a QPSK-based receiver structure for application in a UMTS WCDMA environment. The novelty of this work lies in the specific receiver architecture, which aims to optimise the BER performance when possibly sparse data streams are transmitted. This scenario is a real possibility according to Verdú et al. [1] and Hagenauer et al. [2–6]. A novel receiver structure was conceptualised, developed and evaluated in both narrowband and wideband scenarios, where it was found to outperform conventional receivers when a sparse data stream is transmitted. In order to reach the main conclusions of this study, it was necessary to develop a realistic simulation platform. The developed platform is capable of simulating a communication system meeting the physical layer requirements of the UMTS WCDMA standard. The platform can also perform narrowband simulations. A flexible channel emulator was developed that may be configured to simulate AWGN channel conditions, frequency non-selective fading (either Rayleigh or Rician with a configurable LOS component and Doppler spread), or a full multipath scenario where each path has a configurable LOS component, Doppler spread, path gain and path delay. It is therefore even possible to simulate a complex, yet realistic, COST207-TU channel model. The platform is also capable of simulating MUI. Each interfering user has a unique and independent multipath fading channel, while sharing the same bandwidth. Finally, the entire platform executes all simulations in baseband to shorten simulation times.
The research outputs of this work are summarised below: (1) A parameter, the sparseness measure, was defined in order to quantify the level by which a data stream differs from an equiprobable data stream. (2) A novel source model was proposed and developed to simulate data streams with a specified amount of sparseness. (3) An introductory investigation was undertaken to determine the effect of simple FEC techniques on the sparseness of an encoded data stream. (4) Novel receiver structures for both narrowband and wideband systems were proposed, developed and evaluated for systems where possibly sparse data streams may be transmitted. (5) Analytic expressions were derived to take the effect of sparseness into account in communication systems, including expressions for the joint PDF of a BPSK branch, the optimal decision region of a detector in AWGN conditions, and the BER performance of a communication system employing the proposed optimal receiver in both AWGN and flat fading channel conditions. (6) Numerous BER performance curves were obtained comparing the proposed receiver structure with conventional receivers in a variety of channel conditions, including AWGN, frequency non-selective fading and a multipath COST207-TU channel environment, as well as in the presence of MUI. Afrikaans: In hierdie verhandeling word ’n in-diepte studie gedoen rakende die ontwerp, implementasie en evaluasie van ’n KPSK-gebaseerde ontvanger struktuur wat in ’n UMTS WKVVT omgewing gebruik kan word. Die bydrae van hierdie werk lê in die spesifieke ontvanger argitektuur wat daarop mik om die BFT werksverrigting te optimeer wanneer yl data strome versend word. Hierdie is ’n realistiese moontlikheid volgens Verdú et al [1] en Hagenauer et al [2–6].
’n Nuwe ontvanger struktuur is gekonsepsualiseer, ontwikkel en evalueer vir beide noueband en wyeband stelsels, waar dit gevind is dat dit beter werksverrigting lewer as tradisionele ontvangers wanneer yl data strome versend word. Dit was nodig om ’n realistiese simulasie platform te ontwikkel om die belangrikste gevolgtrekkings van hierdie studie te kan maak. Die ontwikkelde platform is in staat om ’n kommunikasie stelsel te simuleer wat aan die fisiese laag vereistes van die UMTS WKVVT standaard voldoen. Die platform kan ook noueband stelsels simuleer. ’n Aanpasbare kanaal simulator is ontwikkel wat opgestel kan word om SWGR kanaal toestande, plat duining (beide Rayleigh of Ricies met ’n verstelbare siglyn komponent en Doppler verspreiding), sowel as ’n veelvuldige pad omgewing (waar elke unieke pad ’n verstelbare siglyn komponent, Doppler verspreiding, pad wins en pad vertraging het) te emuleer. Dit is selfs moontlik om ’n komplekse, maar steeds realistiese COST207-TU kanaal model te simuleer. Die platform het ook die vermoë om VGS te simuleer. Elke steurende gebruiker het ’n unieke en onafhanklike veelvuldige pad deinende kanaal, terwyl dieselfde bandwydte gedeel word. Laastens, alle simulasies van die platvorm word in basisband uitgevoer wat verkorte simulasie periodes verseker.
Die navorsingsuitsette van hierdie werk kan as volg opgesom word: (1) ’n Parameter, die ylheidsmaatstaf, is gedefinïeer om dit moontlik te maak om die vlak waarmee die ylheid van ’n datastroom verskil van ’n ewekansige stroom te versyfer. (2) ’n Nuwe bronmodel is voorgestel en ontwikkel om datastrome met ’n spesifieke ylheid te emuleer. (3) ’n Inleidende ondersoek is onderneem om vas te stel wat die effek van VFK tegnieke op die ylheid van ’n enkodeerde datastroom is. (4) Nuwe ontvanger strukture is voorgestel, ontwikkel en evalueer vir beide noueband en wyeband stelsels waar yl datastrome moontlik versend kan word. (5) Analitiese uitdrukkings is afgelei om die effek van ylheid in ag te neem in kommunikasie stelsels. Uitdrukkings vir onder andere die gedeelte WDF van ’n BFVK tak, die optimale beslissingspunt van ’n detektor in SWGR toestande, sowel as die BFT werksverrigting van ’n kommunikasie stelsel wat van die voorgestelde optimale ontvangers gebruik maak, hetsy in SWGR of in plat duinende kanaal toestande. (6) Talryke BFT werksverrigting krommes is verkry wat die voorgestelde ontvanger struktuur vergelyk met die konvensionele ontvangers in ’n verskeidenheid kanaal toestande, insluitend SWGR, plat duinende kanale en ’n veelvuldige pad COST207-TU kanaal omgewing, sowel as in die teenwoordigheid van VGS. Copyright / Dissertation (MEng)--University of Pretoria, 2010. / Electrical, Electronic and Computer Engineering / unrestricted Wkvvt Kanaal modellering Wireless communication Bft Cost modelle Draadlose kommunikasie Wcdma Sparseness measure Sparse data Scrambling codes Rrc filter Rake receiver Qpsk Ovsf codes Multipath fading Vgs Veelvuldige pad deining Umts Cost models Swgr Mui Flat fading Channelisation codes Channel modelling 4g Cdma Awgn Ber Kdvt Kanalisasie kodes Mengkodes Kpsk Ovsf kodes Plat deining Rake ontvanger Wgc filter Yl data Ylheidsmaatstaf UCTD
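Entry 22 mentions an optimal decision region for a detector in AWGN when the data stream is sparse, that is, when the two bit values are not equally likely. The abstract does not give the thesis's own expressions, so the following is only a minimal sketch of the standard textbook MAP rule for antipodal signalling in AWGN; the function names, the symbol mapping (the more likely bit sent as `-amplitude`) and the parameter values are all assumptions made for illustration.

```python
import math


def map_threshold(p_minus, amplitude, noise_var):
    """Textbook MAP decision threshold for antipodal signalling in AWGN.

    Assumed model (not taken from the thesis): the symbol -amplitude is sent
    with prior probability p_minus, +amplitude with prior 1 - p_minus, and the
    received sample is r = symbol + Gaussian noise of variance noise_var.
    The MAP detector decides +amplitude when r exceeds the returned value.
    """
    if not 0.0 < p_minus < 1.0:
        raise ValueError("p_minus must lie strictly between 0 and 1")
    p_plus = 1.0 - p_minus
    # Comparing the two posterior probabilities and taking logarithms gives
    # r > (noise_var / (2 * amplitude)) * ln(p_minus / p_plus).
    return (noise_var / (2.0 * amplitude)) * math.log(p_minus / p_plus)


def detect(r, threshold, amplitude):
    """Hard decision on one received sample using the given threshold."""
    return amplitude if r > threshold else -amplitude


# Equiprobable stream: the threshold sits midway between the two symbols.
equiprobable = map_threshold(0.5, amplitude=1.0, noise_var=0.5)

# Hypothetical sparse stream: 90% of the symbols are -amplitude, so the
# threshold moves towards the less likely symbol +amplitude.
sparse = map_threshold(0.9, amplitude=1.0, noise_var=0.5)
```

With equal priors the threshold is zero, which is the conventional receiver; as the stream becomes sparser or the noise grows, the threshold shifts away from the more likely symbol, which is the kind of effect a sparsity-aware receiver can exploit.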
23	Hard and fuzzy block clustering algorithms for high dimensional data / Algorithmes de block-clustering dur et flou pour les données en grande dimension Laclau, Charlotte 14 April 2016 Notre capacité grandissante à collecter et stocker des données a fait de l'apprentissage non supervisé un outil indispensable qui permet la découverte de structures et de modèles sous-jacents aux données, sans avoir à étiqueter les individus manuellement. Parmi les différentes approches proposées pour aborder ce type de problème, le clustering est très certainement le plus répandu. Le clustering suppose que chaque groupe, également appelé cluster, est distribué autour d'un centre défini en fonction des valeurs qu'il prend pour l'ensemble des variables. Cependant, dans certaines applications du monde réel, et notamment dans le cas de données de dimension importante, cette hypothèse peut être invalidée. Aussi, les algorithmes de co-clustering ont-ils été proposés: ils décrivent les groupes d'individus par un ou plusieurs sous-ensembles de variables au regard de leur pertinence. La structure des données finalement obtenue est composée de blocs communément appelés co-clusters. Dans les deux premiers chapitres de cette thèse, nous présentons deux approches de co-clustering permettant de différencier les variables pertinentes du bruit en fonction de leur capacité à révéler la structure latente des données, dans un cadre probabiliste d'une part et basée sur la notion de métrique, d'autre part. L'approche probabiliste utilise le principe des modèles de mélanges, et suppose que les variables non pertinentes sont distribuées selon une loi de probabilité dont les paramètres sont indépendants de la partition des données en cluster. L'approche métrique est fondée sur l'utilisation d'une distance adaptative permettant d'affecter à chaque variable un poids définissant sa contribution au co-clustering.
D'un point de vue théorique, nous démontrons la convergence des algorithmes proposés en nous appuyant sur le théorème de convergence de Zangwill. Dans les deux chapitres suivants, nous considérons un cas particulier de structure en co-clustering, qui suppose que chaque sous-ensemble d'individus est décrit par un unique sous-ensemble de variables. La réorganisation de la matrice originale selon les partitions obtenues sous cette hypothèse révèle alors une structure de blocs homogènes diagonaux. Comme pour les deux contributions précédentes, nous nous plaçons dans le cadre probabiliste et métrique. L'idée principale des méthodes proposées est d'imposer deux types de contraintes : (1) nous fixons le même nombre de clusters pour les individus et les variables; (2) nous cherchons une structure de la matrice de données d'origine qui possède les valeurs maximales sur sa diagonale (par exemple pour le cas des données binaires, on cherche des blocs diagonaux majoritairement composés de valeurs 1, et de 0 à l’extérieur de la diagonale). Les approches proposées bénéficient des garanties de convergence issues des résultats des chapitres précédents. Enfin, pour chaque chapitre, nous dérivons des algorithmes permettant d'obtenir des partitions dures et floues. Nous évaluons nos contributions sur un large éventail de données simulées et liées à des applications réelles telles que le text mining, dont les données peuvent être binaires ou continues. Ces expérimentations nous permettent également de mettre en avant les avantages et les inconvénients des différentes approches proposées. Pour conclure, nous pensons que cette thèse couvre explicitement une grande majorité des scénarios possibles découlant du co-clustering flou et dur, et peut être vue comme une généralisation de certaines approches de biclustering populaires.
/ With the growing amount of data available, unsupervised learning has become an important tool for discovering underlying patterns without the need to label instances manually. Among the different approaches proposed to tackle this problem, clustering is arguably the most popular. Clustering is usually based on the assumption that each group, also called a cluster, is distributed around a center defined in terms of all features; in some real-world applications dealing with high-dimensional data, however, this assumption may be false. For this reason, co-clustering algorithms were proposed to describe clusters by the subsets of features that are most relevant to them. The resulting latent structure of the data is composed of blocks usually called co-clusters. In the first two chapters, we describe two co-clustering methods that differentiate features by relevance, measured by their ability to reveal the latent structure of the data, in both a probabilistic and a distance-based framework. The probabilistic approach uses the mixture model framework, where the irrelevant features are assumed to follow a different probability distribution that is independent of the co-clustering structure. The distance-based (also called metric-based) approach, on the other hand, relies on an adaptive metric in which each variable is assigned a weight that defines its contribution to the resulting co-clustering. From the theoretical point of view, we show the global convergence of the proposed algorithms using Zangwill's convergence theorem. In the last two chapters, we consider a special case of co-clustering where, contrary to the original setting, each subset of instances is described by a unique subset of features, resulting in a diagonal structure of the initial data matrix. As for the first two contributions, we consider both probabilistic and metric-based approaches.
The main idea of the proposed contributions is to impose two different kinds of constraints: (1) we fix the number of row clusters to the number of column clusters; (2) we seek a structure of the original data matrix that has the maximum values on its diagonal (for instance for binary data, we look for diagonal blocks composed mostly of ones, with zeros outside the main diagonal). The proposed approaches enjoy the convergence guarantees derived from the results of the previous chapters. Finally, we present both hard and fuzzy versions of the proposed algorithms. We evaluate our contributions on a wide variety of synthetic and real-world benchmark data sets, binary and continuous, related to text mining applications, and analyze the advantages and drawbacks of each approach. To conclude, we believe that this thesis explicitly covers the vast majority of possible scenarios arising in hard and fuzzy co-clustering and can be seen as a generalization of some popular biclustering approaches. Classification Flou Classification croisée Modèle de mélange Approche métrique Modèle à bloc latent Données sparses Données binaires Classification de document Théorème de Zangwill Sélection de variable Données en grande dimension Algorithme Clustering Fuzzy Co-clustering Mixture model Metric approach Latent block model Sparse data Binary data Document clustering Zangwill theorem Feature selection High dimensional data Algorithm 004
24	Von Mises-Fisher based (co-)clustering for high-dimensional sparse data : application to text and collaborative filtering data / Modèles de mélange de von Mises-Fisher pour la classification simple et croisée de données éparses de grande dimension Salah, Aghiles 21 November 2016 La classification automatique, qui consiste à regrouper des objets similaires au sein de groupes, également appelés classes ou clusters, est sans aucun doute l’une des méthodes d’apprentissage non-supervisé les plus utiles dans le contexte du Big Data. En effet, avec l’expansion des volumes de données disponibles, notamment sur le web, la classification ne cesse de gagner en importance dans le domaine de la science des données pour la réalisation de différentes tâches, telles que le résumé automatique, la réduction de dimension, la visualisation, la détection d’anomalies, l’accélération des moteurs de recherche, l’organisation d’énormes ensembles de données, etc. De nombreuses méthodes de classification ont été développées à ce jour ; ces dernières sont cependant fortement mises en difficulté par les caractéristiques complexes des ensembles de données que l’on rencontre dans certains domaines d’actualité tels que le filtrage collaboratif (FC) et la fouille de textes. Ces données, souvent représentées sous forme de matrices, sont de très grande dimension (des milliers de variables) et extrêmement creuses (ou sparses, avec plus de 95% de zéros). En plus d’être de grande dimension et sparses, les données rencontrées dans les domaines mentionnés ci-dessus sont également de nature directionnelle. En effet, plusieurs études antérieures ont démontré empiriquement que les mesures directionnelles, telles que la similarité cosinus, sont supérieures à d’autres mesures, telles que la distance euclidienne, pour la classification des documents textuels ou pour mesurer les similitudes entre les utilisateurs/items dans le FC.
Cela suggère que, dans un tel contexte, c’est la direction d’un vecteur de données (e.g., représentant un document texte) qui est pertinente, et non pas sa longueur. Il est intéressant de noter que la similarité cosinus est exactement le produit scalaire entre des vecteurs unitaires (de norme 1). Ainsi, d’un point de vue probabiliste l’utilisation de la similarité cosinus revient à supposer que les données sont directionnelles et réparties sur la surface d’une hypersphère unité. En dépit des nombreuses preuves empiriques suggérant que certains ensembles de données sparses et de grande dimension sont mieux modélisés sur une hypersphère unité, la plupart des modèles existants dans le contexte de la fouille de textes et du FC s’appuient sur des hypothèses populaires : distributions Gaussiennes ou Multinomiales, qui sont malheureusement inadéquates pour des données directionnelles. Dans cette thèse, nous nous focalisons sur deux challenges d’actualité, à savoir la classification des documents textuels et la recommandation d’items, qui ne cessent d’attirer l’attention dans les domaines de la fouille de textes et du filtrage collaboratif, respectivement. Afin de répondre aux limitations ci-dessus, nous proposons une série de nouveaux modèles et algorithmes qui s’appuient sur la distribution de von Mises-Fisher (vMF) qui est plus appropriée aux données directionnelles distribuées sur une hypersphère unité. / Cluster analysis or clustering, which aims to group together similar objects, is undoubtedly a very powerful unsupervised learning technique. With the growing amount of available data, clustering is steadily gaining in importance in various areas of data science for tasks such as automatic summarization, dimensionality reduction, visualization, outlier detection, speeding up search engines, organization of huge data sets, etc.
Existing clustering approaches are, however, severely challenged by the high dimensionality and extreme sparsity of the data sets arising in some current areas of interest, such as Collaborative Filtering (CF) and text mining. Such data often consists of thousands of features and more than 95% of zero entries. In addition to being high dimensional and sparse, the data sets encountered in the aforementioned domains are also directional in nature. In fact, several previous studies have empirically demonstrated that directional measures, which assess the distance between objects in terms of the angle between them (the cosine similarity, for example), are substantially superior to other measures such as Euclidean distortions for clustering text documents or assessing the similarities between users/items in CF. This suggests that in such a context only the direction of a data vector (e.g., text document) is relevant, not its magnitude. It is worth noting that the cosine similarity is exactly the scalar product between unit length data vectors, i.e., L2-normalized vectors. Thus, from a probabilistic perspective, using the cosine similarity is equivalent to assuming that the data are directional data distributed on the surface of a unit-hypersphere. Despite the substantial empirical evidence that certain high dimensional sparse data sets, such as those encountered in the above domains, are better modeled as directional data, most existing models in text mining and CF are based on popular assumptions such as Gaussian, Multinomial or Bernoulli, which are inadequate for L2-normalized data. In this thesis, we focus on the two challenging tasks of text document clustering and item recommendation, which are still attracting a lot of attention in the domains of text mining and CF, respectively.
In order to address the above limitations, we propose a suite of new models and algorithms which rely on the von Mises-Fisher (vMF) assumption that arises naturally for directional data lying on a unit-hypersphere. Apprentissage statistique Classification Classification croisée Modèles de mélanges Statistiques directionnelles Distribution de von Mises-Fisher Fouille de textes Systèmes de recommandation Filtrage collaboratif Matrices creuses Grande dimension Machine learning Clustering Co-clustering Mixture models Directional statistics Von Mises-Fisher distribution Text mining Recommender systems Collaborative filtering Sparse data High dimensional data 003.3
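The abstract of entry 24 notes that the cosine similarity is exactly the scalar product between L2-normalized vectors, so only the direction of a data vector matters, not its magnitude. A minimal sketch of that identity in plain Python follows; the helper names and the toy term-count vectors are made up for illustration and do not come from the thesis.

```python
import math


def dot(u, v):
    """Scalar product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale v to unit Euclidean length, i.e. project it onto the unit hypersphere."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v, computed from the raw vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Hypothetical sparse term-count vectors for three short documents.
doc_a = [3.0, 0.0, 0.0, 4.0, 0.0]
doc_b = [6.0, 0.0, 0.0, 8.0, 0.0]  # same direction as doc_a, twice the length
doc_c = [0.0, 5.0, 0.0, 0.0, 0.0]  # shares no terms with doc_a

cos_ab = cosine_similarity(doc_a, doc_b)
dot_ab = dot(l2_normalize(doc_a), l2_normalize(doc_b))
cos_ac = cosine_similarity(doc_a, doc_c)
```

Here `cos_ab` and `dot_ab` agree up to floating-point rounding, and doubling the length of a document vector leaves its similarity to any other document unchanged, which is the sense in which such data can be treated as points on a unit hypersphere.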
25	Arquitectura de un sistema de geo-visualización espacio-temporal de actividad delictiva, basada en el análisis masivo de datos, aplicada a sistemas de información de comando y control (C2IS) Salcedo González, Mayra Liliana 03 April 2023 [ES] La presente tesis doctoral propone la arquitectura de un sistema de Geo-visualización Espaciotemporal de actividad delictiva y criminal, para ser aplicada a Sistemas de Comando y Control (C2S) específicamente dentro de sus Sistemas de Información de Comando y Control (C2IS). El sistema de Geo-visualización Espaciotemporal se basa en el análisis masivo de datos reales de actividad delictiva, proporcionado por la Policía Nacional Colombiana (PONAL) y está compuesto por dos aplicaciones diferentes: la primera permite al usuario geo-visualizar espaciotemporalmente de forma dinámica, las concentraciones, tendencias y patrones de movilidad de esta actividad dentro de la extensión de área geográfica y el rango de fechas y horas que se precise, lo cual permite al usuario realizar análisis e interpretaciones y tomar decisiones estratégicas de acción más acertadas; la segunda aplicación permite al usuario geo-visualizar espaciotemporalmente las predicciones de la actividad delictiva en periodos continuos y cortos a modo de tiempo real, esto también dentro de la extensión de área geográfica y el rango de fechas y horas de elección del usuario. Para estas predicciones se usaron técnicas clásicas y técnicas de Machine Learning (incluido el Deep Learning), adecuadas para el pronóstico en multiparalelo de varios pasos de series temporales multivariantes con datos escasos.
Las dos aplicaciones del sistema, cuyo desarrollo se muestra en esta tesis, están realizadas con métodos novedosos que permitieron lograr estos objetivos de efectividad a la hora de detectar el volumen y los patrones y tendencias en el desplazamiento de dicha actividad, mejorando así la conciencia situacional, la proyección futura y la agilidad y eficiencia en los procesos de toma de decisiones, particularmente en la gestión de los recursos destinados a la disuasión, prevención y control del delito, lo cual contribuye a los objetivos de ciudad segura y por consiguiente de ciudad inteligente, dentro de arquitecturas de Sistemas de Comando y Control (C2S) como en el caso de los Centros de Comando y Control de Seguridad Ciudadana de la PONAL. / [CA] Aquesta tesi doctoral proposa l'arquitectura d'un sistema de Geo-visualització Espaitemporal d'activitat delictiva i criminal, per ser aplicada a Sistemes de Comandament i Control (C2S) específicament dins dels seus Sistemes d'informació de Comandament i Control (C2IS). El sistema de Geo-visualització Espaitemporal es basa en l'anàlisi massiva de dades reals d'activitat delictiva, proporcionada per la Policia Nacional Colombiana (PONAL) i està composta per dues aplicacions diferents: la primera permet a l'usuari geo-visualitzar espaitemporalment de forma dinàmica, les concentracions, les tendències i els patrons de mobilitat d'aquesta activitat dins de l'extensió d'àrea geogràfica i el rang de dates i hores que calgui, la qual cosa permet a l'usuari fer anàlisis i interpretacions i prendre decisions estratègiques d'acció més encertades; la segona aplicació permet a l'usuari geovisualitzar espaciotemporalment les prediccions de l'activitat delictiva en períodes continus i curts a mode de temps real, això també dins l'extensió d'àrea geogràfica i el rang de dates i hores d'elecció de l'usuari. 
Per a aquestes prediccions es van usar tècniques clàssiques i tècniques de Machine Learning (inclòs el Deep Learning), adequades per al pronòstic en multiparal·lel de diversos passos de sèries temporals multivariants amb dades escasses. Les dues aplicacions del sistema, el desenvolupament de les quals es mostra en aquesta tesi, estan realitzades amb mètodes nous que van permetre assolir aquests objectius d'efectivitat a l'hora de detectar el volum i els patrons i les tendències en el desplaçament d'aquesta activitat, millorant així la consciència situacional, la projecció futura i l'agilitat i eficiència en els processos de presa de decisions, particularment en la gestió dels recursos destinats a la dissuasió, prevenció i control del delicte, la qual cosa contribueix als objectius de ciutat segura i per tant de ciutat intel·ligent, dins arquitectures de Sistemes de Comandament i Control (C2S) com en el cas dels Centres de Comandament i Control de Seguretat Ciutadana de la PONAL. / [EN] This doctoral thesis proposes the architecture of a Spatiotemporal Geo-visualization system of criminal activity, to be applied to Command and Control Systems (C2S) specifically within their Command and Control Information Systems (C2IS).
The Spatiotemporal Geo-visualization system is based on the massive analysis of real criminal-activity data provided by the Colombian National Police (PONAL) and is made up of two applications. The first allows the user to dynamically geo-visualize, in space and time, the concentrations, trends and mobility patterns of this activity within the geographic area and the range of dates and times required, which allows the user to carry out analyses and interpretations and make sounder strategic decisions about action. The second allows the user to geo-visualize spatiotemporally the predictions of criminal activity over short, continuous periods in a real-time fashion, again within the geographic area and the range of dates and times of the user's choice. For these predictions, classical techniques and Machine Learning techniques (including Deep Learning) were used, suitable for multistep multiparallel forecasting of multivariate time series with sparse data. The two applications of the system, whose development is shown in this thesis, are built with novel methods that made it possible to meet these effectiveness objectives in detecting the volume, patterns and trends in the movement of this activity, thus improving situational awareness, future projection, and the agility and efficiency of decision-making processes, particularly in managing the resources allocated to the deterrence, prevention and control of crime. This contributes to the objectives of a safe city and therefore of a smart city, within Command and Control Systems (C2S) architectures such as the Citizen Security Command and Control Centers of the PONAL. / Salcedo González, ML. (2023). Arquitectura de un sistema de geo-visualización espacio-temporal de actividad delictiva, basada en el análisis masivo de datos, aplicada a sistemas de información de comando y control (C2IS) [Tesis doctoral].
Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/192685 Temporal data space Predictive geo-visualisation Command and control systems (C2S) Efficiency improvement Decision-making Future projection Real-time systems Sparse data Multivariate time series Forecasting of criminal activity Smart city Dynamic geo-visualisation of data Situational awareness Mobility of criminal activity Vector autoregressive (VAR) Multilayer perceptron (MLP) Espacio temporal de datos Geo-visualización predictiva Sistemas de mando y control (C2S) Mejora de la eficiencia Toma de decisiones Proyección futura Sistemas de tiempo real Datos dispersos Pronóstico multipaso y multiparalelo Series temporales multivariantes Pronóstico de actividad delictiva Ciudad segura Ciudad inteligente Geo-visualización dinámica de datos Conciencia situacional Movilidad de actividad delictiva INGENIERÍA TELEMÁTICA
