Global ETD Search

11	Error resilience and concealment in MVC video over wireless networks Ibrahim, Abdulkareem B. January 2015 (has links) Multi-view video is capable of presenting a full and accurate depth perception of a scene. The concept of multi-view video is becoming more useful especially in 3D display systems by enhancing the viewing of high resolution stereoscopic images from arbitrary viewpoints without the use of any special glasses. Like monoscopic video, the multi-view video is faced with different challenges such as: reliable compression, storage and bandwidth due to the increased number of views as well as the high sensitivity to transmission errors. All these may lead to a detrimental effect on the reconstructed views. The work in this thesis investigates the problems and challenges of transmission losses in a multi-view video bitstream over error prone wireless networks. Based on the network simulation results, the proposed technique is capable of addressing the problem of transmission losses. In practical wireless networks, transmission errors are inevitable and pose a serious challenge to the coded video data. The aim of this research effort is to examine the effect of these errors in a multi-view video bitstream when transmitted over a lossy channel. Moreover, this research work aims to develop a novel scheme that can make the multi-view coded videos more robust to transmission errors by minimizing the error effects and improving the perceptual quality. Multi-layer data partitioning as an error resilient technique is developed in JMVC 8.5 reference software in order to make the multi-view video bitstream more robust during transmission. In addition to that, we propose a simple decoding scheme that can support the decoding of the multi-layer data partitioning bitstream over channels with high error rate. The proposed technique is benchmarked with the already existing H.264/AVC data partitioning technique. 
The work in this thesis also employs the use of group of pictures as a coding parameter to investigate and reduce the effects of transmission errors in multi-view video transmitted over a very high error rate channel. The experiments are carried out with different error loss rates in order to evaluate the performance of these techniques in terms of perceptual quality when transmitted over a simulated erroneous channel. Errors are introduced using the Sirannon network simulator. The error performance of each technique is evaluated and analysed both objectively and subjectively after reconstruction. The results of the research investigation and simulation are presented and analysed in chapter six of the thesis. 621.382
12	Discovering Compact and Informative Structures through Data Partitioning Fiterau, Madalina 01 September 2015 (has links) In many practical scenarios, prediction for high-dimensional observations can be accurately performed using only a fraction of the existing features. However, the set of relevant predictive features, known as the sparsity pattern, varies across data. For instance, features that are informative for a subset of observations might be useless for the rest. In fact, in such cases, the dataset can be seen as an aggregation of samples belonging to several low-dimensional sub-models, potentially due to different generative processes. My thesis introduces several techniques for identifying sparse predictive structures and the areas of the feature space where these structures are effective. This information allows the training of models which perform better than those obtained through traditional feature selection. We formalize Informative Projection Recovery, the problem of extracting a set of low-dimensional projections of data which jointly form an accurate solution to a given learning task. Our solution to this problem is a regression-based algorithm that identifies informative projections by optimizing over a matrix of point-wise loss estimators. It generalizes to a number of machine learning problems, offering solutions to classification, clustering and regression tasks. Experiments show that our method can discover and leverage low-dimensional structure, yielding accurate and compact models. Our method is particularly useful in applications involving multivariate numeric data in which expert assessment of the results is of the essence. Additionally, we developed an active learning framework which works with the obtained compact models in finding unlabeled data deemed to be worth expert evaluation. For this purpose, we enhance standard active selection criteria using the information encapsulated by the trained model. 
The advantage of our approach is that the labeling effort is expended mainly on samples which benefit models from the hypothesis class we are considering. Additionally, the domain experts benefit from the availability of informative axis aligned projections at the time of labeling. Experiments show that this results in an improved learning rate over standard selection criteria, both for synthetic data and real-world data from the clinical domain, while the comprehensible view of the data supports the labeling process and helps preempt labeling errors. informative projection recovery cost-based feature selection ensemble methods data partitioning active learning clinical data analysis
13	Partitionnement dans les systèmes de gestion de données parallèles / Data Partitioning in Parallel Data Management Systems Liroz Gistau, Miguel 17 December 2013 (has links) In recent years, the volume of data that is captured and generated has exploded. Advances in computer technologies, which provide cheap storage and increased computing capabilities, have allowed organizations to perform complex analyses on this data and to extract valuable knowledge from it. This trend has been very important not only for industry; it has also had a significant impact on science, where enhanced instruments and more complex simulations call for efficient management of huge quantities of data. Parallel computing is a fundamental technique in the management of large quantities of data, as it leverages the concurrent utilization of multiple computing resources. To take advantage of parallel computing, we need efficient data partitioning techniques, which are in charge of dividing the whole data set and assigning the partitions to the processing nodes. Data partitioning is a complex problem, as it has to consider different and often conflicting issues, such as data locality, load balancing and maximizing parallelism. In this thesis, we study the problem of data partitioning, particularly in scientific parallel databases that are continuously growing and in the MapReduce framework. 
In the case of scientific databases, we consider data partitioning in very large databases to which new data is appended continuously, e.g. in astronomical applications. Existing approaches are limited, since the complexity of the workload and the continuous appends restrict the applicability of traditional approaches. We propose two partitioning algorithms that dynamically partition new data elements using a technique based on data affinity. Our algorithms enable us to obtain very good data partitions in a low execution time compared to traditional approaches. We also study how to improve the performance of the MapReduce framework using data partitioning techniques. In particular, we are interested in efficient data partitioning of the input datasets to reduce the amount of data that has to be transferred in the shuffle phase. We design and implement a strategy which, by capturing the relationships between input tuples and intermediate keys, obtains an efficient partitioning that can be used to significantly reduce the communication overhead of MapReduce. Partitionnement de données Systèmes parallèles Bases de données parallèles MapReduce Data partitioning Parallel Systems Parallel Databases MapReduce
14	Data Integration Over Horizontally Partitioned Databases In Service-oriented Data Grids Sonmez Sunercan, Hatice Kevser 01 September 2010 (has links) (PDF) Information integration over distributed and heterogeneous resources has been challenging in many respects: coping with various kinds of heterogeneity, including data model, platform and access interfaces; coping with various forms of data distribution and maintenance policies; scalability; performance; security and trust; reliability and resilience; legal issues; and so on. Each of these dimensions deserves a separate thread of research. The challenge most relevant to the work presented in this thesis is coping with various forms of data distribution and maintenance policies. This thesis aims to provide a service-oriented data integration solution over data Grids for cases where distributed data sources are partitioned with overlapping sections of various proportions. This is an interesting variation which combines both replicated and partitioned data within the same data management framework. Thus, the data management infrastructure has to deal with specific challenges regarding the identification, access and aggregation of partitioned data with varying proportions of overlapping sections. To provide a solution, we have extended OGSA-DAI DQP, a well-known service-oriented data access and integration middleware with distributed query processing facilities, by incorporating a UnionPartitions operator into its algebra in order to cope with various unusual forms of horizontally partitioned databases. As a result, our solution extends OGSA-DAI DQP in two respects: (1) a new operator type is added to the algebra to perform a specialized union of partitions with different characteristics; (2) the OGSA-DAI DQP Federation Description is extended with additional metadata to facilitate the successful execution of the newly introduced operator. 
QA Computer Software 76.75-76.765
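The idea in entry 14 of a specialized union over horizontally partitioned sources with overlapping sections can be illustrated with a minimal Python sketch. This is not the OGSA-DAI DQP operator: the function name `union_partitions`, the `key` parameter and the sample rows are all hypothetical, and the sketch assumes that overlapping rows can be recognised by a shared key and that the first copy encountered is kept.

```python
from typing import Callable, Dict, Hashable, Iterable, Iterator

Row = Dict[str, object]


def union_partitions(
    partitions: Iterable[Iterable[Row]],
    key: Callable[[Row], Hashable],
) -> Iterator[Row]:
    """Yield every row once, even if it is stored in several partitions.

    Assumption: rows that share a key are replicas of the same record,
    so the first copy encountered is kept and later copies are skipped.
    """
    seen = set()
    for partition in partitions:
        for row in partition:
            k = key(row)
            if k not in seen:
                seen.add(k)
                yield row


# Two hypothetical sites holding overlapping horizontal partitions.
site_a = [{"id": 1, "name": "alpha"}, {"id": 2, "name": "beta"}]
site_b = [{"id": 2, "name": "beta"}, {"id": 3, "name": "gamma"}]

merged = list(union_partitions([site_a, site_b], key=lambda r: r["id"]))
```

Here the row with `id` 2 is held by both sites but appears only once in `merged`.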
15	Machine learning algorithms in a distributed context / Maskininlärningalgoritmer i en distribuerad kontext Johansson, Samuel, Wojtulewicz, Karol January 2018 (has links) Interest in distributed approaches to machine learning has increased significantly in recent years due to continuously increasing data sizes for training machine learning models. In this thesis we describe three popular machine learning algorithms: decision trees, Naive Bayes and support vector machines (SVM), and present existing ways of distributing them. We also perform experiments with decision trees distributed with bagging, boosting and hard data partitioning, and evaluate them in terms of performance measures such as accuracy, F1 score and execution time. Our experiments show that the execution time of bagging and boosting increases linearly with the number of workers, and that boosting performs significantly better than bagging and hard data partitioning in terms of F1 score. The hard data partitioning algorithm works well for large datasets, where the execution time decreases as the number of workers increases without any significant loss in accuracy or F1 score, while the algorithm performs poorly on small datasets, with an increase in execution time and a loss in accuracy and F1 score as the number of workers increases. Machine learning ensemble algorithms hard data partitioning decision trees Computer Sciences Datavetenskap (datalogi)
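Hard data partitioning, as named in entry 15, can be sketched roughly as follows: the training set is split into disjoint parts, each worker trains its own model on one part, and the models are combined at prediction time. The sketch below is an illustration under stated assumptions, not the thesis's implementation: a one-split decision stump stands in for a full decision tree, the combination rule is assumed to be a majority vote, and every name (`hard_partition`, `train_stump`, `vote`) is hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, int]  # (feature value, class label 0 or 1)
Model = Callable[[float], int]


def hard_partition(data: Sequence[Sample], workers: int) -> List[List[Sample]]:
    """Split the data into disjoint, near-equal parts, one per worker."""
    return [list(data[i::workers]) for i in range(workers)]


def train_stump(part: Sequence[Sample]) -> Model:
    """Fit a one-split decision stump that predicts 1 when x > threshold."""
    xs = sorted({x for x, _ in part})
    # Candidate thresholds: midpoints between consecutive feature values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]

    def errors(t: float) -> int:
        return sum((x > t) != bool(y) for x, y in part)

    best = min(candidates, key=errors)
    return lambda x: int(x > best)


def vote(models: Sequence[Model], x: float) -> int:
    """Combine the per-worker models by majority vote (an assumption)."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


# Toy data: the label is 1 exactly when the feature exceeds 5.
data = [(float(x), int(x > 5)) for x in range(12)]
parts = hard_partition(data, workers=3)
models = [train_stump(p) for p in parts]
```

Each worker sees only a third of the samples, which is why the abstract's observation about small datasets is plausible: with fewer samples per part, each individual model has less to learn from.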
16	Hypergraphs in the Service of Very Large Scale Query Optimization. Application : Data Warehousing / Les hypergraphes au service de l'optimisation de requêtes à très large échelle. Application : Entrepôt de données Boukorca, Ahcène 12 December 2016 (has links) The emergence of the Big Data phenomenon has led to new, growing and urgent needs to share data between users and communities, which has engendered a large number of queries that DBMSs must handle. This problem has been compounded by other needs for query recommendation and exploration. Since data processing is still possible through solutions for query optimization, physical design and deployment architecture, and since these solutions result from combinatorial problems based on queries, it is essential to review traditional methods to respond to the new scalability needs. This thesis focuses on the problem of numerous queries and proposes a scalable approach, implemented in a framework called Big-queries and based on the hypergraph, a flexible data structure which has great modeling power and allows accurate formulation of many problems of combinatorial scientific computing. This approach is the result of a collaboration with the company Mentor Graphics. It aims to capture query interaction in a unified query plan and to use partitioning algorithms to ensure scalability and to obtain optimal optimization structures (materialized views and data partitioning). The unified plan is also used in the deployment phase of parallel data warehouses, by partitioning data into fragments and allocating these fragments to the corresponding processing nodes. An intensive experimental study showed the value of our approach in terms of algorithm scalability and reduction of query response time. Conception physique Fragmentation de données Vues matérialisées Physical design Data partitioning Materialized views
17	Improving Quality of Experience through Performance Optimization of Server-Client Communication Albinsson, Mattias, Andersson, Linus January 2016 (has links) In software engineering it is important to consider how a potential user experiences the system during usage. No software user will have a satisfying experience if they perceive the system as slow, unresponsive, unstable or hiding information. Additionally, if the system restricts users to a limited set of actions, their experience will degrade further. In order to evaluate the effect these issues have on a user's perceived experience, a measure called Quality of Experience is applied. In this work the foremost objective was to improve how a user experienced a system suffering from the previously mentioned issues when searching for large amounts of data. To achieve this objective the system was evaluated to identify the issues present and which of them affected the user's perceived Quality of Experience the most. The evaluated system was a warehouse management system developed and maintained by Aptean AB's office in Hässleholm, Sweden. The system consisted of multiple clients and a server, sending data over a network. The evaluation took the form of a case study analyzing the system's performance, together with a survey performed by Aptean staff to learn how the system was experienced when searching for large amounts of data. From the results, the three issues impacting Quality of Experience the most were identified: (1) interaction: a limited set of actions during a search; (2) transparency: limited representation of search progress and received data; (3) execution time: search completion taking a long time. After the system was analyzed, hypothesized technological solutions were implemented to resolve the identified issues. 
The first solution divided the data into multiple partitions, the second decreased the size of the data sent over the network by applying compression, and the third was a combination of the two technologies. Following the implementations, a final set of measurements was performed, together with the same survey, to compare the solutions based on their performance and the improvement gained in perceived Quality of Experience. The most significant improvement in perceived Quality of Experience was achieved by the data partitioning solution. While the combination of solutions offered a slight further improvement, it was primarily thanks to data partitioning, making that technology a more suitable solution for the identified issues than compression, which only slightly improved perceived Quality of Experience. When the data was partitioned, updates were sent more frequently, which not only allowed the user a larger set of actions during a search but also improved the information available in the client regarding search progress and received data. While data partitioning did not improve the execution time, it offered the user a first set of data quickly, not forcing the user to wait idly, which made the user experience the system as fast. The results indicated that, to increase the user's perceived Quality of Experience for systems with server-client communication, data partitioning offered several opportunities for improvement. QoE Performance Optimization Data Partitioning Network Communication Upplevd Tjänstekvalitet Prestandaoptimering Datapartitionering Nätverkskommunikation Computer Sciences Datavetenskap (datalogi)
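The effect described in entry 17, where partitioned results let the client show a first set of data and a progress indication before the whole search completes, can be sketched in a few lines of Python. This is only an illustration: the helper `partitioned`, the batch size and the client loop are hypothetical and say nothing about how the Aptean system is actually built.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def partitioned(results: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield results in partitions of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in results:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, partition
        yield batch


# Hypothetical client loop: record progress as each partition arrives.
received: List[int] = []
progress: List[int] = []
for part in partitioned(range(10), size=4):
    received.extend(part)
    progress.append(len(received))
```

The total work is unchanged, which matches the abstract's remark that execution time did not improve; what changes is that the client has something to display after the first partition.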
18	Partitioned Persistent Homology Malott, Nicholas O. January 2020 (has links) No description available. Computer Engineering Persistent Homology Data Reduction Topological Data Analysis Data Mining Data Partitioning Parallel and Distributed Computing
19	Partitioning XML data, towards distributed and parallel management / Méthode de Partitionnement pour le traitement distribué et parallèle de données XML. Malla, Noor 21 September 2012 (has links) With the widespread diffusion of XML as a format for representing data generated and exchanged over the Web, many query and update engines have been designed and implemented in the last decade. One kind of engine playing a crucial role in many applications is the « main-memory » system, which is distinguished by being easy to manage and to integrate into a programming environment. On the other hand, main-memory systems have scalability issues, as they load the entire document into main memory before processing. This thesis presents an XML partitioning technique that allows main-memory engines to process a class of XQuery expressions (queries and updates), which we dub « iterative », on arbitrarily large input documents. We provide a static analysis technique to recognize these expressions. The static analysis is based on paths extracted from the expression and does not need additional schema information. We provide algorithms that use this path information to partition the input documents, so that the query or update can be evaluated separately on each part and the final result computed by simply concatenating the results obtained for each part. These algorithms admit a streaming implementation, whose effectiveness is experimentally validated. Besides enabling scalability, our approach can also be easily implemented in a MapReduce framework, thus enabling parallel query/update evaluation on the partitioned data. XML Requêtes XQuery Mises à jour XQuery Projection Partitionnement de données MapReduce XML XQuery XQuery updates Projection Data partitioning MapReduce
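The evaluation scheme outlined in entry 19 (stream the document, evaluate the query on each part separately, concatenate the partial results) can be illustrated with a much simplified Python sketch. It does not reproduce the thesis's path-based static analysis: it simply assumes the query iterates independently over the direct children of the root, and the names `stream_parts`, `evaluate` and `titles_after_2001` as well as the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Callable, Iterator, List

Part = List[ET.Element]


def stream_parts(source: BinaryIO, size: int) -> Iterator[Part]:
    """Stream the direct children of the root in parts of `size` elements."""
    part: Part = []
    root = None
    depth = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem
            depth += 1
            continue
        depth -= 1
        if depth == 1:  # a direct child of the root has just been closed
            part.append(elem)
            if len(part) == size:
                yield part
                part = []
                root.clear()  # drop children that were already processed
    if part:
        yield part


def evaluate(source: BinaryIO, query: Callable[[Part], List[str]], size: int) -> List[str]:
    """Evaluate `query` on each part and concatenate the partial results."""
    result: List[str] = []
    for part in stream_parts(source, size):
        result.extend(query(part))
    return result


def titles_after_2001(part: Part) -> List[str]:
    """Hypothetical iterative query: titles of books published after 2001."""
    return [b.findtext("title") for b in part if int(b.findtext("year")) > 2001]


books = "".join(
    f"<book><title>T{i}</title><year>{2000 + i}</year></book>" for i in range(5)
)
doc = io.BytesIO(f"<lib>{books}</lib>".encode("utf-8"))
out = evaluate(doc, titles_after_2001, size=2)
```

Because each part is handled independently, the per-part calls to `query` could in principle be distributed across workers, which is the property the abstract relates to MapReduce.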
20	PARTICIONAMENTO DE CONJUNTO DE DADOS E SELEÇÃO DE VARIÁVEIS EM PROBLEMAS DE CALIBRAÇÃO MULTIVARIADA Alves, André Luiz 22 September 2017 (has links) The objective of this work is to compare a proposed algorithm based on the RANdom SAmple Consensus (RANSAC) method for selection of samples, selection of variables and simultaneous selection of samples and variables with the Successive Projections Algorithm (SPA), using a chemical data set in the context of multivariate calibration. The proposed method is based on the RANSAC method and Multiple Linear Regression (MLR). The predictive capacity of the models is measured using the Root Mean Square Error of Prediction (RMSEP). The results show that the Successive Projections Algorithm improves the predictive capacity of RANSAC. It is concluded that the SPA positively influences the RANSAC algorithm for selection of samples, for selection of variables and also for simultaneous selection of samples and variables. ENGENHARIAS::ENGENHARIA DE PRODUCAO
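For reference, the RMSEP named in entry 20 as the measure of predictive capacity is conventionally defined as below. The symbols are chosen here for illustration and are not taken from the thesis: $y_i$ is the reference value of the $i$-th prediction sample, $\hat{y}_i$ is the value predicted by the model, and $N$ is the number of prediction samples.

```latex
\mathrm{RMSEP} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^{2}}
```

A lower RMSEP indicates predictions closer to the reference values.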

Search results