Global ETD Search

71	Monitoring and control of distributed web services on cloud computing infrastructure / Παρακολούθηση και έλεγχος κατανεμημένων δικτυακών υπηρεσιών σε υπολογιστική αρχιτεκτονική νέφους Δεχουνιώτης, Δημήτριος 26 August 2014 (has links) This thesis addresses two main research areas concerning distributed web services deployed on cloud computing infrastructure. The first area is the monitoring of cloud computing infrastructure. In chapter 2, a novel general technique is used to infer relationships between different service components in a data center. The approach relies on a small set of fuzzy rules, produced by a hybrid genetic algorithm with a high classification rate. Furthermore, the strength of the detected dependencies is measured. Although the ground truth about relationships in a network is not known, the proposed method mines realistic relationships without any prior information about network topology and infrastructure. This approach can be a useful monitoring tool that gives administrators a clear view of what is happening in the underlying network. Finally, because of the simplicity of the algorithm and the flexibility of FIM, an online approach seems feasible. The second major problem, addressed in chapter 3, is the automated resource control of consolidated web applications on cloud computing infrastructure. ACRA is an innovative technique for modeling and controlling distributed services that are co-located on a server cluster. The system dynamics are modeled by a group of linear state-space models, which cover the full range of workload conditions. Because workload conditions vary, there are nonlinear terms and uncertainties, which are modeled by an additive term in the local linear models. Because there are several types of service transactions with varying time and resource demands, there are many candidate reference values for the SLOs during a day. 
Given these requirements and the workload conditions, we choose the appropriate model and compute the closest feasible operating point according to several optimization criteria. Then, using a set-theoretic technique, a state feedback controller is designed that drives the system into the region of the equilibrium point and stabilizes it there. The ACRA controller computes a positively invariant set in the state space, which includes the target set, and drives the system trajectories into it. This provides a stability guarantee and a high level of robustness against system disturbances and nonlinearities. Furthermore, we compare ACRA with an MPC and a PI controller, and the results are very promising, since our solution outperforms both. Secondly, a unified local-level modeling and control framework for consolidated web services in a server cluster is presented, which can be a vital element of a holistic distributed control platform. Admission control and resource allocation are addressed as a common decision problem. Stability and constraint satisfaction are guaranteed. A real testbed was built, and from a range of examples in different operating conditions we conclude that both the identification scheme and the controller provide a high level of QoS. A novel component of this approach is the determination of a set of feasible operating (equilibrium) points, which allows the appropriate equilibrium point to be chosen depending only on the objective, such as maximizing throughput, minimizing consumption, or maximizing profit. Evaluation shows that our approach performs well compared to well-known solutions, such as queuing models and the measurement approach to equilibrium points. Both controllers achieve their main targets with respect to the studies already proposed in the literature. Firstly, they satisfy the SLA requirements and the constraints of the underlying cloud computing infrastructure. 
To the best of our knowledge, they are the only studies that calculate a set of feasible operating points that ensure system stability. Furthermore, they adopt modern control theory and, beyond the stability guarantee, they introduce new control properties such as positively invariant sets, ultimate boundedness and e-contractive sets. / This doctoral thesis addresses two research problems. First, a technique for monitoring network traffic is developed in order to find functional relationships between the various parts of a networked application. The second part solves the problem of automated resource distribution among network applications that share a common cloud computing environment. Relative to the existing literature, the aim of the first chapter is to create a network traffic analysis tool that makes the functional relationships between the parts of distributed network services understandable. This graph is a primary tool for many administrator tasks in the fields of performance analysis and root cause analysis, for example detecting faulty installations or network attacks, and planning the expansion or modification of cloud infrastructure. The second part of the thesis deals with the automated allocation of the computing resources of a cloud data center to a set of deployed network applications. Modern virtualization technology is the main enabler of the consolidation of many distributed services in cloud data centers. The admission control and resource allocation framework (ΕΑΚΠ in the Greek text) is an autonomous modeling and control framework that provides accurate models and solves, in a unified way, the admission control and resource allocation problems of network applications consolidated in cloud data centers. 
The goal of this framework is to maximize the admission of user requests to the provided service while also meeting the prescribed quality of service (QoS) requirements. The second local controller presented in this thesis is an autonomous modeling and control framework for distributed network applications in a cloud environment, which solves the admission control and resource allocation problems simultaneously and in a unified way. Cloud computing Network monitoring Control theory 004.678 2 Θεωρία ελέγχου
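The state feedback idea described in record 71 can be illustrated on a toy model. The sketch below is not ACRA: it assumes a single scalar linear model x[k+1] = a*x[k] + b*u[k] with made-up coefficients, whereas the abstract describes a group of state-space models and a set-theoretic design. The function name, the gain and all numbers are hypothetical.

```python
def simulate(a, b, gain, x0, x_eq, steps):
    """Simulate x[k+1] = a*x[k] + b*u[k] under state feedback around x_eq.

    Hypothetical scalar stand-in for one local linear model; it only
    shows how feedback on the deviation pulls the state to an equilibrium.
    """
    # Input that holds the system at rest at x_eq: x_eq = a*x_eq + b*u_eq.
    u_eq = (1.0 - a) * x_eq / b
    x = x0
    errors = [abs(x - x_eq)]
    for _ in range(steps):
        u = u_eq - gain * (x - x_eq)  # state feedback on the deviation
        x = a * x + b * u
        errors.append(abs(x - x_eq))
    return x, errors


# Assumed example values: the closed-loop factor is a - b*gain = 0.4.
final_x, errors = simulate(a=0.9, b=0.5, gain=1.0, x0=12.0, x_eq=2.0, steps=20)
```

In this toy loop the deviation from x_eq is multiplied by a - b*gain at every step, so whenever that factor has magnitude below 1, every interval centred on x_eq is positively invariant: a trajectory that starts inside it never leaves.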
72	Ερωτήματα διαστημάτων σε περιβάλλοντα νεφών υπολογιστών Σφακιανάκης, Γεώργιος 04 February 2014 (has links) Τα νέφη υπολογιστών γίνονται ολοένα και πιο σημαντικά για εφαρμογές διαχείρισης δεδομένων, λόγω της δυνατότητας που προσφέρουν για διαχείριση πολύ μεγάλου όγκου δεδομένων. Καθημερινά προκύπτουν νέα προβλήματα, που η λύση τους απαιτεί αποδοτικές και κλιμακώσιμες εφαρμογές για την επεξεργασία αυτού του τεράστιου όγκου πληροφορίας. Κεντρικό ρόλο σε αυτόν τον τομέα κατέχουν τα συστήματα αποθήκευσης κλειδιού-τιμής σε νέφη υπολογιστών (cloud key-value stores), καθώς και συστήματα παράλληλης επεξεργασίας μεγάλης ποσότητας δεδομένων όπως το MapReduce. Τα ερωτήματα διαστημάτων εμφανίζονται συχνά σε πραγματικές εφαρμογές. Η εργασία αυτή ασχολείται με ερωτήματα διαστημάτων σε περιβάλλοντα νεφών υπολογιστών με κορυφαία εφαρμογή τα χρονικά ερωτήματα (temporal queries). Τέτοια ερωτήματα επικεντρώνονται συνήθως στο να απαντήσουν ποια γεγονότα συνέβησαν ή συνέβαιναν κατά την διάρκεια ενός χρονικού διαστήματος. ́Ομως τα παραδοσιακά συστήματα για τη διαχείριση τέτοιου είδους ερωτημάτων δεν μπορούν να αντεπεξέλθουν στον όγκο δεδομένων που παράγονται τη σημερινή εποχή από ορισμένες εφαρμογές, με αποτέλεσμα να μην υπάρχει μία αποδοτική λύση. Για να αντιμετωπιστεί το πρόβλημα αυτό προτείνεται η χρήση συστημάτων νεφών υπολογιστών, τέτοιων που θα καταστήσουν διαχειρίσιμο αυτόν τον τεράστιο όγκο δεδομένων. Τα υπάρχοντα, όμως, έως σήμερα συστήματα νεφών υπολογιστών δεν διαθέτουν τη δυνατότητα υποστήριξης τέτοιου είδους ερωτημάτων. Στην εργασία αυτή, αρχικά, μελετήθηκε το πρόβλημα και οι σχετικές λύσεις που είχαν προταθεί παλαιότερα, όπως πχ. τα δέντρα ευθυγράμμων τμημάτων (Segment trees). Αυτές οι δομές επιτρέπουν την απάντηση των ερωτημάτων που περιγράφονται παραπάνω με αποδοτικό τρόπο. 
Στη συνέχεια μελετήθηκε η δυνατότητα εφαρμογής τους σε περιβάλλοντα νεφών υπολογιστών, ενώ διερευνήθηκαν πιθανές εναλλακτικές λύσεις που θα εκμεταλλεύονται καλύτερα τις δυνατότητες που προσφέρουν τα συστήματα αυτά. Η μελέτη αυτή οδήγησε στην δημιουργία νέων δομών δεδομένων και αλγορίθμων, ή τροποποιήσεις των υπαρχόντων, που βοηθούν στην αποδοτική επίλυση του προβλήματος. Τέλος πραγματοποιήθηκε σύγκριση της απόδοσης των λύσεων και τον αλγορίθμων που προτείνονται με τις ήδη υπάρχουσες. Τα αποτελέσματα της σύγκρισης έδειξαν βελτίωση του χρόνου εκτέλεσης έως και μία τάξης μεγέθους σε μερικές περιπτώσεις. / The cloud is becoming increasingly important for data management applications, as it can seamlessly handle huge amounts of data. New problems arise daily and can only be solved with efficient and scalable applications that can process these data. Cloud key-value storage systems play a crucial role in this field, along with systems like MapReduce that process huge amounts of data in a distributed fashion. One frequently occurring problem is supporting interval queries, for which an efficient solution is lacking in the field of cloud key-value stores. This thesis deals with this problem, and more specifically with temporal queries. Such queries try to answer what happened during a specific time range. In recent years, however, the volume of data produced by some applications has exploded, rendering traditional systems incapable of handling it. To handle this amount of data, the use of cloud key-value stores is suggested. But these systems have no special functionality for answering such queries. First, this thesis studies older solutions, such as segment trees. These kinds of data structures can answer the queries described above efficiently. 
After that, we studied whether these data structures can be deployed on top of cloud key-value stores, and investigated other solutions that could take better advantage of these systems. Finally, the efficiency of the new methods is compared with that of existing ones. The comparison showed up to an order of magnitude improvement in execution time in some cases. Νέφη υπολογιστών 004.678 2 Cloud computing Distributed systems Interval queries
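To make the interval-query problem of record 72 concrete, here is a minimal sketch of one way to answer overlap queries on top of a plain key-value mapping. It is not the data structure proposed in the thesis and it is not a segment tree: it assumes non-negative integer timestamps and a fixed, hypothetical bucket width, and it emulates the key-value store with an in-memory dictionary.

```python
from collections import defaultdict

BUCKET_WIDTH = 10  # hypothetical bucket size, in the same unit as the timestamps


def build_index(events):
    """Map each bucket id to the events whose interval [start, end] touches it."""
    index = defaultdict(list)
    for name, start, end in events:
        for bucket in range(start // BUCKET_WIDTH, end // BUCKET_WIDTH + 1):
            index[bucket].append((name, start, end))
    return index


def query(index, q_start, q_end):
    """Return the names of events overlapping the closed interval [q_start, q_end]."""
    hits = set()
    for bucket in range(q_start // BUCKET_WIDTH, q_end // BUCKET_WIDTH + 1):
        for name, start, end in index.get(bucket, []):
            # Two closed intervals overlap when each starts before the other ends.
            if start <= q_end and q_start <= end:
                hits.add(name)
    return sorted(hits)


# Assumed sample data: (event name, start time, end time).
events = [("a", 0, 5), ("b", 3, 25), ("c", 30, 40)]
index = build_index(events)
```

Each bucket id plays the role of a key, so a query only reads the keys that cover its time range. The trade-off is that a long interval is stored once per bucket it spans.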
73	Achieving privacy-preserving distributed statistical computation Liu, Meng-Chang January 2012 (has links) The growth of the Internet has opened up tremendous opportunities for cooperative computations where the results depend on the private data inputs of distributed participating parties. In most cases, such computations are performed by multiple mutually untrusting parties. This has led the research community into studying methods for performing computation across the Internet securely and efficiently. This thesis investigates security methods in the search for an optimum solution to privacy- preserving distributed statistical computation problems. For this purpose, the nonparametric sign test algorithm is chosen as a case for study to demonstrate our research methodology. Two privacy-preserving protocol suites using data perturbation techniques and cryptographic primitives are designed. The first protocol suite, i.e. the P22NSTP, is based on five novel data perturbation building blocks, i.e. the random probability density function generation protocol (RpdfGP), the data obscuring protocol (DOP), the secure two-party comparison protocol (STCP), the data extraction protocol (DEP) and the permutation reverse protocol (PRP). This protocol suite enables two parties to efficiently and securely perform the sign test computation without the use of a third party. The second protocol suite, i.e. the P22NSTC, uses an additively homomorphic encryption scheme and two novel building blocks, i.e. the data separation protocol (DSP) and data randomization protocol (DRP). With some assistance from an on-line STTP, this protocol suite provides an alternative solution for two parties to achieve a secure privacy-preserving nonparametric sign test computation. These two protocol suites have been implemented using MATLAB software. 
Their implementations are evaluated and compared against the sign test computation algorithm on an ideal trusted third party model (TTP-NST) in terms of security, computation and communication overheads and protocol execution times. By managing the level of noise data item addition, the P22NSTP can achieve specific levels of privacy protection to fit particular computation scenarios. Alternatively, the P22NSTC provides a more secure solution than the P22NSTP by employing an on-line STTP. The level of privacy protection relies on the use of an additively homomorphic encryption scheme, DSP and DRP. A four-phase privacy-preserving transformation methodology has also been demonstrated; it includes data privacy definition, statistical algorithm decomposition, solution design and solution implementation. 004.678
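For reference, the statistic that the protocols in record 73 protect can be sketched in a few lines. The code below is only the plain sign test, as an ideal trusted third party holding both inputs might compute it; it contains none of the perturbation or homomorphic encryption steps, and the function name and return shape are assumptions.

```python
from math import comb


def sign_test(xs, ys):
    """Two-sided sign test on paired samples; tied pairs are discarded.

    Returns (number of positive differences, number of negative
    differences, p-value under the null hypothesis of a fair coin).
    """
    n_plus = sum(1 for x, y in zip(xs, ys) if x > y)
    n_minus = sum(1 for x, y in zip(xs, ys) if x < y)
    n = n_plus + n_minus
    if n == 0:
        return n_plus, n_minus, 1.0
    k = min(n_plus, n_minus)
    # Probability of seeing at most k of the rarer sign out of n fair trials.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_plus, n_minus, min(1.0, 2 * tail)
```

In a two-party setting one party would hold xs and the other ys, which is why the comparison of each pair is the step that needs protecting.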
74	Flots de liens pour la modélisation d'interactions temporelles et application à l'analyse de trafic IP / Link streams for modelling interactions over time and application to the analysis of ip traffic Viard, Tiphaine 29 September 2016 (has links) Les interactions sont partout : il peut s'agir de contacts entre individus, d'emails, d'appels téléphoniques, de trafic IP, d'achats en ligne, d'exécution de code, etc. Les interactions peuvent être dirigées, pondérées, enrichies d'informations supplémentaires, cependant, dans tous les cas, une interaction signifie que deux entités u et v ont interagi du temps b au temps e : par exemple, deux individus u et v se rencontrent du temps b au temps e, deux machines sur un réseau démarrent une session IP du temps b au temps e, deux personnes u et v se téléphonent du temps b au temps e, etc.Dans cette thèse, nous explorons une nouvelle approche visant à modéliser les interactions directement comme des flots de liens, c'est-à-dire des séquences de quadruplets (b,e,u,v) signifiant que u et v ont interagi du temps b au temps e. Nous posons les fondations du formalisme correspondant. Afin de valider notre travail théorique, nous nous concentrons sur l'analyse de trafic IP. Il est en effet crucial pour nous d'effectuer des aller-retours constants entre théorie et pratique : les cas pratiques doivent nourrir notre réflexion théorique, et, en retour, les outils formels doivent être conçus de façon à être appliqués de la manière la plus générale.Nous appliquons notre formalisme à l'analyse de trafic IP, dans le but de valider la pertinence de notre formalisme for l'analyse de trafic IP, ainsi que comme méthodologie de détection d'événements. Nous élaborons une méthode permettant d'identifier des événements recouvrant plusieurs échelles de temps, et l'appliquons à une trace de trafic issue du jeu de données MAWI. 
/ Interactions are everywhere: face-to-face contacts, emails, phone calls, IP traffic, online purchases, running code, and many others. Interactions may be directed, weighted, or enriched with supplementary information, yet the baseline remains: in all cases, an interaction means that two entities u and v interact from time b to time e. For instance, two individuals u and v meet from time b to time e, two machines on a network start an IP session from time b to time e, two persons u and v phone each other from time b to time e, and so on. In this thesis, we explore a new approach that consists of modelling interactions directly as link streams, i.e. series of quadruplets (b, e, u, v) meaning that u and v interacted from time b to time e, and we develop the basis of the corresponding formalism. In order to guide and assess this fundamental work, we focus on the analysis of IP traffic. It is particularly important to us that we make both fundamental and applied progress: application cases should feed our theoretical thinking, and formal tools should be designed to be meaningful on application cases in the most general way. We apply our framework to the analysis of IP traffic, with the aim of assessing the relevance of link streams for describing IP traffic as well as for finding events inside the traffic. We devise a method to identify events at different scales, and apply it to a trace of traffic from the MAWI dataset. Flots de liens Trafic IP Interactions temporelles Modélisation Réseaux complexes Link streams IP traffic Time interactions 004.678
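The quadruplet representation in record 74 maps directly onto a list of tuples. The sketch below stores a link stream as (b, e, u, v) entries and answers two simple questions about it; the helper names and the sample data are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def active_links(stream, t):
    """Return the (u, v) pairs whose interaction interval [b, e] contains time t."""
    return sorted((u, v) for b, e, u, v in stream if b <= t <= e)


def contact_time(stream):
    """Sum the interaction durations per unordered pair of entities.

    Overlapping intervals for the same pair are counted twice; merging
    them first would be needed for an exact measure.
    """
    total = defaultdict(float)
    for b, e, u, v in stream:
        total[tuple(sorted((u, v)))] += e - b
    return dict(total)


# Assumed sample stream: each entry is (begin, end, u, v).
stream = [(0, 4, "a", "b"), (2, 6, "b", "c"), (5, 7, "b", "a")]
```

Keeping the intervals explicit, rather than slicing time into snapshots, is what lets the same data be queried at any time scale.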
75	Advanced techniques for Web service query optimization / Techniques avancées pour l’optimisation de requêtes de services Web Benouaret, Karim 09 October 2012 (has links) De nos jours, nous assistons à l’émigration du Web de données vers le Web orienté services. L’amélioration des capacités et fonctionnalités des moteurs actuels de recherche sur le Web, par des techniques efficaces de recherche et de sélection de services, devient de plus en plus importante. Dans cette thèse, dans un premier temps, nous proposons un cadre de composition de services Web en tenant compte des préférences utilisateurs. Le modèle fondé sur la théorie des ensembles flous est utilisé pour représenter les préférences. L’approche proposée est basée sur une version étendue du principe d’optimalité de Pareto. Ainsi, la notion des top-k compositions est introduite pour répondre à des requêtes utilisateurs de nature complexe. Afin d’améliorer la qualité de l’ensemble des compositions retournées, un second filtre est appliqué à cet ensemble en utilisant le critère de diversité. Dans un second temps, nous avons considéré le problème de la sélection des services Web en présence de préférences émanant de plusieurs utilisateurs. Une nouvelle variante, appelée Skyline de services à majorité, du Skyline de services traditionnel est défini. Ce qui permet aux utilisateurs de prendre une décision « démocratique » conduisant aux services les plus appropriés. Un autre type de Skyline de services est également discuté dans cette thèse. Il s’agit d’un Skyline de Services de nature graduelle et se fonde sur une relation de dominance floue. Comme résultat, les services Web présentant un meilleur compromis entre les paramètres QoS sont retenus, alors que les services Web ayant un mauvais compromis entre les QoS sont exclus. Finalement, nous avons aussi absorbé le cas où les QoS décrivant les services Web sont entachés d’incertitude. La théorie des possibilités est utilisée comme modèle de l’incertain. 
Ainsi, un Skyline de Services possibilité est proposé pour permettre à l’utilisateur de sélectionner les services Web désirés en présence de QoS incertains. De riches expérimentations ont été conduites afin d’évaluer et de valider toutes les approches proposées dans cette thèse / As we move from a Web of data to a Web of services, enhancing the capabilities of current Web search engines with effective and efficient techniques for Web service retrieval and selection becomes an important issue. In this dissertation, we present a framework that identifies the top-k Web service compositions according to the user's fuzzy preferences, based on a fuzzification of the Pareto dominance relationship. We also provide a method to improve the diversity of the top-k compositions. An efficient algorithm is proposed for each method. We evaluate our approach through a set of thorough experiments. After that, we consider the problem of Web service selection under the preferences of multiple users. We introduce a novel concept for this problem, called the majority service skyline, based on the majority rule. This allows users to make a “democratic” decision on which Web services are the most appropriate. We develop a suitable algorithm for computing the majority service skyline. We conduct a set of thorough experiments to evaluate the effectiveness of the majority service skyline and the efficiency of our algorithm. We then propose the notion of α-dominant service skyline, based on a fuzzification of the Pareto dominance relationship, which allows the inclusion of Web services with a good compromise between QoS parameters and the exclusion of Web services with a bad compromise between QoS parameters. We develop an efficient algorithm based on the R-Tree index structure for computing the α-dominant service skyline. We evaluate the effectiveness of the α-dominant service skyline and the efficiency of the algorithm through a set of experiments. 
Finally, we consider the uncertainty of the QoS delivered by Web services. We model each uncertain QoS attribute using a possibility distribution, and we introduce the notions of pos-dominant service skyline and nec-dominant service skyline, which help users select their desired Web services in the presence of uncertainty in their QoS. We then develop appropriate algorithms to efficiently compute both the pos-dominant service skyline and the nec-dominant service skyline. We conduct extensive sets of experiments to evaluate the proposed service skyline extensions and algorithms. Skyline Top-k Préférences Sélection des services QoS Composition de services Skyline Top-k Preferences Service selection QoS Service composition 004.678
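The skyline variants in record 75 all build on the classical Pareto dominance test. As a baseline, the sketch below computes the ordinary (crisp) service skyline by pairwise comparison; it does not implement the majority, α-dominant, pos-dominant or nec-dominant extensions, and it assumes that every QoS value is a cost where lower is better. The service names and values are made up.

```python
def dominates(a, b):
    """True if QoS vector a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(services):
    """Return the names of services that no other service dominates."""
    return sorted(
        name
        for name, qos in services.items()
        if not any(dominates(other, qos) for other in services.values())
    )


# Assumed sample data: (response time, price), lower is better for both.
services = {"s1": (1, 5), "s2": (2, 2), "s3": (3, 3), "s4": (5, 1)}
```

Here s3 is dropped because s2 is better on both attributes, while s1, s2 and s4 each win on at least one attribute and so remain. The quadratic scan is fine for a handful of services; index structures such as the R-Tree mentioned in the abstract address larger sets.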
76	Modélisation sémantique du cloud computing : vers une composition de services DaaS à sémantique incertaine / Semantic modeling for cloud computing : toward Daas service composition with uncertain semantics Malki, Abdelhamid 23 April 2015 (has links) Avec l'émergence du mouvement Open Data, des centaines de milliers de sources de données provenant de divers domaines (e.g., santé, gouvernementale, statistique, etc.) sont maintenant disponibles sur Internet. Ces sources de données sont accessibles et interrogées via des services cloud DaaS, et cela afin de bénéficier de la flexibilité, l'interopérabilité et la scalabilité que les paradigmes SOA et Cloud Computing peuvent apporter à l'intégration des données. Dans ce contexte, les requêtes sont résolues par la composition de plusieurs services DaaS. Définir la sémantique des services cloud DaaS est la première étape vers l'automatisation de leur composition. Une approche intéressante pour définir la sémantique des services DaaS est de les décrire comme étant des vues sémantiques à travers une ontologie de domaine. Cependant, la définition de ces vues sémantiques ne peut pas être toujours faite avec certitude, surtout lorsque les données retournées par un service sont trop complexes. Dans cette thèse, nous proposons une approche probabiliste pour représenter les services DaaS à sémantique incertaine. Dans notre approche, un service DaaS dont la sémantique est incertaine est décrit par plusieurs vues sémantiques possibles, chacune avec une probabilité. Les services ainsi que leurs vues sémantiques possibles sont représentées dans un registre de services probabiliste (PSR). Selon les dépendances qui existent entre les services, les corrélations dans PSR peuvent être représentées par deux modèles différents : le modèle Bloc-indépendant-disjoint (BID), et le modèle à base des réseaux bayésiens. 
En se basant sur nos modèles probabilistes, nous étudions le problème de l'interprétation d'une composition existante impliquant des services à sémantique incertaine. Nous étudions aussi le problème de la réécriture de requêtes à travers les services DaaS incertains, et nous proposons des algorithmes efficaces permettant de calculer les différentes compositions possibles ainsi que leurs probabilités. Nous menons une série d'expérimentation pour évaluer la performance de nos différents algorithmes de composition. Les résultats obtenus montrent l'efficacité et la scalabilité de nos solutions proposées / With the emergence of the Open Data movement, hundreds of thousands of datasets from various domains (e.g., healthcare, government, statistics) are now freely available on the Internet. A good portion of these datasets are accessed and queried through Cloud DaaS services to benefit from the flexibility, interoperability and scalability that the SOA and Cloud Computing paradigms bring to data integration. In this context, users' queries often require the composition of multiple Cloud DaaS services to be answered. Defining the semantics of DaaS services is the first step towards automating their composition. An interesting approach to defining the semantics of DaaS services is to describe them as semantic views over a domain ontology. However, defining such semantic views cannot always be done with certainty, especially when the data returned by a service are too complex. In this dissertation, we propose a probabilistic approach to model the semantic uncertainty of data services. In our approach, a DaaS service with uncertain semantics is described by several possible semantic views, each associated with a probability. 
Services along with their possible semantic views are represented in a probabilistic service registry (PSR). Depending on the dependencies between services, the correlations in the PSR can be represented by two different models: the Block-Independent-Disjoint model (denoted BID) and a directed probabilistic graphical model (Bayesian network). Based on our modeling, we study the problem of interpreting an existing composition involving services with uncertain semantics. We also study the problem of composing uncertain DaaS services to answer a user query, and propose efficient methods to compute the different possible compositions and their probabilities. We conduct a series of experiments to evaluate the performance of our composition algorithms. The results show the efficiency and scalability of the proposed solutions. Service Daas Vue sémantique Sémantique incertaine Sémantique corrélée Daas service Semantics Uncertain semantic Correlated semantic 004.678 2
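Under the Block-Independent-Disjoint reading of record 76, each service is one block: its candidate semantic views are mutually exclusive, and different services are independent. The probability of one interpretation of a composition is then the product of the chosen views' probabilities. The sketch below enumerates those interpretations; the registry layout, the function name and the sample probabilities are assumptions, and the Bayesian network model is not covered.

```python
from itertools import product


def interpretations(registry, composition):
    """List every (views, probability) pair for a composition under BID.

    registry maps a service name to {view name: probability}; the views
    of one service are disjoint alternatives, and services are independent.
    """
    blocks = [sorted(registry[service].items()) for service in composition]
    result = []
    for combo in product(*blocks):
        views = tuple(view for view, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p  # independence across blocks
        result.append((views, prob))
    return result


# Assumed sample registry with two uncertain services.
registry = {"S1": {"v1": 0.7, "v2": 0.3}, "S2": {"w1": 0.5, "w2": 0.5}}
```

When each block's probabilities sum to one, so do the probabilities of all interpretations, which gives a quick sanity check on a registry.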
77	Recommandation personnalisée hybride / Hybrid personalized recommendation Ben Ticha, Sonia 11 November 2015 (has links) Face à la surabondance des ressources et de l'information sur le net, l'accès aux ressources pertinentes devient une tâche fastidieuse pour les usagers de la toile. Les systèmes de recommandation personnalisée comptent parmi les principales solutions qui assistent l'utilisateur en filtrant les ressources, pour ne lui proposer que celles susceptibles de l’intéresser. L’approche basée sur l’observation du comportement de l’utilisateur à partir de ses interactions avec le e-services est appelée analyse des usages. Le filtrage collaboratif et le filtrage basé sur le contenu sont les principales techniques de recommandations personnalisées. Le filtrage collaboratif exploite uniquement les données issues de l’analyse des usages alors que le filtrage basé sur le contenu utilise en plus les données décrivant le contenu des ressources. Un système de recommandation hybride combine les deux techniques de recommandation. L'objectif de cette thèse est de proposer une nouvelle technique d'hybridation en étudiant les bénéfices de l'exploitation combinée d'une part, des informations sémantiques des ressources à recommander, avec d'autre part, le filtrage collaboratif. Plusieurs approches ont été proposées pour l'apprentissage d'un nouveau profil utilisateur inférant ses préférences pour l’information sémantique décrivant les ressources. Pour chaque approche proposée, nous traitons le problème du manque de la densité des données et le problème du passage à l’échelle. Nous montrons également, de façon empirique, un gain au niveau de la précision des recommandations par rapport à des approches purement collaboratives ou purement basées sur le contenu / Faced with the ongoing rapid expansion of the Internet, users need help to access items that may interest them. 
A personalized recommender system filters relevant items from a huge catalogue for a particular user by observing his or her behavior. The approach based on observing user behavior from interactions with the website is called usage analysis. Collaborative filtering and content-based filtering are the most widely used techniques in personalized recommender systems. Collaborative filtering uses only data from usage analysis to build the user profile, while content-based filtering relies in addition on semantic information about items. The hybrid approach is another important technique, which combines collaborative and content-based methods to provide recommendations. The aim of this thesis is to present a new hybridization approach that takes into account the semantic information of items to enhance collaborative recommendations. Several approaches are proposed for learning a new user profile that infers the user's preferences for the semantic information describing items. For each proposed approach, we address the sparsity and scalability problems. We also show, empirically, an improvement in recommendation accuracy over purely collaborative and purely content-based filtering. Recommandation personnalisée Filtrage collaboratif Contenu des ressources Profil sémantique de l’utilisateur Personalized recommendation Collaborative filtering Item content User semantic profile 004.678
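The collaborative half of the hybrid scheme in record 77 can be sketched with a textbook user-based predictor. This is a generic baseline, not the hybridization proposed in the thesis: it ignores item semantics entirely, assumes strictly positive ratings, and uses cosine similarity over co-rated items. The users, items and ratings are invented.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)  # ratings are assumed non-zero


def predict(ratings, user, item):
    """Similarity-weighted average of the other users' ratings for item."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = cosine(ratings[user], theirs)
        num += weight * theirs[item]
        den += weight
    return num / den if den else None


# Assumed sample usage data: user -> {item: rating}.
ratings = {
    "ann": {"i1": 5, "i2": 3},
    "bob": {"i1": 5, "i2": 3, "i3": 4},
    "cat": {"i1": 1, "i2": 5, "i3": 2},
}
```

The prediction for ann on i3 leans towards bob's rating because their co-rated items agree. With sparse data few items are co-rated, which is the sparsity problem the abstract refers to.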
78	Contributions au déploiement sécurisé de processus métiers dans le cloud / Contribution to the secure deployment of business processes in the cloud Ahmed Nacer, Amina 26 February 2019 (has links) L’évolution et l’accroissement actuels des technologies amènent les entreprises à vouloir se développer plus rapidement afin de rester compétitives et offrir des services à la pointe de la technologie, répondant aux besoins du marché. En effet, les entreprises étant sujettes à des changements assez fréquents requièrent un haut niveau de flexibilité et d’agilité. La gestion des processus métiers (BPM) leur permet dans ce sens de mieux appréhender et gérer leurs processus. Par ailleurs, l’apparition du Cloud Computing et de tous ses bénéfices (flexibilité et partage, coût optimisé, accessibilité garantie...etc) le rendent particulièrement attrayant. Ainsi, l’association de ces deux concepts permet aux entreprises de renflouer leur capital. Cependant, l’utilisation du cloud implique également de nouvelles exigences en terme de sécurité, qui découlent de son environnement partagé, et qui mettent un frein à sa large adoption. Le travail de cette thèse consiste à proposer des concepts et outils pour aider et guider les entreprises dans le déploiement de leurs processus dans un environnement cloud en toute sécurité. Une première contribution est un algorithme d’obfuscation permettant d’automatiser la décomposition et le déploiement des processus sans intervention humaine, en se basant sur la nature des fragments. Cet algorithme limite le taux d’informations sur chaque cloud à travers un ensemble de contraintes de séparation, permettant de déployer les fragments considérés comme étant sensibles sur différents clouds. La seconde contribution de cette thèse consiste à complexifier la structure du processus afin de limiter le risque de coalition de clouds. Ceci se fait à travers l’introduction de faux fragments à certains endroits stratégiques du processus. 
L’objectif étant de rendre les collaborations générées plus résistantes aux attaques, et par conséquent de réduire la probabilité de coalition. Même si les opérations d’obfuscation et de complexification protègent le savoir-faire des entreprises lors d’un déploiement cloud, un risque subsiste toujours. Dans ce contexte, cette thèse propose également un modèle de risque permettant d’évaluer et de quantifier les risques de sécurité auxquels restent exposés les processus après déploiement. L’objectif de ce modèle est de combiner les informations de sécurité avec d’autres dimensions de la qualité de service tel que le coût, pour la sélection de configurations optimisées. Les approches proposées sont implémentées et testées à travers différentes configurations de processus. Leur validité est vérifiée à travers un ensemble de métriques dont l’objectif est de mesurer la complexité des processus après l’opération d’obfuscation ainsi que le niveau de risque subsistant / The fast evolution and development of technologies lead companies to grow faster in order to remain competitive and to offer cutting-edge services that meet today’s market needs. Indeed, companies that are subject to frequent changes require a high level of flexibility and agility. Business Process Management (BPM) allows them to better manage their processes. Moreover, the emergence of Cloud Computing and all its advantages (flexibility and sharing, optimized cost, guaranteed accessibility, etc.) makes it particularly attractive. Thus, the combination of these two concepts allows companies to replenish their capital. However, the use of the cloud also implies new security requirements, which stem from its shared environment and slow down its widespread adoption. The objective of this thesis is to propose concepts and tools that help and guide companies to deploy their processes safely in a cloud environment. 
A first contribution is an obfuscation algorithm that automates the decomposition and deployment of processes without any human intervention, based on the nature of the fragments. This algorithm limits the amount of information on each cloud through a set of separation constraints, which allow fragments considered sensitive to be deployed on different clouds. The second contribution of this thesis consists in complicating the structure of the process in order to limit the risk of cloud coalition. This is done through the introduction of fake fragments at certain strategic points in the process. The goal is to make the generated collaborations more resistant to attacks, and thus to reduce the likelihood of coalition. Even if obfuscation and complexification operations protect companies’ know-how during a cloud deployment, a risk remains. In this context, this thesis also proposes a risk model for evaluating and quantifying the security risks to which the processes remain exposed after deployment. The purpose of this model is to combine security information with other dimensions of quality of service, such as cost, for the selection of optimized configurations. The proposed approaches are implemented and tested through different process configurations. Their validity is verified through a set of metrics whose objective is to measure the complexity of the processes as well as the remaining risk level after obfuscation. Cloud Computing; Processus Métiers; Obfuscation; Gestion du risque; Cloud Computing; Business Processes; Obfuscation; Risk management; 005.8; 658.403 8011; 004.678 2
79	Software Datapaths for Multi-Tenant Packet Processing / Plans de données logiciels pour les traitements réseaux en environnements partagés Chaignon, Paul 07 May 2019 En environnement multi-tenant, les réseaux s'appuient sur un ensemble de ressources matérielles partagées pour permettre à des applications isolées de communiquer avec leurs clients. Cette isolation est garantie par un ensemble de mécanismes à la bordure des réseaux: les mêmes serveurs hébergeant les machines virtuelles doivent notamment déterminer le destinataire approprié pour chaque paquet réseau, copier ces derniers entre zones mémoires isolées et supporter les tunnels permettant l'isolation du trafic lors de son transit sur le coeur de réseau. Ces différentes tâches doivent être accomplies avec aussi peu de ressources matérielles que possible, ces dernières étant tout d'abord destinées aux machines virtuelles. Dans un contexte d'intensification de la demande en haute performance sur les réseaux, les acteurs de l'informatique en nuage ont souvent recours à des équipements matériels spécialisés mais inflexibles, leur permettant d'atteindre les performances requises. Néanmoins, dans cette thèse, nous défendons la possibilité d'améliorer les performances significativement sans avoir recours à de tels équipements. Nous prônons, d'une part, une consolidation des fonctions réseaux au niveau de la couche de virtualisation et, d'autre part, une relocalisation de certaines fonctions réseaux hors des machines virtuelles. À cette fin, nous proposons Oko, un commutateur logiciel extensible qui facilite la consolidation des fonctions réseaux dans la couche de virtualisation. Oko étend les mécanismes de l'état de l'art permettant une mise en cache des règles de commutateurs, ceci afin de permettre une exécution des fonctions réseaux sous forme d'extensions au commutateur. 
De plus, les extensions sont isolées du coeur du commutateur afin d'empêcher des fautes dans les extensions d'impacter le reste du réseau et de faciliter une mise en place rapide et sûre de nouvelles fonctions réseaux. En permettant aux fonctions réseaux de s'exécuter au sein du commutateur logiciel, sans redirections vers des processus distincts, Oko diminue de moitié le coût lié à l'exécution des fonctions réseaux en moyenne. Notre seconde contribution vise à permettre une exécution de certaines fonctions réseaux en amont des machines virtuelles, au sein de la couche de virtualisation. L'exécution de ces fonctions réseaux hors des machines virtuelles permet d'importants gains de performance, mais soulève des problématiques d'isolation. Nous réutilisons et améliorons la technique utilisée dans Oko pour isoler les fonctions réseaux et l'étendons avec un mécanisme de partage équitable du temps CPU entre les différentes fonctions réseaux relocalisées. / Multi-tenant networks enable applications from multiple, isolated tenants to communicate over a shared set of underlying hardware resources. The isolation provided by these networks is enforced at the edge: end hosts demultiplex packets to the appropriate virtual machine, copy data across memory isolation boundaries, and encapsulate packets in tunnels to isolate traffic over the datacenter's physical network. Over the last few years, the growing demand for high performance network interfaces has pressured cloud providers to build more efficient multi-tenant networks. While many turn to specialized, hard-to-upgrade hardware devices to achieve high performance, in this thesis, we argue that significant performance improvements are attainable in end-host multi-tenant networks, using commodity hardware. We advocate for a consolidation of network functions on the host and an offload of specific tenant network functions to the host. 
To that end, we design Oko, an extensible software switch that eases the consolidation of network functions. Oko includes an extended flow caching algorithm to support its runtime extension with limited overhead. Extensions are isolated from the software switch to prevent failures on the path of packets. By avoiding costly redirections to separate processes and virtual machines, Oko halves the running cost of network functions on average. We then design a framework to enable tenants to offload network functions to the host. Executing tenant network functions on the host promises large performance improvements, but raises evident isolation concerns. We extend the technique used in Oko to provide memory isolation and devise a mechanism to fairly share the CPU among offloaded network functions with limited interruptions. Réseau programmable; Informatique des nuages; Traitement des paquets; NFV; SDN; Programmable network; Cloud; Packet processing; NFV; SDN; 004.678 2
80	Une approche pour estimer l'influence dans les réseaux complexes : application au réseau social Twitter / An approach for influence estimatation in complex networks : application to the social network Twitter Azaza, Lobna 23 May 2019 L'étude de l'influence sur les réseaux sociaux et en particulier Twitter est un sujet de recherche intense. La détection des utilisateurs influents dans un réseau est une clé de succès pour parvenir à une diffusion d'information à large échelle et à faible coût, ce qui s'avère très utile dans le marketing ou les campagnes politiques. Dans cette thèse, nous proposons une nouvelle approche qui tient compte de la variété des relations entre utilisateurs afin d'estimer l'influence dans les réseaux sociaux tels que Twitter. Nous modélisons Twitter comme un réseau multiplexe hétérogène où les utilisateurs, les tweets et les objets représentent les noeuds, et les liens modélisent les différentes relations entre eux (par exemple, retweets, mentions et réponses). Le PageRank multiplexe est appliqué aux données issues de deux corpus relatifs au domaine politique pour classer les candidats selon leur influence. Si le classement des candidats reflète la réalité, les scores de PageRank multiplexe sont difficiles à interpréter car ils sont très proches les uns des autres. Ainsi, nous voulons aller au-delà d'une mesure quantitative et nous explorons comment les différentes relations entre les noeuds du réseau peuvent déterminer un degré d'influence pondéré par une estimation de la crédibilité. Nous proposons une approche, TwitBelief, basée sur la règle de combinaison conjonctive de la théorie des fonctions de croyance qui permet de combiner différents types de relations tout en exprimant l’incertitude sur leur importance relative. 
Nous expérimentons TwitBelief sur une grande quantité de données collectées lors des élections européennes de 2014 et de l'élection présidentielle française de 2017 et nous déterminons les candidats les plus influents. Les résultats montrent que notre modèle est suffisamment flexible pour répondre aux besoins des spécialistes en sciences sociales et que l'utilisation de la théorie des fonctions de croyance est pertinente pour traiter des relations multiples. Nous évaluons également l'approche sur l'ensemble de données CLEF RepLab 2014 et montrons que notre approche conduit à des résultats significatifs. Nous proposons aussi deux extensions de TwitBelief traitant le contenu des tweets. La première est l'estimation de la polarisation de l'influence sur le réseau Twitter en utilisant l'analyse des sentiments avec l'algorithme des forêts d'arbres décisionnels. La deuxième extension est la catégorisation des styles de communication dans Twitter : il s'agit de déterminer si le style de communication des utilisateurs de Twitter est informatif, interactif ou équilibré. / Influence in complex networks, and on Twitter in particular, has recently become a hot research topic. Detecting the most influential users makes it possible to diffuse information on a large scale at low cost, which is very useful in marketing or political campaigns. In this thesis, we propose a new approach that considers the variety of relations between users in order to assess influence in complex networks such as Twitter. We model Twitter as a multiplex heterogeneous network where users, tweets and objects are represented by nodes, and links model the different relations between them (e.g., retweets, mentions, and replies). The multiplex PageRank is applied to data from two datasets in the political field to rank candidates according to their influence. 
Even though the candidates' ranking reflects reality, the multiplex PageRank scores are difficult to interpret because they are very close to each other. Thus, we want to go beyond a quantitative measure: we explore what the relations between nodes in the network reveal about influence, and we propose TwitBelief, an approach to assess the weighted influence of a given node. It is based on the conjunctive combination rule from belief function theory, which allows different types of relations to be combined while expressing uncertainty about their importance weights. We evaluate TwitBelief on a large amount of data gathered from Twitter during the 2014 European elections and the 2017 French elections and identify the most influential candidates. The results show that our model is flexible enough to combine multiple interactions according to social scientists' needs or requirements and that the numerical results of the belief theory are accurate. We also evaluate the approach on the CLEF RepLab 2014 data set and show that our approach leads to quite interesting results. We also propose two extensions of TwitBelief that take the content of tweets into account. The first is the estimation of polarized influence in the Twitter network: sentiment analysis of the tweets with the random forest algorithm determines the polarity of the influence. The second extension is the categorization of communication styles on Twitter: it determines whether the communication style of Twitter users is informative, interactive or balanced. Théorie des fonctions de croyance; Réseaux multiplexes; Réseaux sociaux; Twitter; Social networks; Twitter; Complex networks; Belief functions theory; 004.678
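The abstract of record 80 names the conjunctive combination rule of belief function theory as the basis of TwitBelief. As a minimal sketch of that rule in its standard unnormalized form, m(A) = Σ m1(B)·m2(C) over all B ∩ C = A, the Python below combines two mass functions. The frame of discernment, the two evidence sources and every mass value are illustrative assumptions and are not taken from the thesis, which does not state here how masses are assigned.

```python
from itertools import product


def conjunctive_combine(m1, m2):
    """Unnormalized conjunctive rule: m(A) = sum of m1(B) * m2(C) over B & C == A.

    Each mass function is a dict mapping a frozenset (focal element) to its mass.
    """
    combined = {}
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c  # intersection of the two focal elements
        combined[a] = combined.get(a, 0.0) + mass_b * mass_c
    return combined


# Hypothetical frame of discernment: a user is influential (I) or not (N).
I = frozenset({"I"})
N = frozenset({"N"})
OMEGA = I | N  # total ignorance: "either one"
EMPTY = frozenset()

# Illustrative masses only (assumed, not from the thesis): one mass function
# per relation type, with the mass on OMEGA expressing uncertainty.
m_retweets = {I: 0.6, OMEGA: 0.4}
m_mentions = {I: 0.3, N: 0.2, OMEGA: 0.5}

m = conjunctive_combine(m_retweets, m_mentions)
for focal, mass in sorted(m.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(focal), round(mass, 2))
```

With these assumed inputs the combined masses are 0.60 on {I}, 0.08 on {N}, 0.20 on the whole frame and 0.12 on the empty set. In the unnormalized rule the mass on the empty set is kept and can be read as the degree of conflict between the two sources.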
