11

Joint communication and computation resources allocation for cloud-empowered future wireless networks

Oueis, Jessica 12 February 2016
Mobile Edge Cloud brings the cloud closer to mobile users by moving cloud computation from the internet to the mobile edge. We adopt a local mobile edge cloud computing architecture in which small cells are empowered with computational and storage capacities, and mobile users' offloaded computational tasks are executed at these cloud-enabled small cells. We propose the concept of small cell clustering for mobile edge computing, where small cells cooperate to execute offloaded computational tasks. A first contribution of this thesis is the design of a multi-parameter computation offloading decision algorithm, SM-POD. The proposed algorithm consists of a series of low-complexity successive and nested classifications of computational tasks at the mobile side, leading to local computation or to offloading to the cloud. To reach the offloading decision, SM-POD jointly considers computational task, handset, and communication channel parameters. In the second part of this thesis, we tackle the problem of setting up small cell clusters for mobile edge cloud computing, for both the single-user and multi-user cases. The clustering problem is formulated as an optimization problem that jointly determines the computational and communication resource allocation and the distribution of the computational load over the small cells participating in the computation cluster. We also propose a cluster sparsification strategy that trades computation latency for higher system energy efficiency. In the multi-user case, the joint resource allocation problem is not convex. To compute a clustering solution, we propose a convex reformulation of the problem and prove that the two problems are equivalent. With the goal of finding a lower-complexity clustering solution, we propose two heuristic small cell clustering algorithms. The first allocates resources locally at the serving small cells where tasks are received; tasks that cannot be served there are sent to a small cell managing unit (SCM) that sets up computation clusters for their execution. The main idea of this algorithm is task scheduling at both the serving small cells and the SCM for higher resource allocation efficiency. The second heuristic is an iterative approach in which each serving small cell computes its desired cluster without considering the presence of other users in the network and sends its cluster parameters to the SCM. The SCM then checks for any excess resource allocation at the network small cells and reports it to the serving small cells, which redistribute the excess load onto less loaded small cells. In the final part of this thesis, we propose the concept of computation caching for edge cloud computing. With the aim of reducing edge cloud computing latency and energy consumption, we propose caching popular computational tasks to prevent their re-execution. Our contribution here is two-fold: first, we propose a caching algorithm based not only on request popularity but also on computation size, required computational capacity, and small cell connectivity. This algorithm identifies the requests that, if cached and downloaded instead of being re-computed, yield the largest energy and latency savings. Second, we propose a method for setting up a search cluster of small cells for finding a cached copy of a requested computation.
The clustering policy exploits the relationship between task popularity and the probability of a task being cached in order to identify possible locations of the cached copy. The proposed method reduces the search cluster size while guaranteeing a minimum cache hit probability.
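
A rough sketch of how such a multi-parameter offloading decision could be organised is given below. The classes, thresholds, and the simple energy/latency estimates are illustrative assumptions and not the SM-POD classification chain itself; the sketch only shows the kind of task, handset, and channel parameters that the decision jointly considers.

```python
# Hypothetical sketch of a multi-parameter offloading decision in the spirit of
# SM-POD (names, thresholds, and the energy/latency models are illustrative
# assumptions, not the thesis's actual formulation).

from dataclasses import dataclass

@dataclass
class Task:
    input_bits: float        # data to upload (bits)
    cycles: float            # CPU cycles required

@dataclass
class Handset:
    cpu_hz: float            # local CPU speed (cycles/s)
    energy_per_cycle: float  # J/cycle
    tx_power: float          # W
    uplink_bps: float        # current channel rate (bits/s)

def offload_decision(task: Task, ue: Handset, cloud_hz: float,
                     latency_budget: float) -> str:
    """Classify a task as 'local' or 'offload' by comparing rough estimates."""
    # Local execution estimates
    t_local = task.cycles / ue.cpu_hz
    e_local = task.cycles * ue.energy_per_cycle

    # Offloading estimates: upload delay plus remote execution
    t_up = task.input_bits / ue.uplink_bps
    t_offload = t_up + task.cycles / cloud_hz
    e_offload = ue.tx_power * t_up   # energy spent only on transmission

    # Successive checks: feasibility first, then energy
    if t_local <= latency_budget and e_local <= e_offload:
        return "local"
    if t_offload <= latency_budget:
        return "offload"
    return "local" if t_local < t_offload else "offload"

print(offload_decision(Task(2e6, 5e9), Handset(1e9, 1e-9, 0.5, 10e6),
                       cloud_hz=20e9, latency_budget=1.0))
```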
12

Application Server Mobility and 5G Core Network

Symeri, Ali January 2019
With advancements in the mobile network architecture, from the Fourth Generation to the Fifth Generation, a vast number of new use cases become available. Many use cases require cloud-based services, where a service is deployed close to the user. For a user to communicate with a service, it connects to the mobile network base station and the Fifth Generation Core network, and then to the service. When the user changes physical location, the mobile network and the service must apply mobility techniques to prevent tromboned traffic and provide low latency between user and service. When a handover occurs, so that a user's attachment point to the mobile network changes from one base station to another and the User Plane Function changes, the cloud-based service may have to move seamlessly from one cloud to another as well. In this thesis, a service mobility framework is proposed and implemented that enables live migration of services between edge clouds and provides simple RESTful APIs. The evaluation of the framework shows that the proposed implementation adds low delays to the total migration time; the service downtime is also shown to be low in the case of video streaming, with no service interruption.
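
As an illustration of the kind of RESTful control interface such a service mobility framework could expose, the following sketch defines two hypothetical endpoints for starting and polling a live migration between edge clouds. Endpoint paths, payload fields, and the in-memory bookkeeping are assumptions for illustration, not the thesis's actual API.

```python
# Minimal sketch of a RESTful control surface for service migration between edge
# clouds (endpoint names and payload fields are assumptions, not the thesis's API).

from flask import Flask, request, jsonify

app = Flask(__name__)
migrations = {}  # migration_id -> record, kept in memory for the sketch

@app.route("/migrations", methods=["POST"])
def start_migration():
    """Request a live migration of a service between two edge clouds."""
    req = request.get_json(force=True)
    migration_id = len(migrations) + 1
    migrations[migration_id] = {
        "service": req["service_id"],
        "source_edge": req["source_edge"],
        "target_edge": req["target_edge"],
        "state": "in_progress",  # a real framework would drive checkpoint/restore here
    }
    return jsonify({"migration_id": migration_id}), 202

@app.route("/migrations/<int:migration_id>", methods=["GET"])
def migration_status(migration_id):
    """Poll the status of an ongoing migration."""
    return jsonify(migrations.get(migration_id, {"state": "unknown"}))

if __name__ == "__main__":
    app.run(port=8080)
```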
13

Design of an algorithm for edge-node resource orchestration within an Operator Platform

Olander Ålund, Simon January 2022
The future of networking lies in the development of low-latency and reliable networks. This development poses increased demand on the presence of edge-nodes. For a network operator to provide a low-latency edge-node resource, the physical distance from antenna to user needs to be small. This, in turn, requires the network operator to have wide coverage of their physical antennas. An alternative solution is for network operators to share their edge-nodes within a so-called Operator Platform (OP) to reduce the cost of expanding their physical presence. In this project, Design Science Research (DSR) was used to design an artifact, the Master Thesis Orchestrator (MTO), to address the issue of finding and delivering shared edge-node resources between operators. An abstracted model of a realistic scenario was adopted and used to evaluate the performance of the design against a baseline solution. The MTO is a decentralised algorithm using a shared memory cache. The artifact also has a randomised component that is used to control the frequency of shared memory accesses. These design choices were made to improve performance in terms of scalability. A simulation of the artifact and the baseline was conducted using a testbed implemented with Kubernetes/minikube. By assessing the performance on different input sizes (number of edge-nodes), the following performance metrics were gathered: success rate (accuracy), run-time, and amount of data transmitted. The results showed that the MTO produced an average accuracy of 36% (baseline = 96.8%) in terms of successful/failed user requests. The performance regarding run-time and transmitted data varied depending on the outcome of the request. The MTO's worst-case performance occurs for failed matches, leading to performance akin to the baseline's average performance. The best-case performance of the MTO showed run-time improvements compared to the baseline solution. The data were validated through an analysis of variance (ANOVA) test, and the distributions are significantly different from each other (α = 5%). The designed artifact is, however, not better than the baseline solution on all analysed metrics. The designed algorithm is volatile in terms of run-time and accuracy, but resource efficient. The poor accuracy significantly increases the probability that the worst-case performance occurs, resulting in a slow and unreliable solution. Nevertheless, in terms of scalability, the designed artifact shows a less severe growth rate than the baseline.
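
The sketch below illustrates, under stated assumptions, the decentralised matching idea described above: each orchestrator works on a possibly stale local view of edge-node capacities and only refreshes it from the shared cache with some probability, trading accuracy for fewer shared-memory accesses. The data structures, the refresh probability, and the greedy matching rule are illustrative and not the MTO's actual design.

```python
# Illustrative sketch of a decentralised matching step with a shared cache and a
# randomised refresh, loosely following the MTO description (data structures and
# the probability parameter are assumptions for illustration).

import random

shared_cache = {"edge-1": 4, "edge-2": 0, "edge-3": 2}  # node -> free capacity units

class Orchestrator:
    def __init__(self, refresh_prob: float = 0.3):
        self.local_view = dict(shared_cache)  # possibly stale copy
        self.refresh_prob = refresh_prob

    def match(self, demand: int):
        """Find an edge node able to serve `demand` capacity units."""
        # Randomised component: only sometimes pay the cost of a shared-memory read
        if random.random() < self.refresh_prob:
            self.local_view = dict(shared_cache)

        for node, free in self.local_view.items():
            if free >= demand:
                # Commit the allocation to the shared cache (may fail if the view is stale)
                if shared_cache[node] >= demand:
                    shared_cache[node] -= demand
                    return node
        return None  # failed match

orc = Orchestrator()
print(orc.match(2))  # e.g. 'edge-1' on success, None on a failed match
```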
14

Novel neural architectures & algorithms for efficient inference

Kag, Anil 30 August 2023
In the last decade, the machine learning universe embraced deep neural networks (DNNs) wholeheartedly with the advent of neural architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), transformers, etc. These models have empowered many applications, such as ChatGPT, Imagen, etc., and have achieved state-of-the-art (SOTA) performance on many vision, speech, and language modeling tasks. However, SOTA performance comes with various issues, such as large model size, compute-intensive training, increased inference latency, higher working memory, etc. This thesis aims at improving the resource efficiency of neural architectures, i.e., significantly reducing the computational, storage, and energy consumption of a DNN without any significant loss in performance. Towards this goal, we explore novel neural architectures as well as training algorithms that allow low-capacity models to achieve near SOTA performance. We divide this thesis into two dimensions: Efficient Low Complexity Models, and Input Hardness Adaptive Models. Along the first dimension, i.e., Efficient Low Complexity Models, we improve DNN performance by addressing instabilities in the existing architectures and training methods. We propose novel neural architectures inspired by ordinary differential equations (ODEs) to reinforce input signals and attend to salient feature regions. In addition, we show that carefully designed training schemes improve the performance of existing neural networks. We divide this exploration into two parts: (a) Efficient Low Complexity RNNs. We improve RNN resource efficiency by addressing poor gradients, noise amplification, and BPTT training issues. First, we improve RNNs by solving ODEs that eliminate vanishing and exploding gradients during the training. To do so, we present Incremental Recurrent Neural Networks (iRNNs) that keep track of increments in the equilibrium surface. Next, we propose Time Adaptive RNNs that mitigate the noise propagation issue in RNNs by modulating the time constants in the ODE-based transition function. We empirically demonstrate the superiority of ODE-based neural architectures over existing RNNs. Finally, we propose the Forward Propagation Through Time (FPTT) algorithm for training RNNs. We show that FPTT yields significant gains compared to the more conventional Backward Propagation Through Time (BPTT) scheme. (b) Efficient Low Complexity CNNs. Next, we improve CNN architectures by reducing their resource usage. They require greater depth to generate high-level features, resulting in computationally expensive models. We design a novel residual block, the Global layer, that constrains the input and output features by approximately solving partial differential equations (PDEs). It yields better receptive fields than traditional convolutional blocks and thus results in shallower networks. Further, we reduce the model footprint by enforcing a novel inductive bias that formulates the output of a residual block as a spatial interpolation between high-compute anchor pixels and low-compute cheaper pixels. This results in spatially interpolated convolutional blocks (SI-CNNs) that have better compute and performance trade-offs. Finally, we propose an algorithm that enforces various distributional constraints during training in order to achieve better generalization. We refer to this scheme as distributionally constrained learning (DCL).
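
As a toy illustration of the ODE-inspired recurrent updates mentioned above, the sketch below implements an incremental, Euler-like state update in which the hidden state takes a small step toward an equilibrium point. The cell, the step size, and the equilibrium form are simplifying assumptions; the iRNN and Time Adaptive RNN formulations in the thesis differ in detail.

```python
# Toy sketch of an ODE-inspired "incremental" recurrent update, in the spirit of
# the iRNN idea described above (the exact cell and its equilibrium-tracking form
# in the thesis differ; this is only an illustrative assumption).

import numpy as np

def irnn_step(h, x, W, U, b, alpha=0.1):
    """One incremental update: h moves a small step toward an equilibrium point."""
    target = np.tanh(W @ h + U @ x + b)   # equilibrium the state is drawn toward
    return h + alpha * (target - h)       # incremental (Euler-like) update

rng = np.random.default_rng(0)
d_h, d_x = 4, 3
W, U, b = rng.normal(size=(d_h, d_h)), rng.normal(size=(d_h, d_x)), np.zeros(d_h)

h = np.zeros(d_h)
for x in rng.normal(size=(10, d_x)):      # a short input sequence
    h = irnn_step(h, x, W, U, b)
print(h)
```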
In the second dimension, i.e., Input Hardness Adaptive Models, we introduce the notion of the hardness of any input relative to any architecture. In the first dimension, a neural network allocates the same resources, such as compute, storage, and working memory, for all the inputs. It inherently assumes that all examples are equally hard for a model. In this dimension, we challenge this assumption using input hardness as our reasoning that some inputs are relatively easy for a network to predict compared to others. Input hardness enables us to create selective classifiers wherein a low-capacity network handles simple inputs while abstaining from a prediction on the complex inputs. Next, we create hybrid models that route the hard inputs from the low-capacity abstaining network to a high-capacity expert model. We design various architectures that adhere to this hybrid inference style. Further, input hardness enables us to selectively distill the knowledge of a high-capacity model into a low-capacity model by cleverly discarding hard inputs during the distillation procedure. Finally, we conclude this thesis by sketching out various interesting future research directions that emerge as an extension of different ideas explored in this work.
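
The hardness-adaptive idea lends itself to a compact sketch: a low-capacity model answers inputs it is confident about and abstains on the rest, which are routed to a high-capacity expert. The confidence-threshold rule below is an illustrative stand-in for the learned abstention mechanism described in the thesis.

```python
# Hedged sketch of hardness-based hybrid inference: a cheap model answers easy
# inputs and abstains on hard ones, which are routed to a larger expert model
# (the confidence threshold is an illustrative stand-in for a learned abstainer).

import numpy as np

def hybrid_predict(x, cheap_model, expert_model, threshold=0.8):
    """Route x to the expert only when the cheap model is not confident enough."""
    probs = cheap_model(x)
    if probs.max() >= threshold:          # "easy" input: accept the cheap prediction
        return int(probs.argmax()), "cheap"
    return int(expert_model(x).argmax()), "expert"

# Dummy stand-ins for trained models
cheap = lambda x: np.array([0.9, 0.1]) if x.sum() > 0 else np.array([0.55, 0.45])
expert = lambda x: np.array([0.2, 0.8])

print(hybrid_predict(np.array([1.0, 2.0]), cheap, expert))    # (0, 'cheap')
print(hybrid_predict(np.array([-1.0, -2.0]), cheap, expert))  # (1, 'expert')
```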
15

Quality of Service Aware Mechanisms for (Re)Configuring Data Stream Processing Applications on Highly Distributed Infrastructure

Da Silva Veith, Alexandre 23 September 2019
Much of today's big data is most valuable when analysed quickly, as it is generated. Under several emerging application scenarios, such as smart cities, operational monitoring of large infrastructure, and the Internet of Things (IoT), continuous data streams must be processed under very short delays. In multiple domains, there is a need for processing data streams to detect patterns, identify failures, and gain insights. Data is often gathered and analysed by Data Stream Processing Engines (DSPEs). A DSPE commonly structures an application as a directed graph or dataflow. A dataflow has one or multiple sources (i.e., gateways or actuators); operators that perform transformations on the data (e.g., filtering); and sinks (i.e., queries that consume or store the data). Most complex operator transformations store information about previously received data as new data is streamed in. Also, a dataflow has stateless operators that consider only the current data.
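
A minimal sketch of such a dataflow, assuming a toy stream and generic operator names, is shown below: a source emits tuples, a stateless filter and a stateful windowed aggregation transform them, and a sink consumes the results. It is only meant to make the source/operator/sink structure concrete, not to reflect any particular DSPE.

```python
# Minimal sketch of a dataflow: a source feeds operators that transform tuples,
# and a sink consumes the results (operator names and the toy stream are
# illustrative assumptions, not tied to any particular DSPE).

def source():
    for reading in [3, 18, 7, 25, 11]:   # e.g. sensor values streamed in
        yield reading

def filter_op(stream, threshold=10):     # stateless operator: looks at one tuple
    for x in stream:
        if x > threshold:
            yield x

def windowed_sum(stream, size=2):        # stateful operator: remembers past tuples
    window = []
    for x in stream:
        window.append(x)
        if len(window) == size:
            yield sum(window)
            window.clear()

def sink(stream):
    for result in stream:
        print("result:", result)

sink(windowed_sum(filter_op(source())))
```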
Traditionally, Data Stream Processing (DSP) applications were conceived to run in clusters of homogeneous resources or on the cloud. In a cloud deployment, the whole application is placed on a single cloud provider to benefit from virtually unlimited resources. This approach allows for elastic DSP applications with the ability to allocate additional resources or release idle capacity on demand during runtime to match the application requirements. We introduce a set of strategies to place operators onto cloud and edge resources while considering the characteristics of those resources and meeting the requirements of applications. In particular, we first decompose the application graph by identifying behaviours such as forks and joins, and then dynamically split the dataflow graph across edge and cloud. Comprehensive simulations and a real testbed considering multiple application settings demonstrate that our approach can improve the end-to-end latency by over 50%, as well as other QoS metrics. The solution search space for operator reassignment can be enormous depending on the number of operators, streams, resources, and network links. Moreover, it is important to minimise the cost of migration while improving latency. Reinforcement Learning (RL) and Monte-Carlo Tree Search (MCTS) have been used to tackle problems with large search spaces and states, performing at human level or better in games such as Go. We model the application reconfiguration problem as a Markov Decision Process (MDP) and investigate the use of RL and MCTS algorithms to devise reconfiguration plans that improve QoS metrics.
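
The reconfiguration-as-MDP view can be made concrete with a small sketch: a state is the current operator placement, an action migrates one operator, and the reward trades the latency improvement against a migration cost. The latency model, the migration penalty, and the one-step greedy policy below are toy assumptions; the thesis applies RL and MCTS to look further ahead.

```python
# Hedged sketch of casting operator reconfiguration as an MDP (latency model,
# migration penalty, and the greedy policy are toy assumptions for illustration).

import itertools

OPERATORS = ["filter", "aggregate", "enrich"]
SITES = ["edge", "cloud"]

def latency(placement):
    """Toy end-to-end latency: per-operator processing plus edge<->cloud hops."""
    proc = {"edge": 30.0, "cloud": 1.0}                    # ms per operator at each site
    hops = sum(1 for a, b in zip(placement, placement[1:]) if a != b)
    return sum(proc[s] for s in placement) + 20.0 * hops   # 20 ms per site change

def step(state, action):
    """Apply action (operator index, new site); reward = latency gain minus migration cost."""
    op_idx, site = action
    nxt = list(state)
    nxt[op_idx] = site
    reward = latency(state) - latency(tuple(nxt)) - 2.0    # constant migration penalty
    return tuple(nxt), reward

state = ("edge", "edge", "edge")
actions = list(itertools.product(range(len(OPERATORS)), SITES))
best = max(actions, key=lambda a: step(state, a)[1])       # one-step greedy choice;
print(best, step(state, best))                             # RL/MCTS would plan ahead
```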
16

DISTRIBUTED MACHINE LEARNING OVER LARGE-SCALE NETWORKS

Frank Lin (16553082) 18 July 2023
The swift emergence and wide-ranging utilization of machine learning (ML) across various industries, including healthcare, transportation, and robotics, have underscored the escalating need for efficient, scalable, and privacy-preserving solutions. Recognizing this, we present an integrated examination of three novel frameworks, each addressing different aspects of distributed learning and privacy issues: Two Timescale Hybrid Federated Learning (TT-HF), Delay-Aware Federated Learning (DFL), and Differential Privacy Hierarchical Federated Learning (DP-HFL). TT-HF introduces a semi-decentralized architecture that combines device-to-server and device-to-device (D2D) communications. Devices execute multiple stochastic gradient descent iterations on their datasets and sporadically synchronize model parameters via D2D communications. A unique adaptive control algorithm optimizes step size, D2D communication rounds, and global aggregation period to minimize network resource utilization and achieve a sublinear convergence rate. TT-HF outperforms conventional FL approaches in terms of model accuracy, energy consumption, and resilience against outages. DFL focuses on enhancing distributed ML training efficiency by accounting for communication delays between edge and cloud. It also uses multiple stochastic gradient descent iterations and periodically consolidates model parameters via edge servers. The adaptive control algorithm for DFL mitigates energy consumption and edge-to-cloud latency, resulting in faster global model convergence, reduced resource consumption, and robustness against delays. Lastly, DP-HFL is introduced to combat privacy vulnerabilities in FL. Merging the benefits of FL and Hierarchical Differential Privacy (HDP), DP-HFL significantly reduces the need for differential privacy noise while maintaining model performance, exhibiting an optimal privacy-performance trade-off. Theoretical analysis under both convex and nonconvex loss functions confirms DP-HFL’s effectiveness regarding convergence speed, privacy performance trade-off, and potential performance enhancement with appropriate network configuration. In sum, the study thoroughly explores TT-HF, DFL, and DP-HFL, and their unique solutions to distributed learning challenges such as efficiency, latency, and privacy concerns. These advanced FL frameworks have considerable potential to further enable effective, efficient, and secure distributed learning.
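
A toy sketch of the two-timescale behaviour attributed to TT-HF is given below: devices take local stochastic gradient steps, periodically average within device-to-device (D2D) clusters, and less frequently aggregate globally. The model, data, cluster structure, and period lengths are illustrative assumptions rather than the framework's actual control algorithm.

```python
# Toy sketch of the two-timescale idea behind TT-HF: devices take local SGD steps,
# periodically average within D2D clusters, and less often aggregate globally at a
# server (model, data, and period lengths are illustrative assumptions).

import numpy as np

rng = np.random.default_rng(1)
clusters = [[0, 1], [2, 3]]                  # device indices grouped by D2D reachability
w = [rng.normal(size=2) for _ in range(4)]   # one linear-model weight vector per device
data = [(rng.normal(size=(20, 2)), rng.normal(size=20)) for _ in range(4)]

def local_sgd(wi, X, y, lr=0.05):
    grad = 2 * X.T @ (X @ wi - y) / len(y)   # least-squares gradient
    return wi - lr * grad

for t in range(1, 61):
    w = [local_sgd(w[i], *data[i]) for i in range(4)]     # fast timescale: local steps
    if t % 5 == 0:                                        # D2D consensus rounds
        for c in clusters:
            avg = np.mean([w[i] for i in c], axis=0)
            for i in c:
                w[i] = avg
    if t % 20 == 0:                                       # slow timescale: global aggregation
        w = [np.mean(w, axis=0)] * 4

print(w[0])
```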
