1 |
CheesePi: Delay Characterization through TCP-based Analysis from End-to-End Monitoring
Portelli, Rebecca, January 2016
With increasing access to interconnected IP networks, people demand a faster response time from Internet services. Traffic from web browsing, the second most popular service, is particularly time-sensitive. This demands reliability and a guarantee of delivery with a good quality of service from ISPs. Additionally, the majority of the population does not have the technical background to monitor delay from their home networks themselves, and their ISPs do not have a vantage point from which to monitor and diagnose network problems from the users’ perspective. Hence, the aim of this research was to characterise the “in-protocol” network delay encountered during web browsing from within a LAN. This research presents TCP traffic monitoring performed on a client device, as well as TCP traffic monitoring on both the client-end and the server-end devices, separately observing an automated web client/server communication. This was followed by offline analysis of the captured traces, in which each TCP flow was dissected into handshake, data transfer, and teardown phases. The aim of this extraction was to enable characterisation of network round-trip delay as well as network physical delay, end-host processing delay, web transfer delay, and packet loss as perceived by the end hosts during data transfer. Measuring from both end devices showed that monitoring from both ends of a client/server communication results in a more accurate measurement of the genuine delay encountered when packets traverse the network than measuring from the client end only. Primarily, this was concluded through the ability to distinguish between the pure network delay and the kernel processing delay experienced during the TCP handshake and teardown.
Secondly, it was confirmed that the two RTTs identified in a TCP handshake are not symmetrical and that a TCP teardown RTT takes longer than the TCP handshake RTT within the same TCP flow, since a server must take measures to avoid SYN flooding attacks. Thirdly, by monitoring from both end devices, it was possible to identify routing path asymmetries by comparing the physical one-way delay of a packet on the forward path with the physical delay of a packet on the reverse path. Lastly, by monitoring from both end devices, it is possible to distinguish between a packet that was actually lost and a packet that arrived with a higher delay than its subsequent packet during data transfer. Furthermore, using TCP flows to measure the RTT delay excluding end-host processing gave a better characterisation of the RTT delay than using ICMP traffic.
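The two-ended measurements described in this abstract can be illustrated with a minimal sketch. It is not the CheesePi implementation: the `HandshakeTimes` fields, the function names, and the labels are hypothetical, the timestamps are assumed to come from packet captures at each end, and the one-way delays are only meaningful if the two capture clocks are synchronised.

```python
from dataclasses import dataclass


@dataclass
class HandshakeTimes:
    """Capture timestamps in seconds for one TCP handshake (hypothetical fields)."""
    client_syn_sent: float
    server_syn_seen: float
    server_synack_sent: float
    client_synack_seen: float
    client_ack_sent: float
    server_ack_seen: float


def decompose_handshake(t: HandshakeTimes) -> dict:
    """Split the handshake delay into network and end-host components."""
    # RTT a client-only monitor observes: SYN out, SYN-ACK back.
    client_rtt = t.client_synack_seen - t.client_syn_sent
    # RTT observed at the server: SYN-ACK out, final ACK back.
    server_rtt = t.server_ack_seen - t.server_synack_sent
    # Time each host took to answer; both timestamps share one clock.
    server_processing = t.server_synack_sent - t.server_syn_seen
    client_processing = t.client_ack_sent - t.client_synack_seen
    return {
        "client_rtt": client_rtt,
        "server_rtt": server_rtt,
        "server_processing": server_processing,
        "client_processing": client_processing,
        # Network-only RTTs: remove the remote host's processing time.
        "network_rtt_first": client_rtt - server_processing,
        "network_rtt_second": server_rtt - client_processing,
        # One-way delays: ASSUME synchronised clocks on both hosts.
        "forward_delay": t.server_syn_seen - t.client_syn_sent,
        "reverse_delay": t.client_synack_seen - t.server_synack_sent,
    }


def classify_segments(sent_seqs, received_seqs):
    """Label each sent segment as 'in_order', 'late', or 'lost'.

    sent_seqs is in sending order (sender capture); received_seqs is in
    arrival order (receiver capture). Retransmissions are ignored here.
    """
    arrival = {seq: i for i, seq in enumerate(received_seqs)}
    labels = {}
    earliest_later_arrival = len(received_seqs)
    # Walk backwards so we know when later-sent segments arrived.
    for seq in reversed(sent_seqs):
        if seq not in arrival:
            labels[seq] = "lost"
            continue
        overtaken = arrival[seq] > earliest_later_arrival
        labels[seq] = "late" if overtaken else "in_order"
        earliest_later_arrival = min(earliest_later_arrival, arrival[seq])
    return labels
```

For example, `classify_segments([1, 2, 3, 4], [1, 3, 2])` labels segment 2 as late and segment 4 as lost, a distinction a client-only capture cannot make with certainty.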
|
2 |
Efficient end-to-end monitoring for fault management in distributed systems / La surveillance efficace de bout-à-bout pour la gestion des pannes dans les systèmes distribués
Feng, Dawei, 27 March 2014
In this dissertation, we present our work on fault management in distributed systems, motivated by the monitoring of faults and abrupt changes in large computing systems such as the grid and the cloud. Instead of building complete a priori knowledge of the software and hardware infrastructures, as in conventional detection or diagnosis methods, we propose to use appropriate techniques to perform end-to-end monitoring for such large-scale systems, leaving the inaccessible details of the components involved in a black box.
For the fault monitoring of a distributed system, we first model this probe-based application as a static collaborative prediction (CP) task, and experimentally demonstrate the effectiveness of CP methods by using the max-margin matrix factorization method. We further introduce active learning into the CP framework and show its critical advantage in dealing with highly imbalanced data, which is especially useful for identifying the minority fault class.
We then extend static fault monitoring to the sequential case by proposing the sequential matrix factorization (SMF) method. SMF takes a sequence of partially observed matrices as input and produces predictions using information from both the current and past time windows. Active learning is also applied to SMF so that highly imbalanced data can be handled properly. In addition to the sequential methods, a smoothing action taken on the estimation sequence has been shown to be a practically useful trick for enhancing sequential prediction performance.
Since the stationarity assumption employed in static and sequential fault monitoring becomes unrealistic in the presence of abrupt changes, we propose a semi-supervised online change detection (SSOCD) framework to detect intended changes in time series data. In this way, the static model of the system can be recomputed once an abrupt change is detected. In SSOCD, an unsupervised offline method is proposed to analyze a sample data series. The change points thus detected are used to train a supervised online model, which makes an online decision on whether a change is present in the arriving data sequence. State-of-the-art change detection methods are employed to demonstrate the usefulness of the framework.
All presented work is verified on real-world datasets. Specifically, the fault monitoring experiments are conducted on a dataset collected from the Biomed grid infrastructure within the European Grid Initiative, and the abrupt change detection framework is verified on a dataset concerning the performance change of an online site with a large amount of traffic.
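The abstract mentions that smoothing the estimation sequence improves sequential prediction, but does not say which smoother is used. The sketch below assumes simple exponential smoothing over per-window fault scores for one probe; the function names, `alpha`, and `threshold` are hypothetical choices for illustration.

```python
def smooth_predictions(scores, alpha=0.5):
    """Exponentially smooth a sequence of per-window fault scores.

    alpha close to 1 trusts the newest window; alpha close to 0 trusts
    the history. The first score is passed through unchanged.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    previous = None
    for score in scores:
        if previous is None:
            previous = score
        else:
            previous = alpha * score + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


def to_labels(smoothed, threshold=0.5):
    """Turn smoothed scores into fault (True) / no-fault (False) labels."""
    return [value >= threshold for value in smoothed]
```

With `alpha=0.5`, a single low score between two high ones, such as `[1.0, 0.0, 1.0]`, is pulled up to 0.5 rather than flipping the prediction outright, which is the kind of stabilising effect a smoother is meant to provide.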
|
3 |
Efficient end-to-end monitoring for fault management in distributed systems
Feng, Dawei, 27 March 2014
In this dissertation, we present our work on fault management in distributed systems, motivated by the monitoring of faults and abrupt changes in large computing systems such as the grid and the cloud. Instead of building complete a priori knowledge of the software and hardware infrastructures, as in conventional detection or diagnosis methods, we propose to use appropriate techniques to perform end-to-end monitoring for such large-scale systems, leaving the inaccessible details of the components involved in a black box.
For the fault monitoring of a distributed system, we first model this probe-based application as a static collaborative prediction (CP) task, and experimentally demonstrate the effectiveness of CP methods by using the max-margin matrix factorization method. We further introduce active learning into the CP framework and show its critical advantage in dealing with highly imbalanced data, which is especially useful for identifying the minority fault class.
We then extend static fault monitoring to the sequential case by proposing the sequential matrix factorization (SMF) method. SMF takes a sequence of partially observed matrices as input and produces predictions using information from both the current and past time windows. Active learning is also applied to SMF so that highly imbalanced data can be handled properly. In addition to the sequential methods, a smoothing action taken on the estimation sequence has been shown to be a practically useful trick for enhancing sequential prediction performance.
Since the stationarity assumption employed in static and sequential fault monitoring becomes unrealistic in the presence of abrupt changes, we propose a semi-supervised online change detection (SSOCD) framework to detect intended changes in time series data. In this way, the static model of the system can be recomputed once an abrupt change is detected. In SSOCD, an unsupervised offline method is proposed to analyze a sample data series. The change points thus detected are used to train a supervised online model, which makes an online decision on whether a change is present in the arriving data sequence. State-of-the-art change detection methods are employed to demonstrate the usefulness of the framework.
All presented work is verified on real-world datasets. Specifically, the fault monitoring experiments are conducted on a dataset collected from the Biomed grid infrastructure within the European Grid Initiative, and the abrupt change detection framework is verified on a dataset concerning the performance change of an online site with a large amount of traffic.
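To make the idea of an online decision about abrupt change concrete, here is a minimal sketch of a one-sided CUSUM detector, a standard online change detection technique. It is a stand-in for illustration only and is not the SSOCD framework; the function name and the parameters `baseline`, `slack`, and `threshold` are assumptions that would have to be tuned on real data.

```python
def cusum_first_alarm(samples, baseline, slack, threshold):
    """Return the index of the first upward-shift alarm, or None.

    baseline:  expected mean before any change.
    slack:     deviation tolerated per sample before evidence accumulates.
    threshold: accumulated evidence needed to raise an alarm.
    """
    statistic = 0.0
    for index, value in enumerate(samples):
        # Accumulate evidence of an upward shift; never go below zero.
        statistic = max(0.0, statistic + (value - baseline - slack))
        if statistic > threshold:
            return index
    return None
```

In a monitoring loop, an alarm index would mark the point after which the static model is recomputed on fresh data, as the abstract describes.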
|