Global ETD Search

81	Managing high data availability in dynamic distributed derived data management system (D4M) under Churn Mirza, Ahmed Kamal January 2012 (has links) The popularity of decentralized systems is increasing day by day. These decentralized systems are preferable to centralized systems for many reasons, specifically they are more reliable and more resource efficient. Decentralized systems are more effective in the area of information management in the case when the data is distributed across multiple peers and maintained in a synchronized manner. This data synchronization is the main requirement for information management systems deployed in a decentralized environment, especially when data/information is needed for monitoring purposes or some dependent data artifacts rely upon this data. In order to ensure a consistent and cohesive synchronization of dependent/derived data in a decentralized environment, a dependency management system is needed. In a dependency management system, when one chunk of data relies on another piece of data, the resulting derived data artifacts can use a decentralized systems approach but must consider several critical issues, such as how the system behaves if any peer goes down, how the dependent data can be recalculated, and how the data which was stored on a failed peer can be recovered. In case of a churn (resulting from failing peers), how does the system adapt the transmission of data artifacts with respect to their access patterns and how does the system provide consistency management? The major focus of this thesis was to addresses the churn behavior issues and to suggest and evaluate potential solutions while ensuring a load balanced network, within the scope of a dependency information management system running in a decentralized network. Additionally, in peer-to-peer (P2P) algorithms, it is a very common assumption that all peers in the network have similar resources and capacities which is not true in real world networks. The peer‟s characteristics can be quite different in actual P2P systems; as the peers may differ in available bandwidth, CPU load, available storage space, stability, etc. As a consequence, peers having low capacities are forced to handle the same computational load which the high capacity peers handle, resulting in poor overall system performance. In order to handle this situation, the concept of utility based replication is introduced in this thesis to avoid the assumption of peer equality, enabling efficient operation even in heterogeneous environments where the peers have different configurations. In addition, the proposed protocol assures a load balanced network while meeting the requirement for high data availability, thus keeping the distributed dependent data consistent and cohesive across the network. Furthermore, an implementation and evaluation in the PeerfactSim.KOM P2P simulator of an integrated dependency management framework, D4M, was done. In order to benchmark the implementation of proposed protocol, the performance and fairness tests were examined. A conclusion is that the proposed solution adds little overhead to the management of the data availability in a distributed data management systems despite using a heterogeneous P2P environment. Additionally, the results show that the various P2P clusters can be introduced in the network based on peer‟s capabilities. / Populariteten av decentraliserade system ökar varje dag. Dessa decentraliserade system är att föredra framför centraliserade system för många anledningar, speciellt de är mer säkra och mer resurseffektiv. Decentraliserade system är mer effektiva inom informationshantering i fall när data delas ut över flera Peers och underhållas på ett synkroniserat sätt. Dessa data synkronisering är huvudkravet för informationshantering som utplacerade i en decentraliserad miljö, särskilt när data / information behövs för att kontrollera eller några beroende artefakter uppgifter lita på dessa data. För att säkerställa en konsistent och härstammar synkronisering av beroende / härledd data i en decentraliserad miljö, är ett beroende ledningssystem behövs. I ett beroende ledningssystem, när en bit av data som beror på en annan bit av data, kan de resulterande erhållna uppgifterna artefakter använd decentraliserad system approach, men måste tänka på flera viktiga frågor, såsom hur systemet fungerar om någon peer går ner, hur beroende data kan omräknas, och hur de data som lagrats på en felaktig peer kan återvinnas. I fall av churn (på grund av brist Peers), hur systemet anpassar sändning av data artefakter med avseende på deras tillgång mönster och hur systemet ger konsistens förvaltning? Den viktigaste fokus för denna avhandling var att behandlas churn beteende frågor och föreslå och bedöma möjliga lösningar samtidigt som en belastning välbalanserat nätverk, inom ramen för ett beroende information management system som kör i ett decentraliserade nätverket. Dessutom, i peerto- peer (P2P) algoritmer, är det en mycket vanlig uppfattning att alla Peers i nätverket har liknande resurser och kapacitet vilket inte är sant i verkliga nätverk. Peer egenskaper kan vara ganska olika i verkliga P2P system, som de Peers kan skilja sig tillgänglig bandbredd, CPU tillgängligt lagringsutrymme, stabilitet, etc. Som en följd, är peers har låg kapacitet tvingade att hantera sammaberäkningsbelastningen som har hög kapacitet peer hanterar vilket resulterar i dåligsystemets totala prestanda. För att hantera den här situationen, är begreppet verktygetbaserad replikering införs i denna uppsats att undvika antagandet om peer jämlikhet, så att effektiv drift även i heterogena miljöer där Peers har olika konfigurationer. Dessutom säkerställer det föreslagna protokollet en belastning välbalanserat nätverk med iakttagande kraven på hög tillgänglighet och därför hålla distribuerade beroende datakonsekvent och kohesiv över nätverket. Vidare ett genomförande och utvärdering iPeerfactSim.KOM P2P simulatorn av en integrerad beroende förvaltningsram, D4M, var gjort[.] De prestandatester och tester rättvisa undersöktes för att riktmärka genomförandet avföreslagna protokollet. En slutsats är att den föreslagna lösningen tillagt lite overhead för förvaltningen av tillgången till uppgifterna inom ett distribuerade system för datahantering, trots med användning av en heterogen P2P miljö. Dessutom visar resultaten att de olikaP2P-kluster kan införas i nätverket baserat på peer-möjligheter. heterogeneous peer-to-peer networks dependency management framework p2p Replication protocol p2p churn Computer Systems Datorsystem
82	Identificación del patrón de características del cliente Prime desertor de tarjeta de crédito del Banco BBVA Perú aplicando la metodología de la Ciencia de Datos / Identification of the pattern for Prime Credit Card Defector from BBVA Bank Peru. Applying the methodology of Data Science Huapaya Chura, Yaxira Sharajean, Velasquez Morales, Álvaro Gonzalo 10 December 2019 (has links) El presente trabajo tiene como objetivo encontrar el patrón de características del cliente Premium desertor de tarjetas de crédito, tomando como foco principal la oficina Chacarilla del banco BBVA puesto que, ayudará a identificar al cliente desertor usuario de tarjetas de crédito y, además podrá ser usada para mejorar la gestión del cliente y personalizar los productos según comportamiento. La metodología aplicada se basa en la ciencia de datos, tomando en cuenta diversos estudios de pronósticos de deserción, para luego correlacionar y analizar el conjunto de datos utilizado para este caso, que comprende 1174 datos. Así mismo, se valida las correlaciones e impacto significativo a la agencia para poder quedarnos con 217 clientes desertores, que pertenecen a una categoría premium. Así mismo, cabe mencionar que la deserción y fuga de usuarios de tarjetas de crédito incide con mayor frecuencia, a comparación de otros productos en todas las entidades financieras del Perú puesto que, las entidades bancarias ofrecen a los clientes mejores tasas y beneficios cada mes. / The purpose of this work is to find the pattern of characteristics of the Premium customer credit card defector, with the main focus of the Chacarilla office of the BBVA bank since, to identify the customer defending customer credit card user and, in addition, it can easily be to improve customer management and customize products based on behaviour. The methodology applied is based on data science, taking into account various studies of attrition analysis, to then correlate and analyse the set of data used for this case, which comprises 1174 of data. Likewise, the correlations and the significant impact on the agency are validated to be able to keep 217 defending clients, who belong to a premium category. Likewise, it is worth mentioning that the defection and leakage of credit card users affects more frequently, a comparison of other products in all financial entities of Peru since, banking entities offer customers better rates and benefits every month. / Trabajo de investigación Ciencia de datos Tarjetas de crédito Deserción de clientes Bancos Data science Credit cards Customer churn Banks
83	Predicting Customer Churn in E-commerce Using Statistical Modeling and Feature Importance Analysis : A Comparison of Random Forest and Logistic Regression Approaches Rudälv, Amanda January 2023 (has links) While operating in online markets offers opportunities for expanded assortment and convenience, it also poses challenges such as increased competition and the need to build personal relationships with customers. Customer retention be- comes crucial in maintaining a successful business, emphasizing the importance of understanding customer behavior. Traditionally, customer behavior analysis has focused on transactional behavior, such as purchase frequency and spending amounts. However, there has been a shift towards non-transactional behavior, driven by the popularity of loyalty programs that reward customers beyond trans- actions and aim to make customers feel appreciated and included, regardless of their spending power. This study is conducted at a global retailer with the aim of enhancing the under- standing of how non-transactional customer behavior influences customer churn. The approach in this study is to understand such behavior by developing a statis- tical model and to analyze statistical approaches of feature importance. Two types of approaches for statistical modeling, each with four variations, are assessed: (1) Random forest; and (2) Logistic regression. Furthermore, three different feature importance methods are considered; (1) Gini importance; (2) Permutation impor- tance and (3) Coefficient importance. The results showed that this approach can be used to analyze customer behavior and gain a better understanding of the driving factors for churn. Furthermore, the results showed that random forest approaches outperform logistic regression. With the definition of churn constructed in this study, the most important factors that affect the probability of churn are the customer’s number of sessions and inter session interval. / Att bedriva e-handel erbjuder inte enbart möjligheter för utökat sortiment och bekvämlighet, utan leder även till ökad konkurrens och ett ökat behov av att bygga relationer med kunder. Kundlojalitet är därmed avgörande för att upprätthålla en framgångsrik verksamhet, och betonar vikten av att förstå kundernas beteende. Traditionellt har analyser av kundbeteende främst bedrivits med fokus på transak- tionellt beteende, såsom frekvens eller totalbelopp för köp. På senare tid har allt mer fokus lagts på icke-transaktionellt beteende, på grund av införandet av lo- jalitetsprogram som belönar kunder bortom transaktioner, med målet att kunder ska känna sig uppskattade och inkluderade, oavsett köpkraft. Denna studie genomförs hos ett globalt detaljhandelsföretag med målet att utöka förståelsen för hur icke-transaktionellt kundbeteende påverkar kundbortfall. För att uppnå detta konstrueras en statistisk modell som utnyttjas för att med hjälp av statistiska metoder analysera signifikans hos variabler. Två kategorier av statis- tiska modeller undersöks; (1) Random forest och (2) Logistisk regression. Utöver detta används tre olika metoder för att analysera signifikans hos variabler; (1) Gini-betydelse; (2) Permutationsbetydelse; och (3) Koefficientbetydelse. Resultatet visar att studiens tillvägagångssätt kan användas för att analysera kund- beteende och nå ökad förståelse för vad som driver kundbortfall. Vidare visar re- sultatet att random forest-modeller överträffar modeller baserade på logistisk re- gression. Baserat på den definition av kundbortfall som definierats i denna studie är de viktigaste faktorerna som påverkar sannolikheten för kundbortfall, kundens antal sessioner och intervallet mellan kundens sessioner. Customer behavior E-commerce Churn prediction Statistical model Machine learning Random forest Logistic regression Feature importance Kundbeteende E-handel Kundbortfall Statistisk modell Maskininlärning Random forest Logistisk regression Variabelsignifikans Mathematics Matematik
84	Predicting and Explaining Customer Churn for an Audio/e-book Subscription Service using Statistical Analysis and Machine Learning / Prediktion och förklaring av kundbortfall för en prenumerationstjänst för ljud- och e-böcker med användning av statistik analys och maskininlärning Barr, Kajsa, Pettersson, Hampus January 2019 (has links) The current technology shift has contributed to increased consumption of media and entertainment through various mobile devices, and especially through subscription based services. Storytel is a company offering a subscription based streaming service for audio and e-books, and has grown rapidly in the last couple of years. However, when operating in a competitive market, it is of great importance to understand the behavior and demands of the customer base. It has been shown that it is more profitable to retain existing customers than to acquire new ones, which is why a large focus should be directed towards preventing customers from leaving the service, that is preventing customer churn. One way to cope with this problem is by applying statistical analysis and machine learning in order to identify patterns and customer behavior in data. In this thesis, the models logistic regression and random forest are used with an aim to both predict and explain churn in early stages of a customer's subscription. The models are tested together with the feature selection methods Elastic Net, RFE and PCA, as well as with the oversampling method SMOTE. One main finding is that the best predictive model is obtained by using random forest together with RFE, producing a prediction score of 0.2427 and a recall score of 0.7699. The other main finding is that the explanatory model is given by logistic regression together with Elastic Net, where significant regression coefficient estimates can be used to explain patterns associated with churn and give useful findings from a business perspective. / Det pågående teknologiskiftet har bidragit till en ökad konsumtion av digital media och underhållning via olika typer av mobila enheter, t.ex. smarttelefoner. Storytel är ett företag som erbjuder en prenumerationstjänst för ljud- och e-böcker och har haft en kraftig tillväxt de senaste åren. När företag befinner sig i en konkurrensutsatt marknad är det av stor vikt att förstå sig på kunders beteende samt vilka krav och önskemål kunder har på tjänsten. Det har nämligen visat sig vara mer lönsamt att behålla existerande kunder i tjänsten än hela tiden värva nya, och det är därför viktigt att se till att en befintlig kund inte avslutar sin prenumeration. Ett sätt att hantera detta är genom att använda statistisk analys och maskininlärningsmetoder för att identifiera mönster och beteenden i data. I denna uppsats används både logistisk regression och random forest med syfte att både prediktera och förklara uppsägning av tjänsten i ett tidigt stadie av en kunds prenumeration. Modellerna testas tillsammans med variabelselektionsmetoderna Elastic Net, RFE och PCA, samt tillsammans med översamplingsmetoden SMOTE. Resultatet blev att random forest tillsammans med RFE bäst predikterade uppsägning av tjänsten med 0.2427 i måttet precision och 0.7699 i måttet recall. Ett annat viktigt resultat är att den förklarande modellen ges av logistisk regression tillsammans med Elastic Net, där signifikanta estimat av regressionskoefficienterna ökar förklaringsgraden för beteenden och mönster relaterade till kunders uppsägning av tjänsten. Därmed ges användbara insikter ur ett företagsperspektiv. Statistics Machine learning customer churn random forest logistic regression Statistik Maskininlärning random forest logistisk regression kundbortfall Probability Theory and Statistics Sannolikhetsteori och statistik
85	Customer acquisition and onboarding at an online grocery company Borg, Ida January 2022 (has links) The master thesis is carried out in a collaboration with a Swedish online grocery company. The goal of the thesis is to investigate if it is possible to explain the underlying factors that affect new customers to be retained. Because of the difficulties of defining churn and retention in non-contractual settings, most of the literature is focused on contractual and subscription settings. There are a limited number of studies when trying to predict customer churn in non-contractual businesses and even fewer studies that emphasize retention. This thesis aims to contribute to the field of retention in non-contractual business and also highlight the assumptions and drawbacks of churn-related task. To achieve the goal of the thesis a literature review is carried out together with two statistical learning approaches; logistic regression model and extreme gradient boosting model. The results shows that it is possible to find the underlying factors that drive customers to be retained. The greatest drivers that could increase the probability of retaining new customers are the days between the first and second order, the second order value, and the total order value. / Examensarbetet är genomfört som ett samarbete med ett svenskt matvaruföretag på nätet. Målet med examensarbetet är att undersöka om det är möjligt att förklara de bakomliggande faktorer som påverkar nya kunder att stanna kvar som kunder. På grund av svårigheterna med att definiera kundbortfall och bibehållande av kunder i icke-kontraktuella affärer fokuserar den mesta av litteraturen på avtals- och prenumerationsmiljöer. Det finns ett begränsat antal studier där man försöker förutsäga kundbortfall i icke-kontraktuella verksamheter och ännu färre studier som fokuserar på bibehållande av kunder. Denna uppsats syftar till att bidra till området bibehållande av kunder i icke-kontraktuella affärer och även belysa antagandena och nackdelarna med analyser inom kundbortfall. För att uppnå målet med avhandlingen genomförs en litteraturgenomgång tillsammans med två statistiska lärandemetoder; logistisk regressionsmodell och extreme gradient boosting model. Resultaten visar att det är fullt möjligt att hitta de bakomliggande faktorerna som driver kunderna att stanna kvar. De största drivkrafterna som kan öka sannolikheten för att kunder ska bibehållas är dagarna mellan första och andra ordern, andra ordervärdet och det totala ordervärdet. retention churn customer acquisition customer onboarding logistic regression extreme gradient boosting model bibehållande av kunder kundbortfall kundförvärv kundonboarding logistisk regression exteme gradient boosting model Mathematics Matematik
86	Predicting user churn using temporal information : Early detection of churning users with machine learning using log-level data from a MedTech application / Förutsägning av användaravhopp med tidsinformation : Tidig identifiering av avhoppande användare med maskininlärning utifrån systemloggar från en medicinteknisk produkt Marcus, Love January 2023 (has links) User retention is a critical aspect of any business or service. Churn is the continuous loss of active users. A low churn rate enables companies to focus more resources on providing better services in contrast to recruiting new users. Current published research on predicting user churn disregards time of day and time variability of events and actions by feature selection or data preprocessing. This thesis empirically investigates the practical benefits of including accurate temporal information for binary prediction of user churn by training a set of Machine Learning (ML) classifiers on differently prepared data. One data preparation approach was based on temporally sorted logs (log-level data set), and the other on stacked aggregations (aggregated data set) with additional engineered temporal features. The additional temporal features included information about relative time, time of day, and temporal variability. The inclusion of the temporal information was evaluated by training and evaluating the classifiers with the different features on a real-world dataset from a MedTech application. Artificial Neural Networks (ANNs), Random Forrests (RFs), Decision Trees (DTs) and naïve approaches were applied and benchmarked. The classifiers were compared with among others the Area Under the Receiver Operating Characteristics Curve (AUC), Positive Predictive Value (PPV) and True Positive Rate (TPR) (a.k.a. precision and recall). The PPV scores the classifiers by their accuracy among the positively labeled class, the TPR measures the recognized proportion of the positive class, and the AUC is a metric of general performance. The results demonstrate a statistically significant value of including time variation features overall and particularly that the classifiers performed better on the log-level data set. An ANN trained on temporally sorted logs performs best followed by a RF on the same data set. / Bevarande av användare är en kritisk aspekt för alla företag eller tjänsteleverantörer. Ett lågt användarbortfall gör det möjligt för företag att fokusera mer resurser på att tillhandahålla bättre tjänster istället för att rekrytera nya användare. Tidigare publicerad forskning om att förutsäga användarbortfall bortser från tid på dygnet och tidsvariationer för loggad användaraktivitet genom val av förbehandlingsmetoder eller variabelselektion. Den här avhandlingen undersöker empiriskt de praktiska fördelarna med att inkludera information om tidsvariabler innefattande tid på dygnet och tidsvariation för binär förutsägelse av användarbortfall genom att träna klassificerare på data förbehandlat på olika sätt. Två förbehandlingsmetoder används, en baserad på tidssorterade loggar (loggnivå) och den andra på packade aggregeringar (aggregerat) utökad med framtagna tidsvariabler. Inklusionen av tidsvariablerna utvärderades genom att träna och utvärdera en uppsättning MLklassificerare med de olika tidsvariablerna på en verklig datamängd från en digital medicinskteknisk produkt. ANNs, RFs, DTs och naiva tillvägagångssätt tillämpades och jämfördes på den aggregerade datamängden med och utan tidsvariationsvariablerna och på datamängden på loggnivå. Klassificerarna jämfördes med bland annat AUC, PPV och TPR. PPV betygsätter algoritmerna efter träffsäkerhet bland den positivt märkta klassen och TPR utvärderar hur stor del av den positiva klassen som identifierats medan AUC är ett mått av klassificerarnas allmänna prestanda. Resultaten visar ett betydande värde av att inkludera tidsvariationsvariablerna överlag och i synnerhet att klassificerarna presterade bättre på datauppsättningen på loggnivå. Ett ANN tränad på tidssorterade loggar presterar bäst följt av en RF på samma datamängd. User churn Customer attrition Artificial neural networks Log-level analysis Random forests Decision trees Användarbortfall Kundbortfall Artificiella neurala nätverk logganalys Slumpskogar Beslutsträd Computer and Information Sciences Data- och informationsvetenskap
87	BUILDING ORACLES FOR ROBUST ALGORITHM DESIGN Foreback, Dianne R. 16 July 2015 (has links) No description available. Computer Science failure detectors oracles Omega self-stabilization churn fault-tolerance distributed algorithms distributed systems peer-to-peer systems asynchronous systems
88	An?lise dos indicadores de qualidade versus taxa de abandono utilizando m?todo de regress?o m?ltipla para servi?o de banda larga Fernandes Neto, Andr? Pedro 20 June 2008 (has links) Made available in DSpace on 2014-12-17T14:52:36Z (GMT). No. of bitstreams: 1 AndrePFN.pdf: 1525936 bytes, checksum: edb576494fd35f42e78d512df4fc02df (MD5) Previous issue date: 2008-06-20 / Telecommunication is one of the most dynamic and strategic areas in the world. Many technological innovations has modified the way information is exchanged. Information and knowledge are now shared in networks. Broadband Internet is the new way of sharing contents and information. This dissertation deals with performance indicators related to maintenance services of telecommunications networks and uses models of multivariate regression to estimate churn, which is the loss of customers to other companies. In a competitive environment, telecommunications companies have devised strategies to minimize the loss of customers. Loosing customers presents a higher cost than obtaining new ones. Corporations have plenty of data stored in a diversity of databases. Usually the data are not explored properly. This work uses the Knowledge Discovery in Databases (KDD) to establish rules and new models to explain how churn, as a dependent variable, are related to a diversity of service indicators, such as time to deploy the service (in hours), time to repair (in hours), and so on. Extraction of meaningful knowledge is, in many cases, a challenge. Models were tested and statistically analyzed. The work also shows results that allows the analysis and identification of which quality services indicators influence the churn. Actions are also proposed to solve, at least in part, this problem / A ?rea de telecomunica??es ? uma das mais estrat?gicas e din?micas do mundo atual. Esse fato se deve a in?meras inova??es tecnol?gicas que afetaram a forma como as informa??es trafegam. O conhecimento deixou de ser percebido como um ac?mulo linear, l?gico e cronol?gico de informa??es e passou a ser visto como uma constru??o em rede, consequentemente a massifica??o da Internet banda larga em alta velocidade teve grande influ?ncia sobre esse fen?meno. Essa disserta??o aborda um estudo sobre medi??o de desempenho e servi?os de manuten??o em telecomunica??es, com o uso de ferramentas de descoberta de conhecimento em base de dados (KDD). Objetiva-se transformar informa??es, armazenadas nas bases de dados de uma grande empresa de telecomunica??es do pa?s, em conhecimento ?til. A metodologia de pesquisa utilizada focou no uso de an?lise de regress?o m?ltipla como ferramenta para estimar a taxa de abandono de clientes em servi?os de Internet de banda larga, como vari?vel dependente, e indicadores de qualidade de servi?o como vari?veis independentes. Modelos foram testados e analisados estatisticamente. O trabalho apresenta resultados que permitem analisar e identificar quais os indicadores de qualidade que exercem maior influ?ncia na taxa de abandono dos clientes. S?o propostas sugest?es que possam ser aplicadas para melhoria de qualidade do servi?o percebido e consequentemente diminui??es das perdas com a taxa de abandono &#65279 Medi??o de Desempenho Qualidade de Servi?o Taxa de Abandono (Churn) An?lise de Regress?o Multivariada Performance measurement Quality of Services Knowledge Discovery in Databases Churn Models of Multivariate Regression Decision Support Systems
89	Customer churn prediction in a slow fashion e-commerce context : An analysis of the effect of static data in customer churn prediction Colasanti, Luca January 2023 (has links) Survival analysis is a subfield of statistics where the goal is to analyse and model the data where the outcome is the time until the occurrence of an event of interest. Because of the intrinsic temporal nature of the analysis, the employment of more recently developed sequential models (Recurrent Neural Network (RNN) and Long Short Term Memory (LSTM)) has been paired with the use of dynamic temporal features, in contrast with the past reliance on static ones. Such an abrupt shift of policy has left open the challenge of understanding how those two kinds of features influence the predictive capabilities of models. This thesis aims at assessing the effect of combining static and dynamic features on the most commonly used models in survival analysis. In doing so, we compare the error measurements of such models with dataset composed of purely dynamic features or a combination of static and dynamic ones. Empirical measurements have shown that models respond differently to the addition of static features to the analysis, with more complex, sequential models like the LSTM struggling to deal with the added data complexity (with a 12% increase in error), while non sequential models see reductions of up to 14.7% in error. The thesis also includes a clusterization task aimed at aiding the interpretation of survival analysis outcomes. / Överlevnadsanalys är ett delområde inom statistiken där målet är att analysera och modellera data där utfallet är tiden fram till dess att en händelse av intresse inträffar. På grund av analysens inneboende tidsmässiga karaktär har användningen av mer nyligen utvecklade sekventiella modeller (RNN och LSTM) kombinerats med användningen av dynamiska tidsmässiga egenskaper, i motsats till den tidigare förlitningen på statiska sådana. En sådan drastisk förändring av ansatsen har lämnat öppet för utmaningen att förstå hur dessa två typer av egenskaper påverkar modellernas förutsägande förmåga. Syftet med denna uppsats är att bedöma effekten av att kombinera statiska och dynamiska egenskaper på de vanligaste modellerna för överlevnadsanalys. I detta syfte jämför vi felmätningar av sådana modeller med dataset som består av rent dynamiska egenskaper eller en kombination av statiska och dynamiska egenskaper. Empiriska mätningar har visat att modellerna reagerar olika på tillägget av statiska egenskaper till analysen, där mer komplexa, sekventiella modeller som LSTM kämpar för att hantera den ökade datakomplexiteten (med en ökning av felet med 12 %), medan icke-sekventiella modeller ser en minskning av felet med upp till 14,7 %. Uppsatsen innehåller också en klusteruppgift som syftar till att underlätta tolkningen av resultaten av överlevnadsanalyser. / L’analisi della sopravvivenza è una branca della statistica il cui obiettivo è l’analisi e la modellazione di dati il cui risultato è il tempo che intercorre fino al verificarsi di un evento di interesse. A causa dell’intrinseca natura temporale dell’analisi, l’impiego di modelli sequenziali di più recente sviluppo (RNN e LSTM) è stato abbinato all’uso di attributi temporali dinamici, a differenza dell’uso più diffuso in passato di attributi statici. Questo brusco cambiamento ha lasciato aperta la sfida di capire come questi due tipi di attributi influenzino le capacità predittive dei modelli. Questa tesi si propone di valutare l’effetto della combinazione di attributi statici e dinamici sui modelli più comunemente utilizzati nell’analisi della sopravvivenza. A tal fine, confrontiamo le misure di errore di tali modelli con set di dati composti da attributi puramente dinamici o da una combinazione di statici e dinamici. I risultati empirici hanno mostrato che i modelli rispondono in modo diverso all’aggiunta di attrbiuti statici, con i modelli sequenziali più complessi, come l’LSTM, che faticano a gestire la complessità dei dati aggiunti (con un aumento dell’errore del 12%), mentre i modelli non sequenziali registrano riduzioni dell’errore fino al 14,7%. La tesi comprende anche una clusterizzazione volta a facilitare l’interpretazione dei risultati dell’analisi di sopravvivenza. Survival Analysis Time To Event prediction Churn retention Machine Learning Deep Learning Customer Clustering E-commerce Analisi di sopravvivenza Previsione del tempo a evento Ritenzione dall’abbandono dei clienti Apprendimento automatico Apprendimento profondo Segmentazione della clientela Commercio elettronico Överlevnadsanalys Tid till händelseförutsägelse Churn Prediction Maskininlärning Djuplärning Kundkluster E-handel Computer and Information Sciences Data- och informationsvetenskap
90	Cambiamento organizzativo e modificazione del network / ORGANIZATIONAL CHANGE AND PATTERN OF NETWORK CHURN GIORGIO, LUCA 01 April 2019 (has links) La tesi ha l’obiettivo di analizzare il cambiamento organizzativo in una prospettiva di social network analysis, sfruttando dati longitudinali raccolti a seguito della modifica della struttura organizzativa in un Policlinico Universitario italiano. Il manoscritto è organizzativo in tre paper. Il primo paper si focalizza sul tema del rapporto tra network formali e network informali, analizzando come la modifica del primo comporti una corrispondente variazione nel secondo. Il paper dimostra come, in assenza di strutture organizzative ben formalizzate, gli individui tendono ad allacciare nuovi legami con colleghi che appartengono alla stessa specializzazione. Il secondo paper, invece, attingendo prettamente alla letteratura di comportamento organizzativo, analizza il tema della dinamicità del network, fornendo evidenze in relazione alla stabilità del network stesso a seguito del cambiamento. Particolare attenzione, è inoltre, dedicata alle dinamiche intra – team e al ruolo di quest’ultime nell’accettazione o meno del cambiamento. Infine, il terzo paper sviluppa il tema della network density e di come quest’ultima possa essere correlato al cambiamento organizzativo, in termini di reazione al cambiamento. Inoltre, si dimostra come la formalizzazione abbia un impatto positivo sulla densità del network, specie in contesti organizzativi caratterizzati da una bassa gerarchia e coordinamento orizzontale. / This thesis aims to analyze organizational change in a social network analysis perspective, exploiting longitudinal data collected after a modification of the organizational structure in an Italian Teaching Hospital The manuscript is organized into three papers. The first paper focuses on the theme of the relationship between formal networks and informal networks, analyzing how the modification of the first involves a corresponding variation in the second. The paper demonstrates how, in the absence of formalized organizational structures, individuals tend to establish new ties with colleagues who belong to the same specialization. The second paper, drawing purely from the organizational behavior literature, analyzes the issue of the network dynamics , providing evidence and antecedents for network stability in response to organizational change. Particular attention is also given to the intra - team dynamics and the impact of individual perception of collective properties in driving employees in accepting or not the organizational change. Finally, the third paper develops the theme of network density and how the latter can be related to organizational change, in terms of reaction to change. Furthermore, it is shown how formalization has a positive impact on network density, especially in organizational contexts characterized by a low hierarchy and horizontal coordination. SECS-P/07: ECONOMIA AZIENDALE SECS-P/10: ORGANIZZAZIONE AZIENDALE

Search results