31 |
Differences and similarities in work absence behavior: empirical evidence from micro data. Nilsson, Maria, January 2005
This thesis consists of three self-contained essays about absenteeism.

Essay I analyzes whether the design of the insurance system affects work absence, i.e. the classic insurance problem of moral hazard. Several reforms of the sickness insurance system were implemented during the period 1991-1996. Using negative binomial models with fixed effects, the analysis shows that both workers and employers changed their behavior in response to the reforms. We also find that the extent of moral hazard varies with work contract structures. The reforms reducing compensation levels decreased workers' absence, both the number of absent days and the number of absence spells. The 1992 reform, which introduced sick pay paid by employers, also decreased absence levels, which can probably be explained by changes in personnel policy such as increased monitoring and screening of workers.

Essay II examines the background to gender differences in work absence. As in many earlier studies, women are found to have higher absence levels than men. Our analysis, using finite mixture models, reveals a group of women, comprising about 41% of the women in our sample, with a high average demand for absence. Among men, the high-demand group is smaller, consisting of about 36% of the male sample. Absence behavior differs as much between groups within each gender as it does between men and women. Access to panel data covering the period 1971-1991 enables an analysis of the widening gender gap over time. Our analysis shows that the increased gender gap can be attributed to changes in behavior rather than in observable characteristics.

Essay III analyzes the difference in work absence between natives and immigrants. Immigrants are found to have higher absence than natives when measured as the number of absent days; for the number of absence spells, the patterns are about the same. The analysis, using panel data and count data models, shows that natives and immigrants have different characteristics concerning family situation, work conditions and health. We also find that natives and immigrants respond differently to these characteristics; for example, their absence is related differently to both economic incentives and the work environment. Finally, our analysis shows that differences in work conditions and work environment can explain only a minor part of the ethnic differences in absence during the 1980s.
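As a small illustration of the count models the essays rely on, the sketch below evaluates a negative binomial probability mass function, whose variance exceeds its mean (overdispersion), which is the usual reason for preferring it to a Poisson model for absence-day counts. The parameter values are our own illustrative assumptions, not estimates from the thesis.

```python
from math import comb, isclose

def negbin_pmf(k, r, p):
    """P(K = k) for a negative binomial with dispersion r and probability p.
    Mean = r*p/(1-p); variance = r*p/(1-p)**2 exceeds the mean, so the
    distribution accommodates the overdispersion typical of absence counts."""
    return comb(k + r - 1, k) * (1 - p) ** r * p ** k

# Illustrative parameters (assumptions, not estimates from the thesis):
r, p = 2, 0.8
mean = r * p / (1 - p)                 # expected number of absent days
variance = r * p / (1 - p) ** 2        # larger than the mean: overdispersion
total = sum(negbin_pmf(k, r, p) for k in range(200))  # mass sums to ~1
```

With these parameters the expected count is 8 days while the variance is 40, the kind of mean-variance gap that makes a fixed-effects negative binomial a natural choice over Poisson.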
|
32 |
Bootstrap for panel data models with an application to the evaluation of public policies. Hounkannounon, Bertrand G. B., 08 1900
The purpose of this thesis is to develop bootstrap methods for panel data models and to prove their validity. Panel data refers to data sets where observations on individual units (such as households, firms or countries) are available over several time periods; the availability of two dimensions (cross-section and time series) allows for the identification of effects that could not be accounted for otherwise. We explore the use of the bootstrap to obtain estimates of the distribution of statistics that are more accurate than the usual asymptotic theory, or to make inference possible at all in the presence of nuisance parameters. The method consists in drawing many random samples that resemble the sample as much as possible and estimating the distribution of the object of interest over these random samples. It has been shown, both theoretically and in simulations, that in many instances this approach improves on asymptotic approximations: the resulting tests have a rejection rate close to the nominal size under the null hypothesis, and the resulting confidence intervals have a probability of including the true value of the parameter that is close to the desired level.

In the literature there are applications of the bootstrap to panel data, but they are carried out without rigorous theoretical justification or under strong assumptions. This thesis suggests a bootstrap method better suited to panel data, which we call double resampling, analyzes its validity, and implements it in the analysis of treatment effects. The aim is to provide reliable inference without strong assumptions on the underlying data-generating process.

The first chapter considers a model with a single parameter (the overall expectation) with the sample mean as estimator. We show that our double resampling, which draws in both the individual and the time dimension, is valid for panel data models with cross-sectional and/or temporal heterogeneity. The assumptions cover one-way and two-way error component models as well as the factor models that have become popular with large panels. By contrast, resampling only in the individual dimension is not valid in the presence of temporal heterogeneity, and resampling only in the time dimension is not valid in the presence of individual heterogeneity.

The second chapter extends the first to the panel linear regression model. Three kinds of regressors are considered: individual characteristics, temporal characteristics, and regressors varying across both periods and cross-sectional units. Using a two-way error component model, the ordinary least squares estimator and the residual bootstrap, we show that resampling in the individual dimension alone is valid only for inference on the coefficients of regressors that vary only across individuals, and resampling in the time dimension alone only for the subvector of parameters associated with regressors that vary only over time. Double resampling is valid for inference on the entire parameter vector, under general types of time-series and cross-sectional dependence.

The third chapter re-examines the difference-in-differences exercise of Bertrand, Duflo and Mullainathan (2004), an estimator commonly used to evaluate the impact of public policies. The empirical application uses panel data from the Current Population Survey on the wages of women in the 50 US states from 1979 to 1999. Placebo laws are generated at the state level, and by construction the tests should conclude that these placebo policies have no effect on wages. Bertrand, Duflo and Mullainathan (2004) show that neglecting heterogeneity and temporal dependence leads to important size distortions, and hence to spurious findings of an effect of the placebo laws; one of the solutions they advocate is the bootstrap. The double resampling method developed in this thesis corrects these size distortions very well and therefore gives a reliable evaluation of public policies.
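The double resampling idea can be sketched in a few lines: draw individuals and time periods independently with replacement and recompute the statistic on each pseudo-panel. This is a simplified illustration of the scheme, not the thesis's exact estimator, and the data-generating process below is made up.

```python
import random
random.seed(0)

# Toy balanced panel: y[i][t] for N individuals over T periods, with
# individual effects, time effects and idiosyncratic noise (all invented).
N, T = 30, 10
y = [[i * 0.1 + t * 0.05 + random.gauss(0, 1) for t in range(T)] for i in range(N)]

def double_resample_mean(y, reps=500):
    """Bootstrap distribution of the panel mean under 'double resampling':
    individuals and time periods are drawn independently with replacement,
    so both cross-sectional and temporal heterogeneity feed the variance."""
    N, T = len(y), len(y[0])
    stats = []
    for _ in range(reps):
        ii = [random.randrange(N) for _ in range(N)]   # resampled individuals
        tt = [random.randrange(T) for _ in range(T)]   # resampled periods
        stats.append(sum(y[i][t] for i in ii for t in tt) / (N * T))
    return stats

boot = double_resample_mean(y)
grand_mean = sum(sum(row) for row in y) / (N * T)
```

The spread of `boot` then yields standard errors or percentile confidence intervals that, per the thesis's results, remain valid under heterogeneity in either dimension, unlike resampling individuals or periods alone.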
|
33 |
Concept-Oriented Model and Nested Partially Ordered Sets. Savinov, Alexandr, 24 April 2014
The concept-oriented model of data (COM) has recently been defined syntactically by means of the concept-oriented query language (COQL). In this paper we propose a formal embodiment of this model, called nested partially ordered sets (nested posets), and demonstrate how it is connected with its syntactic counterpart. A nested poset is a novel formal construct that can be viewed either as a nested set with a partial order relation established on its elements or as a conventional poset whose elements can themselves be posets. An element of a nested poset is defined as a couple consisting of one identity tuple and one entity tuple. We formally define the main operations on nested posets and demonstrate their usefulness in solving typical data management and analysis tasks such as logical navigation, constraint propagation, inference and multidimensional analysis.
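A toy rendering of the central construct may help: elements are couples of an identity tuple and an entity tuple, and the order is the reflexive-transitive closure of explicit cover pairs. All names below are our own illustrative choices, not COQL syntax.

```python
# Elements as (identity tuple) -> (entity tuple); the identity plays the
# role of a reference, the entity carries the attribute values.
elements = {
    ("dept:1",): ("Sales",),
    ("emp:1",): ("Alice", 3000),
    ("emp:2",): ("Bob", 2800),
}

# Cover pairs (a, b) meaning a < b, e.g. an employee is "less than"
# (belongs to) its department.
covers = {(("emp:1",), ("dept:1",)), (("emp:2",), ("dept:1",))}

def leq(a, b, covers):
    """a <= b in the reflexive-transitive closure of the cover relation,
    computed by breadth-first search upward from a."""
    if a == b:
        return True
    seen, frontier = set(), {a}
    while frontier:
        nxt = {y for (x, y) in covers if x in frontier} - seen
        if b in nxt:
            return True
        seen |= nxt
        frontier = nxt
    return False
```

On this miniature poset, `leq` answers the membership-style queries (which greater elements does this element roll up to?) that underlie logical navigation and multidimensional analysis in the paper.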
|
34 |
A modelagem de documentos estruturados multimídia integrando sistemas de hipertextos e ODA/ODIF / The modeling of multimedia structured documents integrating hypertext systems and ODA/ODIF. Perez, Celso Roberto, January 1994
There is a great number of applications that need to manipulate documents. Such manipulation requires managing them through the tasks of creation, storage, retrieval and transmission. These tasks must take into account characteristics inherent to documents, such as the logical structure, the presentation structure, and the hyperstructure formed by the internal and external references in the documents involved. Multimedia establishes new requirements for structured-document management systems: graphics, sound and images contain information that enriches the traditional textual content of documents and can potentially be exploited by users when formulating queries and searching for documents. The complexity of applications that manipulate structured multimedia documents demands the support of models capable of expressing semantically richer characteristics. Such models must therefore allow the modeling of the logical structure, the presentation structure and the hyperstructure. The adoption of a conceptual document model is a determining factor in the possibilities offered for querying and retrieving documents. For the specification and definition of such a model, two possibilities were considered: i) hypertext systems, whose emphasis is on providing a structured body of objects with links connecting related objects, a structure designed specifically to help readers navigate through the information; ii) the ODA/ODIF electronic document standard, which emphasizes the composition and control of document form and the layout-structure-content division of documents, but lacks specific treatment of hypertextual characteristics. This work considers that integrating these two philosophies allows, in a natural way, the modeling of multimedia structured documents.

Research and proposals combining these two options have been scarce, and no work of this kind aimed at the management of multimedia structured documents is known. As a result of this integration, the present work defines and specifies the OHypA (Office HyperDocument Architecture) meta-model, which can be considered an extension of the ODA/ODIF standard. This meta-model has a real and practical application in combining hypermedia technology with the ODA representation of documents. Since the present work addresses the modeling of multimedia structured documents across the areas described, it also permits the study and outline of possible solutions to common problems arising from the integration of those areas. Finally, two approaches in full development are integrated, resulting in an object-oriented meta-model that can easily be integrated into an object-oriented database system.
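The integration the abstract describes can be caricatured in a few lines: a node type that carries both an ODA-style logical tree and hypertext links that cut across it. Class and field names are illustrative assumptions, not part of OHypA.

```python
class DocNode:
    """A document node carrying two of the structures the thesis models:
    a logical structure (the tree of children) and a hyperstructure
    (links that may point anywhere, across the tree)."""

    def __init__(self, kind, content=None):
        self.kind = kind          # e.g. "document", "section", "paragraph"
        self.content = content    # text, image reference, etc.
        self.children = []        # logical structure: a tree
        self.links = []           # hyperstructure: cross references

    def add(self, child):
        self.children.append(child)
        return child

# A tiny document: two sections, with a hypertext link from a paragraph
# in the first section to the second section.
doc = DocNode("document")
intro = doc.add(DocNode("section"))
p1 = intro.add(DocNode("paragraph", "See the appendix."))
appendix = doc.add(DocNode("section"))
p1.links.append(appendix)         # a link cutting across the logical tree
```

The point of the sketch is that the tree alone (ODA's strength) cannot express the `p1 -> appendix` reference, while the links alone (hypertext's strength) lose the compositional structure; a combined model needs both fields.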
|
36 |
Essays in international finance and banking. Nahhas, Abdulkader, January 2016
In this thesis, financial movements are considered in terms of foreign direct investment (FDI) and, relatedly, international banking. Chapter 2 analyses FDI for the major G7 economies. Chapter 3 extends this to bilateral FDI (BFDI) data for a broader group of economies, with the gravity model as the main mode of analysis. Gravity models are then used in Chapter 4 to analyse bilateral cross-border lending in a similar way, while exchange rate effects are handled in terms of volatility measured with models of conditional variance. The analysis of the bilateral data pays attention to the breakdown of crises across the whole period, with further consideration of the Euro zone in the study of BFDI and cross-border lending.

The initial study looks at the determinants of inflows and outflows of FDI stocks in the G7 economies for the period 1980-2011. A number of factors, such as research and development (R&D), openness and relative costs, are shown to be important, but the main focus is on the impact of real and nominal effective exchange rate volatility, where volatility is measured using a generalised autoregressive conditional heteroscedasticity (GARCH) model of the variance. Although the impact of volatility is theoretically ambiguous, inflows are generally negatively affected by increased volatility, whilst there is some evidence that outflows increase when volatility rises.

In Chapter 3, the effect of bilateral exchange rate volatility is analysed using BFDI stocks from 14 high-income countries to all the OECD countries over the period 1995-2012, using annual panel data with a gravity model. The empirical analysis applies the generalised method of moments (GMM) estimator to a gravity model of BFDI stocks. The findings imply that exports, GDP and distance are key variables, as the gravity model predicts. The study considers how the East Asian crisis, the global financial crisis and systemic banking crises have affected BFDI: these effects vary by the type and origin of the crisis but are generally negative, and a high degree of exchange rate volatility discourages BFDI.

Chapter 4 considers the determinants of cross-border banking activity from 19 advanced countries to the European Union (EU) over the period 1999-2014. Bilateral country-level stock data on cross-border lending are examined. The data allow us to analyse the effect of financial crises, differentiated by type (systemic banking crises, the global financial crisis, the Euro debt crisis and the Lehman Brothers crisis), on the geography of cross-border lending. The problem is analysed using quarterly panel data with a gravity model; the empirical gravity model, conditioned on distance and on size measured by GDP, is a benchmark for explaining the volume of cross-border banking activity. In addition to investigating the impact of crises, a further comparison investigates the impact of European integration on cross-border banking between member states. These results are robust to various econometric methodologies, samples and institutional characteristics.
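The volatility measure used throughout follows the standard GARCH(1,1) recursion, which can be sketched as follows. The parameter values are illustrative assumptions rather than estimates from the thesis.

```python
import random
random.seed(1)

def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance recursion of a GARCH(1,1):
        sigma2_t = omega + alpha * e_{t-1}^2 + beta * sigma2_{t-1},
    the workhorse model for measuring exchange rate volatility.
    Requires alpha + beta < 1 for a finite unconditional variance."""
    sigma2 = [omega / (1 - alpha - beta)]  # start at the unconditional variance
    for e in returns[:-1]:
        sigma2.append(omega + alpha * e * e + beta * sigma2[-1])
    return sigma2

# Made-up "exchange rate returns" standing in for the real data.
rets = [random.gauss(0, 1) for _ in range(100)]
sig2 = garch11_variance(rets, omega=0.1, alpha=0.1, beta=0.8)
```

A volatility series like `sig2` (or its average over a year) is the kind of regressor the thesis enters into the FDI and gravity equations to test whether volatility deters investment.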
|
37 |
Rozpočty obcí v ČR – ekonometrická analýza s využitím panelových dát / Municipal budgets in the Czech Republic: an econometric panel data analysis. Zvariková, Alexandra, January 2017
This paper analyses panel data on 198 Czech municipalities for the period 2003-2015. The aim is to identify the determinants of municipalities' tax revenue budgeting errors using static panel data models with fixed and random effects. Czech municipalities tend to underestimate both total and tax revenues: on average, budgeted tax revenues are about 7% lower than collected revenues over the period under examination. Such behaviour could make the budgeting process less transparent. The results indicate that the structure of tax revenues also plays a role in explaining forecast errors, and the analysis further shows the impact of the electoral cycle and of macroeconomic variables on budget deviations.
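The dependent variable here reduces to a simple relative deviation of budgeted from collected revenue. The sketch below mirrors the reported roughly 7% underestimation with made-up numbers.

```python
def budget_error(budgeted, collected):
    """Relative tax-revenue budgeting error; negative values mean the
    municipality underestimated (budgeted less than it collected)."""
    return (budgeted - collected) / collected

# Invented budgeted/collected pairs for three years (not real data):
pairs = [(93.0, 100.0), (95.0, 100.0), (98.0, 100.0)]
errors = [budget_error(b, c) for b, c in pairs]
avg_error = sum(errors) / len(errors)   # systematically negative
```

In the paper, a panel of such errors (one per municipality-year) is regressed on revenue structure, electoral-cycle and macroeconomic variables under fixed and random effects.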
|
39 |
Unleashing Profitability: Unraveling the Labor-R&D Nexus in SaaS Tech Firms : An Analysis of the Profitability Dynamics in SaaS Tech Firms through Stochastic Frontier. Atla, Prashant; Salman, Noräs, January 2023
Background: High-tech's rapid growth and prioritization of expansion over profitability can lead to vulnerability in economic downturns. The SaaS market, part of the high-tech industry, offers affordable and flexible software solutions but is also susceptible to market volatility. To succeed, SaaS startups must strike a balance between growth and profitability. Stochastic frontier analysis can measure technical efficiency and productivity in the SaaS market, offering insights into resource and labor utilization. We present an empirical study of the factors that influence a firm's profitability, aiming to inform decision-making for SaaS companies.

Purpose: Our work centers on gaining a comprehensive understanding of the Software-as-a-Service (SaaS) market and the role of labor and research and development expenses, and on exploring how these factors influence a firm's profitability. The study addresses this gap by conducting an empirical analysis of the distribution of technical efficiency among SaaS firms, providing insights into resource and labor utilization and its effect on profitability. The research questions focus on the relationship of technical efficiency, labor utilization and production functions to profitability.

Methodology: We utilized Model I, a Cobb-Douglas panel data regression with fixed effects; Model II, a Cobb-Douglas panel data stochastic frontier analysis following Kumbhakar and Lovell (1990); and Model III, a transcendental logarithmic (translog) Cobb-Douglas panel data stochastic frontier analysis, also following Kumbhakar and Lovell (1990). These models allowed us to measure the technical efficiency of SaaS firms and examine the interplay between variables such as employee count and R&D expenses, with liabilities and assets as control variables.

Results and analysis: The three models revealed that labor, assets and R&D expenses positively and significantly affect profitability in SaaS firms. Two of the models also exhibit decreasing returns to scale, suggesting that increasing all inputs proportionally leads to a less-than-proportional increase in output, while the third exhibits increasing returns to scale. Top performers in technical efficiency tend to have higher marginal product of labor (MPL) values than bottom performers.

Conclusions: Technical efficiency is positively correlated with profitability, indicating that more efficient SaaS firms achieve higher profitability levels; the relationship is stronger under the translog model than under the Cobb-Douglas model. The factors contributing most to profitability in SaaS firms are the number of employees and assets, followed by research and development expenses.

Recommendations for future research: Further studies could explore the extent to which factors such as workforce quality, technology and business processes affect MPL and technical efficiency in SaaS firms. Future research could also investigate the effects of market competition, firm size and industry regulation on profitability in the SaaS industry, as well as the potential benefits of diversifying investment portfolios to include SaaS stocks, given the significant impact of labor, assets and R&D expenses on profitability.
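The returns-to-scale finding can be illustrated with a bare Cobb-Douglas technology: when the output elasticities sum to less than one, doubling every input less than doubles output. The elasticities below are illustrative assumptions, not the thesis's SFA estimates.

```python
from math import isclose

def cobb_douglas(labor, rnd, assets, A=1.0, bl=0.5, br=0.2, ba=0.2):
    """Output under a Cobb-Douglas technology in labor, R&D and assets.
    A is total factor productivity; bl, br, ba are output elasticities."""
    return A * labor**bl * rnd**br * assets**ba

def returns_to_scale(bl, br, ba):
    """Sum of elasticities: < 1 decreasing, = 1 constant, > 1 increasing."""
    return bl + br + ba

y1 = cobb_douglas(100, 10, 50)
y2 = cobb_douglas(200, 20, 100)   # double every input
# With elasticities summing to 0.9, y2/y1 = 2**0.9, about 1.87 < 2.
```

The stochastic frontier versions in the thesis add an inefficiency term to this production function, so that a firm's distance below the frontier becomes its (in)efficiency score.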
|
40 |
EFFICIENT LSM SECONDARY INDEXING FOR UPDATE-INTENSIVE WORKLOADS. Jaewoo Shin (17069089), 29 September 2023
In recent years, massive amounts of data have been generated by various types of devices and services. For these data, update-intensive workloads, where items update their status periodically and continuously, are common. The Log-Structured Merge (LSM, for short) tree is a widely used indexing technique in various systems: index structures buffer insert operations in a memory layer and flush them to disk when the data size in memory exceeds a threshold. Despite its notable ability to handle write-intensive (i.e., insert-intensive) workloads, LSM suffers from degraded query performance due to inefficient index maintenance of secondary keys under update-intensive workloads.

This dissertation focuses on the efficient support of update-intensive workloads for LSM-based indexes. First, the focus is on the optimization of LSM secondary-key indexes and their support for update-intensive workloads. A mechanism is introduced that enables the LSM R-tree to handle update-intensive workloads efficiently. The new LSM indexing structure is termed the LSM RUM-tree, an LSM R-tree with an Update Memo. The key insight is to reduce the maintenance cost of the LSM R-tree by leveraging an additional in-memory memo structure, while controlling the size of the memo so that it fits in memory. In the experiments, the LSM RUM-tree achieves up to 9.6x speedup on update operations and up to 2400x speedup on query operations.

Second, several significant advancements are offered in the context of the LSM RUM-tree. We provide an extended examination of LSM-aware Update Memo (UM) cleaning strategies, elucidating how effectively each strategy reduces UM size and contributes to performance enhancements. Moreover, recognizing the need to support concurrent activity within the LSM RUM-tree, particularly in multi-threaded/multi-core environments, we introduce concurrency control for the Update Memo. A novel atomic operation, Compare and If Less than Swap (CILS), is introduced to enable seamless concurrent operations on the Update Memo. Experimental results attest to a notable 4.5x improvement in the speed of concurrent update operations compared to existing and baseline implementations.

Finally, we present a novel technique designed to improve query processing performance and optimize storage management in any secondary LSM tree. The proposed approach introduces a new framework and mechanisms that address the specific challenges of secondary indexing in LSM trees, especially in the context of the secondary LSM B+-tree (LSM BUM-tree). Experimental results show that the LSM BUM-tree achieves up to 5.1x speedup on update-intensive workloads and 107x speedup on mixed update and query workloads over existing LSM B+-tree implementations.
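The CILS operation described above can be sketched as an atomic "replace only if the stored value is smaller" primitive, the natural fit for an Update Memo keeping the latest timestamp per key. Python lacks a hardware compare-and-swap, so a lock stands in for atomicity; this is a sketch of the semantics, not the dissertation's implementation.

```python
import threading

class UpdateMemoEntry:
    """One Update Memo slot. cils(new_ts) atomically installs new_ts only
    if the stored timestamp is less, so concurrent updaters can race
    without ever moving the entry backwards in time."""

    def __init__(self, ts=0):
        self._ts = ts
        self._lock = threading.Lock()  # stand-in for an atomic instruction

    def cils(self, new_ts):
        """Compare and If Less than Swap: True if the swap happened."""
        with self._lock:
            if self._ts < new_ts:
                self._ts = new_ts
                return True
            return False

    @property
    def ts(self):
        return self._ts

entry = UpdateMemoEntry(5)
```

Under this semantics, stale concurrent updates (smaller timestamps) are rejected without retry loops, which is what makes the operation attractive for multi-threaded memo maintenance.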
|