1 |
Efficient Index Maintenance for Text Databases
Lester, Nicholas, nml@cs.rmit.edu.au January 2006
All practical text search systems use inverted indexes to quickly resolve user queries. Offline index construction algorithms, where queries are not accepted during construction, have been the subject of much prior research. As a result, current techniques can invert virtually unlimited amounts of text in limited main memory, making efficient use of both time and disk space. However, these algorithms assume that the collection does not change during the use of the index. This thesis examines the task of index maintenance, the problem of adapting an inverted index to reflect changes in the collection it describes. Existing approaches to index maintenance are discussed, including proposed optimisations. We present analysis and empirical evidence suggesting that existing maintenance algorithms either scale poorly to large collections, or significantly degrade query resolution speed. In addition, we propose a new strategy for index maintenance that trades a strictly controlled amount of querying efficiency for greatly increased maintenance speed and scalability. Analysis and empirical results are presented that show that this new algorithm is a useful trade-off between indexing and querying efficiency. In scenarios described in Chapter 7, the use of the new maintenance algorithm reduces the time required to construct an index to under one sixth of the time taken by algorithms that maintain contiguous inverted lists. In addition to work on index maintenance, we present a new technique for accumulator pruning during ranked query evaluation, as well as providing evidence that existing approaches are unsatisfactory for collections of large size. Accumulator pruning is a key problem in both querying efficiency and overall text search system efficiency. Existing approaches either fail to bound the memory footprint required for query evaluation, or suffer loss of retrieval accuracy. 
In contrast, the new pruning algorithm can be used to limit the memory footprint of ranked query evaluation, and in our experiments gives retrieval accuracy not worse than previous alternatives. The results presented in this thesis are validated with robust experiments, which utilise collections of significant size, containing real data, and tested using appropriate numbers of real queries. The techniques presented in this thesis allow information retrieval applications to efficiently index and search changing collections, a task that has been historically problematic.
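The maintenance problem described in this abstract can be illustrated with a toy example. The sketch below is not the strategy proposed in the thesis; it only shows the general idea of buffering new postings in memory and merging them into the main index in batches, with queries consulting both structures. The class name, the `buffer_limit` parameter, and the use of a plain dictionary in place of on-disk inverted lists are all assumptions made for illustration.

```python
from collections import defaultdict


class BufferedInvertedIndex:
    """Toy inverted index: new postings are buffered, then merged in batches."""

    def __init__(self, buffer_limit=2):
        self.main = {}                    # term -> sorted doc ids (stands in for on-disk lists)
        self.buffer = defaultdict(list)   # term -> doc ids added since the last merge
        self.buffered_docs = 0
        self.buffer_limit = buffer_limit  # hypothetical threshold, counted in documents
        self.merges = 0

    def add_document(self, doc_id, text):
        # Record each distinct term of the document in the in-memory buffer.
        for term in set(text.lower().split()):
            self.buffer[term].append(doc_id)
        self.buffered_docs += 1
        if self.buffered_docs >= self.buffer_limit:
            self.merge()

    def merge(self):
        # Fold buffered postings into the main lists, keeping each list sorted.
        for term, postings in self.buffer.items():
            merged = set(self.main.get(term, [])) | set(postings)
            self.main[term] = sorted(merged)
        self.buffer.clear()
        self.buffered_docs = 0
        self.merges += 1

    def lookup(self, term):
        # A query must see both merged and not-yet-merged postings.
        term = term.lower()
        found = set(self.main.get(term, [])) | set(self.buffer.get(term, []))
        return sorted(found)


index = BufferedInvertedIndex(buffer_limit=2)
index.add_document(1, "inverted index construction")
index.add_document(2, "index maintenance")
index.add_document(3, "query evaluation")
print(index.lookup("index"))  # [1, 2]
print(index.lookup("query"))  # [3]
```

A larger `buffer_limit` means fewer merges but more unmerged postings for each query to check, which is a small-scale version of the trade-off between maintenance speed and querying efficiency that the abstract discusses.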
|
2 |
Constructing a sophistication index as a method of market segmentation of commercial farming businesses in South Africa
Van Zyl, H.J.D. (Hendrik Jacobus Dion) 30 April 2013
This study investigates the process of index construction as a means of measuring a hypothetical construct that can typically not be measured by a single question or item in a survey study and applying it as a method of market segmentation. The availability of incidental secondary data that were gathered during 2009 provides a relevant quantitative basis to illustrate this process by constructing a commercial farming sophistication index for South Africa. A multi-step approach was followed for the construction of the commercial farming sophistication index, namely: (1) selection of items and definition of variables that are most likely to be indicators of commercial farming sophistication; (2) combining of variables into an index; and (3) segmentation and index validation. Following the investigation and illustration of the process of index construction as a method of market segmentation, it was evident that this approach offers an appropriate and useful means of segmenting a market. Several factors contribute to the appeal of this approach. Amongst others, it contributes towards addressing important priorities in the area of future segmentation research, namely that of investigating the application of new base variables into segmentation models, as well as investigating new segmentation strategies. The approach also applies a creative process of combining several base variables into a single measure, namely that of an index variable. By offering classification rules based on characteristics that can easily be observed or elicited by asking a few key questions, new or potential buyers can be grouped by buying behaviour segment. Furthermore, the multi-step process that was employed has pragmatic appeal for researchers, and provides a systematic and structured multivariate approach to segmentation. It also facilitates replication of the process when conducting future studies.
By using an index, it takes advantage of any intensity structure that may exist among attributes. This has the advantage that it places members of the market on a continuum that can lead to tracking members’ development paths as they progress towards higher levels on the index. Furthermore, illustration of the process has significant application value in other business-to-business markets, locally and internationally, where index variables can be constructed from both primary and secondary sources and used as a method of segmentation, following a multi-step approach similar to the one proposed in this study. Lastly, the outcome of this type of segmentation method offers researchers and marketing practitioners a procedure, in the form of an equation, to calculate index scores and provide rules to segment the market based on predefined intervals. Hence, the challenge to replicate segment formation across independent future studies is addressed. This process is considered an advantage over employing a technique such as cluster analysis, where the use of new data or changes to the clustering algorithm often leads to different segment solutions. / Thesis (PhD)--University of Pretoria, 2012. / Marketing Management / PhD / Unrestricted
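The idea of an equation that yields an index score, followed by rules that assign a segment from predefined intervals, can be sketched as follows. This is a minimal illustration under stated assumptions: the indicator names, the weights, the interval boundaries, and the segment labels are all hypothetical and are not taken from the study.

```python
from bisect import bisect_right

# Hypothetical indicator weights; each indicator is coded 0 (absent) or 1 (present).
WEIGHTS = {
    "uses_irrigation": 30.0,
    "keeps_financial_records": 30.0,
    "uses_precision_tools": 40.0,
}

# Hypothetical predefined intervals: [0, 30) low, [30, 70) medium, [70, 100] high.
BOUNDARIES = [30.0, 70.0]
LABELS = ["low", "medium", "high"]


def index_score(indicators):
    """Combine several indicator variables into one index value (weighted sum)."""
    return sum(WEIGHTS[name] * indicators[name] for name in WEIGHTS)


def segment(score):
    """Assign a segment label by locating the score among the fixed boundaries."""
    return LABELS[bisect_right(BOUNDARIES, score)]


farm = {"uses_irrigation": 1, "keeps_financial_records": 1, "uses_precision_tools": 0}
score = index_score(farm)
print(score, segment(score))  # 60.0 medium
```

Because the weights and boundaries are fixed in advance, the same respondent always lands in the same segment, which is the replication property the abstract contrasts with cluster analysis.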
|
3 |
Index Construction in Gainsight : A multicriteria decision analysis approach
Bojsza, Emelie January 2024
While a well-built index can measure a complex phenomenon and produce an easy-to-digest output, the construction of an index is vulnerable to errors. Already prominent in a wide range of fields, indices are increasingly leveraged in Customer Success (CS), with all major CS software now offering index construction features. This paper analyzes one such software, Gainsight Customer Success, to explore how it can be used to build an index in line with the constructor’s intentions. Concepts from multicriteria decision analysis (MCDA) illuminate possibilities and pitfalls in executing key steps of index construction in the software: value functions in exploring normalization; the distinction between “importance measures” and “trade-off ratios” in examining the meaning of the weights; the concept of compensability in guiding our aggregation analysis. Finally, the MCDA concept of value trees highlights both weighting and aggregation approaches. We find that the Gainsight user must possess some index construction expertise in order to control normalization, weighting, and aggregation, or even to understand how settings related to these steps affect the total score of an index built in the software. Importantly, neither the meaning of the weights as applied in the tool, nor the level of compensability allowed for in aggregation, are transparent to the user. In examining these questions of how construction choices affect the meaning of an index’s output, this analysis may be consulted for guidance by CS practitioners looking to build useful indices in any software.
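The three construction steps named in this abstract (normalization, weighting, aggregation) and the notion of compensability can be made concrete with a small sketch. It does not reproduce how Gainsight computes scores; the function names, the linear value function, and the two aggregation rules are generic choices assumed here for illustration.

```python
import math


def min_max(value, worst, best):
    """Linear value function: maps a raw measure onto [0, 1]."""
    return (value - worst) / (best - worst)


def weighted_arithmetic(scores, weights):
    """Fully compensatory aggregation: a high score can offset a low one."""
    total = sum(weights)
    return sum(w * s for s, w in zip(scores, weights)) / total


def weighted_geometric(scores, weights):
    """Less compensatory aggregation: a score of zero forces the index to zero."""
    total = sum(weights)
    return math.prod(s ** (w / total) for s, w in zip(scores, weights))


# Two hypothetical measures for one customer, already normalized to [0, 1].
scores = [1.0, 0.0]
weights = [1, 1]
print(weighted_arithmetic(scores, weights))  # 0.5
print(weighted_geometric(scores, weights))   # 0.0
```

With identical inputs and weights, the two rules disagree sharply, which shows why the level of compensability in aggregation changes what an index score means.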
|
4 |
Measuring Tokenomics
Lin, Min-Bin 01 June 2023
Blockchain technology is revolutionizing how people interact through peer-to-peer networks, cryptography, and consensus algorithms. Trustless trust enables secure and transparent transactions without intermediaries. Despite the growing popularity of crypto assets and the associated "tokenomics", the public still lacks extensive knowledge of how this technology works, and much of the discourse remains speculative.
The main objective of this thesis is to examine the fundamental principles of cryptocurrencies (cryptos) and non-fungible tokens (NFTs) and to establish a correlation between the technology and its effects on the economy from a statistical and economic perspective. To this end, Chapters 2 and 3 examine the influence of blockchain technology on the economics and functioning of cryptocurrencies using econometric models and clustering techniques.
Chapter 3 examines crypto economics and blockchain functionality using empirical methods, in particular for coin creators and investors. Using Ethereum as an example, we show that the economic performance of cryptocurrencies can be influenced by the design of their underlying blockchain technology.
Chapter 4 examines the partial correlations of Bitcoin returns across nine centralized exchanges from the perspective of a high-frequency dynamic network. The proposed MHAR-CM provides covariance estimates that account for the particularities of crypto markets. The chapter reveals spillover and third-party risks between these exchanges.
Chapter 5 uses a hedonic valuation method to construct the DAI Digital Art Index based on the NFT art market. A particular focus is on leveling the effects of outliers with a one-step robust regression Huberization and a dynamic conditional score model.
This thesis links new technologies and the economy through statistical modeling and analysis. By providing empirical evidence, we observe how blockchain technology is changing our perception of money, art, and other industries. / The emergence of distributed ledger technologies, such as blockchain, has revolutionized how individuals interact by enabling "trust-less trust" through peer-to-peer networks, cryptography, and consensus algorithms. This technology eliminates intermediaries and provides secure, transparent transaction methods. However, public understanding of this technology, along with "Tokenomics", remains limited, resulting in speculative discourse.
The main objective of this thesis is to investigate the fundamental principles of cryptocurrencies (cryptos) and non-fungible tokens (NFTs) and establish a correlation between the technology and its economic impact from statistical and economic perspectives. To achieve this, Chapters 2 and 3 explore the influence of blockchain technology on the economic and functional performance of cryptos using econometric models and clustering techniques.
Chapter 3 presents an empirical framework that offers insights to coin creators and investors regarding the interplay between cryptonomics, blockchain functionality, and market dynamics. The economic performance of cryptocurrencies, illustrated with Ethereum as an example, is shown to be affected by the design of their underlying blockchain technology.
Chapter 4 examines partial correlations of Bitcoin returns across nine centralized exchanges from a high-frequency dynamic network perspective. The proposed MHAR-CM provides reasonable covariance estimates that account for the unique characteristics of crypto markets. This chapter uncovers spillover risk and counterparty risk among these exchanges.
In Chapter 5, a hedonic regression approach is employed to construct the DAI digital art index for the NFT art market. Special emphasis is given to mitigating the impact of outliers using one-step robust regression Huberization and a dynamic conditional score model. The DAI index enhances our understanding of this emerging art market and facilitates observation of its macroeconomic trends.
This thesis establishes a connection between emerging technologies and the economy through statistical modeling and analysis. By providing empirical evidence, we gain insights into how blockchain technology is transforming our perceptions of money, art, and various industries.
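The outlier handling mentioned for Chapter 5 rests on the general idea of Huber-type down-weighting. The sketch below is not the estimator used in the thesis and contains no hedonic regression; it only shows, for a simple location estimate, how a single reweighting step from a robust starting point limits the influence of an extreme price. The function names and the sample values are assumptions made for illustration; the tuning constant 1.345 and the MAD scaling factor 0.6745 are conventional choices for Huber weights.

```python
from statistics import median


def huber_weight(residual, k=1.345):
    """Weight 1 for small standardized residuals, k/|r| beyond the threshold k."""
    size = abs(residual)
    return 1.0 if size <= k else k / size


def one_step_huber_location(values, k=1.345):
    """One reweighting step starting from the median (illustrative only)."""
    start = median(values)
    residuals = [v - start for v in values]
    # Robust scale from the median absolute deviation (MAD).
    scale = median([abs(r) for r in residuals]) / 0.6745
    if scale == 0:
        return start
    weights = [huber_weight(r / scale, k) for r in residuals]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


prices = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical sale prices with one outlier
print(sum(prices) / len(prices))       # 22.0, the plain mean is pulled up by the outlier
print(one_step_huber_location(prices)) # stays close to 3
```

In a regression setting the same weights would be applied to the residuals of the fitted model rather than to deviations from the median.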
|