  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
151

Scalable Community Detection in Massive Networks using Aggregated Relational Data

Jones, Timothy January 2019 (has links)
The analysis of networks is used in many fields of study, including statistics, social science, computer science, physics, and biology. The interest in networks is diverse, as it usually depends on the field of study. For instance, social scientists are interested in interpreting how edges arise, while biologists seek to understand underlying biological processes. Among the problems explored in network analysis, community detection stands out as one of the most important. Community detection seeks to find groups of nodes with a large concentration of links within groups but few between them. Inferred groups are important in many applications, as they are used for further downstream analysis. For example, identifying clusters of consumers with similar purchasing behavior in a customer and product network can be used to build better recommendation systems. Finding a node with a high concentration of its edges to other nodes in its community may give insight into how the community formed. Many statistical models for networks implicitly define the notion of a community. Statistical inference aims to fit a model that posits how vertices are connected to each other. One of the most common models for community detection is the stochastic block model (SBM) [Holland et al., 1983]. Although simple, it is a highly expressive family of random graphs. However, it does have its drawbacks. First, it does not capture the degree distribution of real-world networks. Second, it allows nodes to belong to only one community. In many applications, it is useful to consider overlapping communities. The Mixed Membership Stochastic Blockmodel (MMSB) is a Bayesian extension of the SBM that allows nodes to belong to multiple communities. Fitting large Bayesian network models quickly becomes computationally infeasible when the number of nodes grows into the hundreds of thousands and millions. In particular, the number of parameters in the MMSB grows as the square of the number of nodes.
This thesis introduces an efficient method for fitting a Bayesian model to massive networks through the use of aggregated relational data. Our inference method converges faster than existing methods by leveraging nodal information that often accompanies real-world networks. Conditioning on this extra information leads to a model that admits a parallel variational inference algorithm. We apply our method to a citation network with over three million nodes and 25 million edges. Our method converges faster than existing posterior inference algorithms for the MMSB and recovers parameters better on simulated networks generated according to the MMSB.
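As a rough illustration of the block structure that models like the SBM posit, here is a small generator in plain Python. This is a sketch under our own assumptions, not code from the thesis; the function names and parameters are ours. Nodes in the same block connect with probability p_in, nodes in different blocks with probability p_out, so within-block edge density should clearly exceed between-block density.

```python
import random

def sample_sbm(block_sizes, p_in, p_out, seed=0):
    """Sample an undirected graph from a stochastic block model.

    block_sizes: number of nodes in each community.
    Returns (labels, edges): a block label per node and a set of
    undirected edges (i, j) with i < j.
    """
    rng = random.Random(seed)
    labels = []
    for block, size in enumerate(block_sizes):
        labels += [block] * size
    n = len(labels)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.add((i, j))
    return labels, edges

def edge_densities(labels, edges):
    """Return (within-block density, between-block density)."""
    w_hit = w_tot = b_hit = b_tot = 0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            present = (i, j) in edges
            if labels[i] == labels[j]:
                w_tot += 1
                w_hit += present
            else:
                b_tot += 1
                b_hit += present
    return w_hit / w_tot, b_hit / b_tot
```

With two blocks of 30 nodes, p_in = 0.5, and p_out = 0.05, the within-block density comes out far above the between-block density, which is exactly the signal community detection methods exploit.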
152

Biological database indexing and its applications.

January 2002 (has links)
Cheung Ching Fung. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. / Includes bibliographical references (leaves 71-73). / Abstracts in English and Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Biological Sequences --- p.2 / Chapter 1.2 --- User Queries on Biological Sequences --- p.4 / Chapter 1.3 --- Research Contributions --- p.6 / Chapter 1.4 --- Organization of Thesis --- p.6 / Chapter 2 --- Background --- p.7 / Chapter 2.1 --- What is a Suffix-Tree? --- p.7 / Chapter 2.2 --- Disk-Based Suffix-Trees --- p.9 / Chapter 3 --- Disk-Based Suffix Tree Constructions --- p.11 / Chapter 3.1 --- An Existing Algorithm: PrePar-Suffix --- p.11 / Chapter 3.1.1 --- "Three Issues: Edge Splitting, Random Access and Data Skew" --- p.13 / Chapter 3.2 --- DynaCluster-Suffix: A Novel Disk-Based Suffix-Tree Construction Algorithm --- p.18 / Chapter 4 --- Suffix Links Rebuilt --- p.29 / Chapter 4.1 --- Suffix-links and Least Common Ancestors --- p.29 / Chapter 5 --- q-Length Exact Sequence Matching --- p.35 / Chapter 5.1 --- q-Length Exact Sequence Matching by Suffix-Tree --- p.35 / Chapter 6 --- Implementation --- p.38 / Chapter 6.1 --- System Overview --- p.38 / Chapter 6.1.1 --- Index Builder --- p.39 / Chapter 6.1.2 --- Exact Query Processor --- p.39 / Chapter 6.1.3 --- Suffix Links Regenerator --- p.40 / Chapter 6.1.4 --- Tandem Repeats Finder --- p.40 / Chapter 6.2 --- Data Structures --- p.40 / Chapter 6.2.1 --- Representation of a Node --- p.40 / Chapter 6.2.2 --- An Alternative Node Representation --- p.42 / Chapter 6.2.3 --- Representation of a Leaf --- p.43 / Chapter 6.3 --- Buffering --- p.44 / Chapter 6.3.1 --- Page Format --- p.44 / Chapter 6.3.2 --- Address Translation --- p.45 / Chapter 6.3.3 --- Page Replacement Strategies --- p.45 / Chapter 7 --- A Performance Study --- p.48 / Chapter 7.1 --- When Everything Can be Held In Memory --- p.52 / Chapter 7.2 --- When Main Memory Is Limited --- p.54 / Chapter 7.3 --- The Effectiveness of DNA Lengths with Fixed Memory Sizes --- p.56 / Chapter 7.4 --- The Effectiveness of Memory Sizes --- p.57 / Chapter 7.5 --- Answering q-Length Exact Sequence Matching Queries --- p.60 / Chapter 7.6 --- Suffix Links Rebuilt --- p.61 / Chapter 8 --- Conclusions and Future Work --- p.69 / Chapter 8.1 --- Conclusions --- p.69 / Chapter 8.2 --- Future Work --- p.70 / Bibliography --- p.71
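The q-length exact sequence matching of Chapter 5 can be sketched with a simpler in-memory index than a disk-based suffix tree. The following naive suffix-array example is our own illustrative code, not the thesis's implementation: all suffix start positions are sorted, and every occurrence of a query then lies in one contiguous range found by binary search.

```python
def build_suffix_array(text):
    """Naive suffix array: suffix start positions sorted lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def _lower_bound(text, sa, pattern):
    # First index whose length-|pattern| prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo

def _upper_bound(text, sa, pattern):
    # First index whose length-|pattern| prefix is > pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(pattern)] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo

def find_occurrences(text, sa, pattern):
    """All start positions of pattern in text, ascending."""
    lo = _lower_bound(text, sa, pattern)
    hi = _upper_bound(text, sa, pattern)
    return sorted(sa[lo:hi])
```

Construction here is O(n^2 log n) and purely in memory; the whole point of the thesis's disk-based suffix trees is to avoid exactly these costs for DNA-scale inputs, so treat this only as a statement of the query semantics.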
153

An improved method for database design.

January 2004 (has links)
Chan, Chi Wai Alan. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. / Includes bibliographical references (leaves 121-126). / Abstracts in English and Chinese. / Abstract --- p.v / Acknowledgements --- p.viii / List of Figures --- p.ix / List of Tables --- p.xi / Chapter 1. --- Introduction --- p.12 / Chapter 1.1. --- Object-oriented databases --- p.12 / Chapter 1.2. --- Object-oriented Data Model --- p.14 / Chapter 1.3. --- Class and Object Instances --- p.15 / Chapter 1.4. --- Inheritance --- p.16 / Chapter 1.5. --- Constraint --- p.18 / Chapter 1.6. --- Physical Design for OODB Storage --- p.19 / Chapter 1.7. --- Problem Description --- p.20 / Chapter 1.8. --- Genetic Algorithm --- p.22 / Chapter 1.8.1. --- Constraint Handling Methods in GA --- p.25 / Chapter 1.9. --- Contributions of this work --- p.27 / Chapter 1.10. --- Outline of this work --- p.30 / Chapter 2. --- Literature Review --- p.32 / Chapter 2.1. --- Object-oriented database --- p.32 / Chapter 2.2. --- Object-Oriented Data model --- p.33 / Chapter 2.3. --- Physical Storage Model for OODBs --- p.35 / Chapter 2.3.1. --- Home Class (HC) Model --- p.36 / Chapter 2.3.2. --- Repeated Class (RC) Model --- p.38 / Chapter 2.3.3. --- Split Instance (SI) Model --- p.39 / Chapter 2.4. --- Solving physical storage design for OODBs --- p.40 / Chapter 2.5. --- Transaction-Based Approach --- p.41 / Chapter 2.6. --- Minimize database operational cost --- p.42 / Chapter 2.7. --- Combinational Optimization Method --- p.43 / Chapter 2.8. --- Research in Genetic Algorithm --- p.46 / Chapter 2.9. --- Implementation in GA --- p.47 / Chapter 2.10. --- Fitness function --- p.49 / Chapter 2.11. --- Crossover operation --- p.50 / Chapter 2.12. --- Encoding and Representation --- p.51 / Chapter 2.13. --- Parent Selection in Crossover Operation --- p.52 / Chapter 2.14. --- Reproductive selection --- p.53 / Chapter 2.14.1. --- Selection of Crossover Operator --- p.54 / Chapter 2.14.2. --- Replacement --- p.54 / Chapter 2.15. --- The Use of Constraint Handling Method --- p.55 / Chapter 2.15.1. --- Penalty function --- p.56 / Chapter 2.15.2. --- Decoder gives instruction to build feasible solution --- p.57 / Chapter 2.15.3. --- Adjustment method --- p.58 / Chapter 3. --- Solving Physical Storage Problem for OODB using GA --- p.60 / Chapter 3.1. --- Physical storage models for OODB --- p.61 / Chapter 3.2. --- Database operation for transactions --- p.62 / Chapter 3.3. --- Properly designed physical storage structure --- p.68 / Chapter 3.4. --- Fitness Evaluation --- p.69 / Chapter 3.5. --- Initial population --- p.72 / Chapter 3.6. --- Cross-breeding --- p.72 / Chapter 3.7. --- GA Operators --- p.74 / Chapter 3.8. --- Physical Design Problem Formulation for GA --- p.75 / Chapter 3.9. --- Representation and Encoding --- p.75 / Chapter 3.10. --- Solving Physical Storage Problem for OODB in GA --- p.76 / Chapter 3.10.1. --- Representation of design solution --- p.76 / Chapter 3.10.2. --- Encoding --- p.78 / Chapter 3.10.3. --- Initial population --- p.80 / Chapter 3.10.4. --- Parent Selection for breeding --- p.80 / Chapter 3.11. --- Traditional Constraint handling method --- p.83 / Chapter 3.11.1. --- Improve the Performance of Inheritance Constraint Handling methods --- p.85 / Chapter 3.12. --- Weakness in Gorla's GA approach --- p.87 / Chapter 4. --- Proposed Methodology --- p.88 / Chapter 4.1 --- Enhanced Crossover Operator --- p.90 / Chapter 4.2. --- Infeasible Solutions and Enhanced Adjustment Method --- p.93 / Chapter 4.3. --- Propagation Adjustment Method --- p.97 / Chapter 5. --- Computational Experiments --- p.99 / Chapter 5.1. --- Introduction --- p.99 / Chapter 5.2. --- Experiment Objective --- p.101 / Chapter 5.3. --- Tools and Setup --- p.102 / Chapter 5.4. --- Crossover Operator --- p.105 / Chapter 5.5. --- Mutation Operator --- p.105 / Chapter 5.6. --- Termination condition --- p.106 / Chapter 5.7. --- Computational Experiments --- p.107 / Chapter 5.7.1. --- An Illustrative Example - UNIVERSITY database --- p.107 / Chapter 5.7.2. --- Simulation - 9 classes and 25 classes --- p.115 / Chapter 5.7.3. --- Result --- p.116 / Chapter 6. --- Conclusions --- p.118 / Chapter 6.1. --- Summary of Achievements --- p.118 / Chapter 7. --- Bibliography --- p.121 / Chapter 8. --- Appendix --- p.127
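The GA-based physical design search outlined in Chapter 3 can be sketched in miniature. The code below is a hedged illustration only: it assigns one of the three storage models (0 = Home Class, 1 = Repeated Class, 2 = Split Instance) to each class and evolves assignments against a caller-supplied cost function. All names and parameters are ours; the thesis's actual fitness function, constraint handling, and adjustment methods are omitted.

```python
import random

def evolve(num_classes, cost, pop_size=30, generations=60, seed=1):
    """Tiny genetic algorithm over storage-model assignments.

    A chromosome is a list of length num_classes, each gene in
    {0, 1, 2} naming the storage model for that class. cost maps a
    chromosome to a number to minimise.
    """
    rng = random.Random(seed)
    pop = [[rng.randrange(3) for _ in range(num_classes)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)                  # elitist selection
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, num_classes)   # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:                # occasional mutation
                child[rng.randrange(num_classes)] = rng.randrange(3)
            children.append(child)
        pop = survivors + children
    return min(pop, key=cost)
```

With a toy cost function that simply counts genes differing from a target assignment, the search converges to (or very near) the optimum within a few dozen generations; a real deployment would plug in a transaction-based operational cost model instead.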
154

Universalism and particularism : explaining the emergence and growth of regional journal indexing systems

Chavarro Bohórquez, Diego Andrés January 2017 (has links)
Journal indexing systems (JIS) are bibliographic databases that are used to search for scientific literature and for bibliometric analyses. This thesis addresses the emergence and growth of regional JIS, focusing on the Scientific Electronic Library Online (Scielo) and the Red de Revistas Científicas de América Latina, el Caribe, España, y Portugal (RedALyC) in a challenging environment in which the Web of Science (WoS) and Scopus prevail. WoS and Scopus are referred to as mainstream JIS and Scielo and RedALyC as alternative JIS. The research questions are: (1) Why did alternative JIS emerge in light of the dominance of WoS? (2) Why do researchers publish in journals indexed by alternative JIS? The research draws on the concepts of cognitive authority from information science, and universalism and particularism from the sociology of science. A cognitive authority is an information source that is credible. JIS are becoming cognitive authorities in the science communication system. Their credibility relies on their application of objective criteria to select journals (universalism). However, journal selection can be influenced by subjective criteria (particularism). The tensions between universalism and particularism suggest two scenarios for the emergence and growth of alternative JIS. A universalistic view suggests that they emerge to cover journals with low scientific impact and editorial standards. A particularistic view posits that they emerge to cover disciplinary, linguistic, and regional gaps created by biases in mainstream JIS, particularly in the coverage of WoS. The research questions were addressed through mixed methods to produce quantitative and qualitative evidence. The evidence was obtained from (1) documentary and literature reviews; (2) descriptive and correlational statistics; and (3) a case study that involved interviews with researchers in private and public universities in Colombia in agricultural sciences, business and management, and chemistry.
The findings indicate that disciplinary, linguistic, and geographical biases in the coverage of mainstream JIS motivated the development of Scielo and RedALyC. The reasons for their growth have been conceptualised in this thesis as: (1) training; (2) knowledge-gap filling; and (3) knowledge bridging. This thesis addresses a significant gap in the sociology of science by studying new authorities in the science communication system. It contributes to debates on universalism and particularism, showing that both are involved in the selection of journals by JIS. It also contributes to understanding how particularism in mainstream JIS can pose barriers to the communication of scientific knowledge that has the potential to address pressing social demands. The findings could contribute to the design of research policy and research evaluation in contexts not widely covered by mainstream JIS.
155

Comparison of Functional Dependency Extraction Methods and an Application of Depth First Search

Sood, Kanika 29 September 2014 (has links)
Extracting functional dependencies from existing databases is a useful technique in relational theory, database design, and data mining. Functional dependencies are a key property of relational schema design. A functional dependency is a database constraint between two sets of attributes. In this study we present a comparative study of TANE, FUN, FD_Mine, FastFDs, and Dep_Miner, and we propose a new technique, KlipFind, to extract dependencies from relations efficiently. KlipFind employs a depth-first, heuristic-driven approach. Our study indicates that KlipFind is more space-efficient than any of the existing solutions and highly efficient at finding keys for relations.
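For readers unfamiliar with the constraint being mined: a functional dependency X -> Y holds in a relation when any two tuples that agree on the attributes X also agree on Y. A minimal validity checker (our own sketch, unrelated to KlipFind's internals) makes the definition concrete:

```python
def holds(rows, lhs, rhs):
    """Check whether the functional dependency lhs -> rhs holds.

    rows: the relation as a list of dicts (one per tuple).
    lhs, rhs: tuples of attribute names.
    Holds iff tuples with equal lhs-values always have equal rhs-values.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault records the first rhs seen for this lhs-value;
        # any later disagreement is a counterexample.
        if seen.setdefault(key, val) != val:
            return False
    return True
```

Discovery algorithms such as TANE or FastFDs essentially search the lattice of candidate (lhs, rhs) pairs while pruning with results like this check; the checker above is the innermost primitive, not the search strategy.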
156

A Sem-ODB application for the Western Cultures Database

Ghersgorin, Raquel 21 July 1998 (has links)
This thesis presents the evolution of the Western Cultures Database. The project starts with a database design using semantic modeling and continues with the implementation following two techniques: a relational and a semantic approach. The project pursues both in parallel until the relational approach is set aside because of the advantages of the semantic (Sem-ODB) approach. The semantic implementation produces the Western Cultures Semantic Database Application with a web interface, the main contribution of this thesis. The database is created and populated using Sem-ODB, and the web interface is built using WebRG (a report generator), HTML, JavaScript, and JavaChart (applets for graphical representation). The resulting semantic application permits the storage and retrieval of data, the display of reports, and the graphical representation of the data through a web interface, all in support of research assertions about the impact of historical figures on Western cultures.
157

Construction and investigation of database support for the County Administrative Board's liming programme (Länsstyrelsens kalkningsverksamhet)

Moen, Daniel January 2008 (has links)
For database development there are many rules and standards intended to make a database as efficient and error-free as possible. For various reasons, these are sometimes not fully followed. This can lead to problems that, in the worst case, have consequences for the organisation that uses the database. One example of a database that deviates from current design rules is the one that was intended to be introduced for the liming programme of the County Administrative Board (Länsstyrelsen) in Gävleborg. In this work I have carried out an investigation of this database, identified these deviations, and examined why they arose and what they have led to, or may lead to, in this context. I have also built a new database, following current rules, adapted to the liming programme in Gävleborg. My conclusion is that these deviations from standards, and the resulting errors, are probably due to a combination of inadequate interfaces in the form of forms and a lack of staff knowledge about database design and administration.
158

Discovering Moving Clusters from Spatial-Temporal Databases

Lee, Chien-Ming 28 July 2007 (has links)
Owing to advances in computer and communication technologies, clustering analysis of moving objects has attracted increasing attention in recent years. An interesting problem is to find moving clusters, composed of objects that move along together for a sufficiently long period of time. However, a moving cluster tends to break apart after some time because of changes in the goals of the individual objects. In order to identify the set of moving clusters, we propose a formal definition of moving clusters with semantically clear parameters. Based on this definition, we propose dedicated approaches to clustering moving objects. The proposed approaches are evaluated using data generated both with and without an underlying model. We validate our approaches with a thorough experimental evaluation and comparison.
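One common formalisation of this idea, a sketch under our own assumptions rather than necessarily the thesis's exact definition, chains clusters across consecutive timestamps whenever their overlap ratio meets a threshold theta:

```python
def moving_clusters(snapshots, theta):
    """Greedily chain clusters across consecutive timestamps.

    snapshots: one list of frozensets (clusters of object ids) per
    timestamp. A chain is extended by a current cluster c when the
    overlap ratio |last & c| / |last | c| is at least theta.
    Returns all chains spanning at least two timestamps.
    """
    chains = [[c] for c in (snapshots[0] if snapshots else [])]
    done = []
    for snap in snapshots[1:]:
        nxt = []
        for chain in chains:
            last, grew = chain[-1], False
            for c in snap:
                if len(last & c) / len(last | c) >= theta:
                    nxt.append(chain + [c])
                    grew = True
            if not grew and len(chain) >= 2:
                done.append(chain)              # chain ends here
        heads = {ch[-1] for ch in nxt}
        nxt += [[c] for c in snap if c not in heads]  # fresh chains
        chains = nxt
    done += [ch for ch in chains if len(ch) >= 2]
    return done
```

The theta parameter is exactly the kind of "semantically clear parameter" the abstract mentions: it controls how much membership churn a cluster can tolerate per timestamp before it is considered to have broken apart.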
159

GeoExpert - An Expert System Based Framework for Data Quality in Spatial Databases

Kumar, Aditya 01 August 2006 (has links)
Using very large sets of historical spatial data in the knowledge discovery process has become a common trend, and to obtain better results from this process the data should be of high quality. In this thesis we propose GeoExpert, a data quality assessment and cleansing framework for spatial data that integrates the spatial data visualization and analysis capabilities of ArcGIS with the reasoning and inference capabilities of an expert system. We implemented the proposed framework in both stand-alone and web versions using ArcGIS Engine and ArcGIS Server, respectively. We used the Jess expert system shell for the expert system part of GeoExpert. Using an expert system shell separates the application logic from the framework itself, which makes the framework easily updatable and domain-independent. We applied GeoExpert to spatially referenced water quality data.
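The separation the abstract describes, application logic expressed as declarative rules kept apart from the framework that applies them, can be sketched in Python (Jess rules themselves are written in a Lisp-like syntax; the rule names and thresholds below are purely illustrative, not from GeoExpert):

```python
def check(record, rules):
    """Return the names of the rules that record violates.

    rules: list of (name, predicate) pairs; a record passes a rule
    when the predicate returns True. Swapping in a different rule
    list retargets the checker to a new domain without code changes.
    """
    return [name for name, pred in rules if not pred(record)]

# Hypothetical water-quality rules; the thresholds are illustrative only.
water_rules = [
    ("ph_in_range", lambda r: 0.0 <= r["ph"] <= 14.0),
    ("non_negative_turbidity", lambda r: r["turbidity"] >= 0),
]
```

A real expert system adds forward chaining, rule priorities, and explanation facilities on top of this, but the core design choice, domain knowledge as data rather than code, is the same.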
160

Novel applications of data mining methodologies to incident databases

Anand, Sumit 16 August 2006 (has links)
Incident databases provide an excellent opportunity to study recurring incident situations in the process industry. The databases give insight into the situations that led to incidents and, if studied properly, can help monitor the processes, equipment, and chemicals involved more closely and reduce the number of incidents in the future. This study examined a subset of incidents from the National Response Center's incident database, focusing mainly on fixed-facility incidents in Harris County, Texas from 1990 to 2002. Data mining has been used in the financial and marketing arenas for decades to analyze and find patterns in large amounts of data. Recognizing the limited capabilities of traditional statistical methods, we applied more robust data mining techniques to the subset of data and found interesting patterns in the chemicals involved, equipment failed, components involved, etc. Further, the patterns obtained by data mining were used to modify probabilities of equipment failure and to develop a decision support system.
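The kind of pattern search described, finding combinations of chemical, equipment, and component that recur across incident records, is classically done with frequent itemset mining. A compact Apriori-style sketch (our own illustration, not the study's code, and the item names in the usage below are invented):

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Apriori-style enumeration of frequent itemsets.

    transactions: list of sets of items (e.g. one set per incident
    record). min_support: minimum number of transactions an itemset
    must appear in. Returns {frozenset: support count}.
    """
    items = {i for t in transactions for i in t}
    freq = {}
    current = [frozenset([i]) for i in items]
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        freq.update(level)
        k += 1
        # Candidates for the next level: unions of frequent itemsets
        # that have exactly k items (the Apriori pruning step).
        current = list({a | b for a, b in combinations(level, 2)
                        if len(a | b) == k})
    return freq
```

On a handful of toy "incident" transactions the routine surfaces exactly the item combinations whose support clears the threshold, which is the raw material for the failure-probability adjustments and the decision support system the abstract mentions.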
