551. Dionysius: a Peer-to-peer Database Management System. Guadagnini, Luca. January 2009.
With the introduction of the peer-to-peer paradigm in the world of software, a lot of applications have been created to exploit this architecture. Most of them provide a data-sharing service among users connected to a network, and programs such as Napster, Gnutella, eMule and BitTorrent have become the so-called killer applications. However, some effort has also been spent on developing other solutions that use the peer-to-peer paradigm. In the case of databases, several projects have been started with the general purpose of sharing data sets with other databases. Generally they build on the idea of providing the data contained in their database schemas to other peers in the network, introducing concepts such as schema matching, mapping tables and others, which are necessary to establish connections and send data. The thesis analyses some of these projects to see which of them is the most clearly defined and best supported by concepts and definitions. The Hyperion Project of the University of Toronto, in collaboration with the University of Trento, is the most promising, and it aims to be one of the first peer-to-peer database management systems. However, the common idea of considering the peer-to-peer paradigm equal to data sharing - in the way presented by applications such as Napster - leads to many difficulties: it is hard to handle the data sets, some operations must be done manually, and in some cases the peer-to-peer paradigm is not applied at all. For this reason the goal is to define and demonstrate the concept of a peer-to-peer database built from scratch, together with a DBMS suited to it. A real definition of a peer-to-peer database has never been given, and here for the first time we try to give one according to our vision. The definition rests on a few precise concepts: the global schema, which is the original design of the database; sub-schemas, which are well-defined logical subsets of the entities of the original schema; and binding tables, which are necessary to allow the creation of constraints and relations among the entities. To show the validity of these concepts and how a management system for peer-to-peer databases can be developed and used, a prototype (named Dionysius) has been realized by modifying HSQLDB - an ordinary DBMS developed in Java - and adding a peer-to-peer platform based on the JXTA library set.
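The abstract names the key concepts but shows no code; the following JDBC sketch is only a rough illustration of what a binding table relating a local sub-schema entity to an entity held by a remote peer might look like in a plain HSQLDB instance. The table and column names and the JXTA peer identifier are hypothetical, not taken from the Dionysius prototype.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Illustrative only: a local "customer" entity plus a hypothetical binding table
// that links local rows to entities held by a remote peer. Requires the HSQLDB
// driver on the classpath.
public class BindingTableSketch {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:hsqldb:mem:dionysius", "SA", "");
             Statement st = con.createStatement()) {

            // Part of a local sub-schema (a subset of the global schema).
            st.execute("CREATE TABLE customer (id INT PRIMARY KEY, name VARCHAR(100))");

            // Hypothetical binding table: records which remote peer holds the
            // related entity and the key it is known by on that peer.
            st.execute("CREATE TABLE customer_order_binding ("
                     + " customer_id INT NOT NULL,"
                     + " peer_id     VARCHAR(64) NOT NULL,"  // JXTA peer identifier
                     + " remote_key  VARCHAR(64) NOT NULL,"  // key of the entity on the remote peer
                     + " FOREIGN KEY (customer_id) REFERENCES customer(id))");

            st.execute("INSERT INTO customer VALUES (1, 'Alice')");
            st.execute("INSERT INTO customer_order_binding VALUES (1, 'urn:jxta:peer-123', 'order-42')");
        }
    }
}
```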
552. Analyzing and adapting graph algorithms for large persistent graphs. Larsson, Patrik. January 2008.
In this work, the graph database Neo4j, developed by Neo Technology, is presented together with some of its functionality for accessing data as a graph. This type of data access makes it possible to implement common graph algorithms on top of Neo4j. Examples of such algorithms are presented together with their theoretical backgrounds. These are mainly algorithms for finding shortest paths and algorithms for different graph measures, such as centrality measures. The implementations that have been made are presented, as well as complexity analysis and the performance measurements performed on them. The conclusions include that Neo4j is well suited for these types of implementations.
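As a small illustration of the kind of algorithm the thesis implements, here is an unweighted shortest-path search written against a plain adjacency map; it deliberately avoids Neo4j's traversal API, whose 2008-era signatures are not shown in the abstract, so the node names and graph are made up for the example.

```java
import java.util.*;

// Breadth-first search for an unweighted shortest path over an in-memory adjacency map.
public class ShortestPathSketch {

    static List<String> shortestPath(Map<String, List<String>> graph, String start, String goal) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(start, null);
        queue.add(start);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.equals(goal)) {
                // Reconstruct the path by walking parent pointers back to the start.
                LinkedList<String> path = new LinkedList<>();
                for (String n = goal; n != null; n = parent.get(n)) path.addFirst(n);
                return path;
            }
            for (String next : graph.getOrDefault(node, List.of())) {
                if (!parent.containsKey(next)) {   // not visited yet
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        return List.of(); // no path found
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
            "a", List.of("b", "c"),
            "b", List.of("d"),
            "c", List.of("d"),
            "d", List.of());
        System.out.println(shortestPath(graph, "a", "d")); // e.g. [a, b, d]
    }
}
```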
553. Evaluation of Data Integrity Methods in Storage: Oracle Database. Posse, Oliver; Tomanović, Ognjen. January 2015.
Context. It is very common today that e-commerce systems store sensitive client information. The database administrators of these types of systems have access to this sensitive client information and are able to manipulate it. Therefore, data integrity is of core importance in these systems, and methods to detect fraudulent behavior need to be implemented. Objectives. The objective of this thesis is to implement and evaluate the features and performance impact of different methods for achieving data integrity in a database, Oracle to be more exact. Methods. Five methods for achieving data integrity were tested. The methods were tested in a controlled environment. Three of them were tested and performance-evaluated by a tool emulating a real-life e-commerce scenario. The focus of this thesis is to evaluate the performance impact and the fraud detection ability of the implemented methods. Results. This paper evaluates traditional Digital signature, Linked timestamping applied to a Merkle hash tree, and Auditing, in terms of both performance impact and features. Two more methods were implemented and tested in a controlled environment: Merkle hash tree and Digital watermarking. We present results from the empirical analysis, data verification and transaction performance. In our evaluation we confirmed our hypothesis that traditional Digital signature is faster than Linked timestamping. Conclusions. In this thesis we conclude that when choosing a data integrity method to implement, it is of great importance to know which type of operation is more frequently used. Our experiments show that the Digital signature method performed better than Linked timestamping and Auditing. Our experiments also showed that applying Digital signature, Linked timestamping and Auditing decreased performance by 4%, 12% and 27% respectively, which is a relatively small price to pay for data integrity.
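Two of the evaluated methods rely on a Merkle hash tree. The sketch below computes a Merkle root over a handful of rows so that any modification of a stored row changes the root; the hashing scheme (SHA-256 over concatenated child hashes) and the sample rows are assumptions for illustration, not the exact construction used in the thesis.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;

// Minimal Merkle-root computation: tampering with any row changes the root,
// which is what makes the tree usable for integrity verification.
public class MerkleSketch {

    static byte[] sha256(byte[] data) throws Exception {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }

    static byte[] merkleRoot(List<byte[]> leaves) throws Exception {
        List<byte[]> level = new ArrayList<>();
        for (byte[] leaf : leaves) level.add(sha256(leaf));   // hash each record
        while (level.size() > 1) {
            List<byte[]> next = new ArrayList<>();
            for (int i = 0; i < level.size(); i += 2) {
                byte[] left = level.get(i);
                byte[] right = (i + 1 < level.size()) ? level.get(i + 1) : left; // duplicate last node on odd levels
                byte[] pair = new byte[left.length + right.length];
                System.arraycopy(left, 0, pair, 0, left.length);
                System.arraycopy(right, 0, pair, left.length, right.length);
                next.add(sha256(pair));                        // hash of concatenated children
            }
            level = next;
        }
        return level.get(0);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical e-commerce rows, serialized as strings for the example.
        List<byte[]> rows = List.of(
            "order#1;customer=17;total=99.50".getBytes(StandardCharsets.UTF_8),
            "order#2;customer=23;total=12.00".getBytes(StandardCharsets.UTF_8),
            "order#3;customer=17;total=40.25".getBytes(StandardCharsets.UTF_8));
        System.out.println(new java.math.BigInteger(1, merkleRoot(rows)).toString(16));
    }
}
```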
554. KlarSynt Tools: A tool for automating configurations of test environments. Vestin, Simon; Svensson, Daniel. January 2014.
Preparing dedicated environments for testing often requires time-consuming, manual configuration of databases and the Windows Registry. A proposed Windows application could improve the efficiency and accuracy of such settings by automating the processes and providing a user-friendly graphical user interface. On behalf of Ninetech, a consulting company, such an application was therefore developed: KlarSynt Tools. This application was to improve on the company's previous methods of configuring test environments by removing the need for manual tasks and the use of an unoptimised tool called Verktyg. During development, features such as connecting to servers, retrieving data from databases, and automatic configuration of the Windows Registry were implemented. Problems such as automating manual tasks had to be dealt with to ensure the accuracy of the configurations. Development patterns such as MVVM were also used in the project to provide flexibility in the program code and thereby prepare the software for future development. Finally, user-friendliness was built into the application interface to make the application efficient to use. The project resulted in a Windows application that accurately and efficiently configures settings in a database and the Windows Registry. The developed application was shown to significantly reduce the number of steps required and the time taken to perform the configurations compared with the old process.
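The abstract does not describe how the Windows Registry configuration is automated. One generic way to script such a step is to invoke Windows' own reg.exe, as in the sketch below; the registry key, value name and data are placeholders, and the actual KlarSynt Tools application (a Windows program) presumably uses the platform's registry API rather than this approach.

```java
import java.io.IOException;
import java.util.List;

// Illustrative only: automates one Windows Registry setting by invoking reg.exe.
// Works only on Windows; key path and value are hypothetical placeholders.
public class RegistrySketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        List<String> cmd = List.of(
            "reg", "add", "HKCU\\Software\\ExampleTestEnv",   // hypothetical key
            "/v", "DatabaseServer",                            // hypothetical value name
            "/t", "REG_SZ",
            "/d", "testdb01.example.local",                    // hypothetical value data
            "/f");                                             // overwrite without prompting
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        int exit = p.waitFor();
        System.out.println(exit == 0 ? "Registry value set." : "reg.exe failed with exit code " + exit);
    }
}
```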
555. Dynamika výskytu orchidejí ve vybraném modelovém území v jižních Čechách / Dynamics of orchid occurrence in South Bohemia. Kosánová, Kristina. January 2017.
Orchids are an endangered group of plants, protected both in the Czech Republic and worldwide. Questions of their protection are therefore actively discussed, but not all of the factors affecting their presence are known so far. The purpose of this work was to find out which environmental factors influence the occurrence of certain orchid species at their localities in the selected area. This is important for better protection of orchids, because only by knowing these factors can we find new sites or improve management plans for the existing ones. Another purpose of this work was to find out the main reason for the extinction of orchids at their historical localities, and whether there is a possibility of finding other, yet unknown localities of these species. This thesis is based on data from databases, which were also updated during the data collection. The data were processed with the MaxEnt software, which produces species distribution models and allows the potential occurrence of orchids to be predicted even at yet unknown localities. The software also analyses the environmental factors affecting species presence. I found that the main reason for the extinction of orchids at their historical localities was overgrowing of the sites. The main environmental factors affecting orchid occurrence were analysed for...
556. Rattlesnake Envenomation Demographic and Situational Statistics: a Retrospective Database Analysis 2002-2014. Reilly, Jessica; Robertson, Morgan; Molina, Deanna; Boesen, Keith. January 2016.
Class of 2016 Abstract and Report. Objectives: The purpose of this study was to assess trends in the anatomical bite location, circumstances, and legitimacy of rattlesnake envenomations managed by the Arizona Poison and Drug Information Center (APDIC) between 2002 and 2014.
Methods: The Institutional Review Board approved this retrospective database analysis in which deidentified patient case information was extracted from the APDIC electronic medical record database. Descriptive and demographic variables collected included: age, gender, anatomical bite location, circumstance, and alcohol involvement. Variables were analyzed by student researchers to determine the legitimacy. Researchers compared demographic variables by year and month to assess for trends.
Results: A total of 1,738 rattlesnake envenomations were analyzed over the 13-year study period. The number of cases per year varied, but not significantly (p=0.069). A statistically significant (p<0.005) upward trend in average age occurred. No significant difference in cases involving females was found between study years (p=0.171). Alcohol involvement was not statistically significant (p=0.46). An upward trend (p<0.005) in legitimate rattlesnake envenomations was demonstrated.
Conclusions: Envenomations from 2002 to 2014 showed an upward trend in age but a similar distribution of gender. An increasing number of envenomations were determined to be legitimate, possibly related to the increasing number occurring to the foot/ankle, as well as the increasing number related to gardening and walking outside/taking out the trash. This trend may also be due to the lack of adequate data on alcohol involvement.
557. Packaged software: security and controls audit review. Van Heerden, Chris. 15 September 2015.
M.Com. In recent years, large organisations that developed mainframe application software in-house have been purchasing software packages to replace these applications. These advanced packages incorporate a high level of integration and include security and control features to ensure that the integrity of input, processing, output and storage is maintained. Computer auditors are required to evaluate these advanced software packages to ensure that the security and control features are adequate and comply with organisational standards. Furthermore, they must ensure that the integrity of information systems programs and data is maintained ...
558. Cataloging Tailings Dams in Arizona. Chernoloz, Oleksiy. January 2017.
Tailings storage facilities (TSFs) and conventional water-retaining dams are the largest man-made structures on Earth. Statistics show that TSFs are more likely to fail than water-retaining dams. Recent catastrophic failures of TSFs have led to the loss of lives (Germano mine, Brazil), environmental damage (Mount Polley, Canada), contamination of drinking water (Baia Mare, Romania), and the destruction of property (Kingston Fossil Plant, USA). As the scale of mining increases, TSFs increase in height and volume, therefore increasing the consequences of failure. To help mitigate the risk associated with large TSFs, mining companies empanel expert groups to review the operation of TSFs and conduct regular visual inspections. In the US, the Mine Safety and Health Administration has regulatory responsibility for the safety of TSFs. As population centers expand nearer to existing and proposed TSFs, the public requires assurance of the integrity of these structures. A pro-active approach to public safety is more desirable than a post-mortem analysis after a major failure.
We have examined the regulatory practices, the industry practices, and public data on TSFs in Arizona. In this thesis we address inadequacies of the official government records on TSFs in the two largest publicly accessible databases of dams in the US: the National Inventory of Dams (NID) and the National Performance of Dams Program (NPDP). Both databases contain numerous errors and omissions, including descriptions and geographic coordinates of TSFs that are inaccurate by many kilometers. Several large TSFs in Arizona are not included in either database. We address these shortcomings with a pilot project for Arizona that demonstrates that recording accurate information in a database is neither expensive nor onerous, that communicating best practices for operation can help alleviate community concerns, and that continuous monitoring technology can resolve shortcomings of visual inspections.
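To illustrate the claim that recording accurate dam information is neither expensive nor onerous, here is a minimal, hypothetical catalogue record with a basic coordinate sanity check; the field names and the Arizona bounding box are assumptions, not the schema proposed in the thesis.

```java
// Illustrative only: a minimal catalogue entry for a tailings storage facility with
// a basic check against the kind of coordinate errors reported for the NID and NPDP.
// Field names and bounds are assumptions; the bounding box roughly covers Arizona.
public record TailingsDamRecord(String name, String operator, double latitude,
                                double longitude, double crestHeightMeters) {

    public TailingsDamRecord {
        if (latitude < 31.0 || latitude > 37.5 || longitude < -115.0 || longitude > -109.0) {
            throw new IllegalArgumentException("Coordinates are outside Arizona: "
                    + latitude + ", " + longitude);
        }
        if (crestHeightMeters <= 0) {
            throw new IllegalArgumentException("Crest height must be positive");
        }
    }

    public static void main(String[] args) {
        // Hypothetical entry; values are placeholders, not measured data.
        TailingsDamRecord entry = new TailingsDamRecord("Example TSF", "Example Mining Co.",
                33.4, -111.0, 60.0);
        System.out.println(entry);
    }
}
```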
559. Using data analysis and information visualization techniques to support the effective analysis of large financial data sets. Nyumbeka, Dumisani Joshua. January 2016.
There have been a number of technological advances in the last ten years, which have resulted in the amount of data generated in organisations increasing by more than 200% during this period. This rapid increase in data means that if financial institutions are to derive significant value from this data, they need to identify new ways to analyse it effectively. Due to the considerable size of the data, financial institutions also need to consider how to visualise the data effectively. Traditional tools such as relational database management systems have problems processing large amounts of data due to memory constraints, latency issues and the presence of both structured and unstructured data. The aim of this research was to use data analysis and information visualisation (IV) techniques to support the effective analysis of large financial data sets. In order to visually analyse the data effectively, the underlying data model must produce results that are reliable. A large financial data set was identified and used to demonstrate that IV techniques can support the effective analysis of large financial data sets. A review of the literature on large financial data sets, visual analytics, and existing data management and data visualisation tools identified the shortcomings of existing tools. This led to the determination of the requirements for the data management tool and the IV tool. The data management tool identified was a data warehouse and the IV toolkit identified was Tableau. The IV techniques identified included the Overview, Dashboards and Colour Blending. The IV tool was implemented and published online and can be accessed through a web browser interface. The data warehouse and the IV tool were evaluated to determine their accuracy and effectiveness in supporting the effective analysis of the large financial data set. The experiment used to evaluate the data warehouse yielded positive results, showing that only about 4% of the records had incorrect data. The results of the user study were positive and no major usability issues were identified. The participants found the IV techniques effective for analysing the large financial data set.
560. The impact of domain knowledge-driven variable derivation on classifier performance for corporate data mining. Welcker, Laura Joana Maria. January 2015.
Technological progress, in terms of increasing computational power and growing virtual space to collect data, offers great potential for businesses to benefit from data mining applications. Data mining can create a competitive advantage for corporations by discovering business-relevant information, such as patterns, relationships, and rules. The role of the human user within the data mining process is crucial, which is why the research area of domain knowledge is becoming increasingly important. This thesis investigates the impact of domain knowledge-driven variable derivation on classifier performance for corporate data mining. Domain knowledge is defined as methodological, data and business know-how. The thesis investigates the topic from a new perspective by shifting the focus from a one-sided approach, namely a purely analytic or purely theoretical approach, towards a target group-oriented (researcher and practitioner) approach which puts the methodological aspect, by means of a scientific guideline, at the centre of the research. In order to ensure the feasibility and practical relevance of the guideline, it is adapted and applied to the requirements of a practical business case. Thus, the thesis examines the topic from both a theoretical and a practical perspective, and thereby overcomes the limitations of a one-sided approach, which mostly lacks practical relevance or generalisability of the results. The primary objective of this thesis is to provide a scientific guideline which should enable both practitioners and researchers to advance domain knowledge-driven research on variable derivation in a corporate setting. In the theoretical part, a broad overview is given of the main aspects necessary to undertake the research, such as the concept of domain knowledge, the data mining task of classification, variable derivation as a subtask of data preparation, and evaluation techniques. This part of the thesis addresses the methodological aspect of domain knowledge. In the practical part, a research design is developed for testing six hypotheses related to domain knowledge-driven variable derivation. The major contribution of the empirical study is testing the impact of domain knowledge on a real business data set compared to the impact of a standard and a randomly derived data set. The business application of the research is a binary classification problem in the domain of an insurance business, which deals with the prediction of damages in legal expenses insurance. Domain knowledge is expressed through deriving the corporate variables by means of a business- and data-driven constructive induction strategy. Six variable derivation steps are investigated: normalisation, instance relation, discretisation, categorical encoding, ratio, and multivariate mathematical function. The impact of the domain knowledge is examined by pairwise (with and without derived variables) performance comparisons for five classification techniques (decision trees, naive Bayes, logistic regression, artificial neural networks, k-nearest neighbours). The impact is measured by two classifier performance criteria: sensitivity and the area under the ROC curve (AUC). The McNemar significance test is used to verify the results. Based on the results, two hypotheses are clearly verified and accepted, three hypotheses are partly verified, and one hypothesis had to be rejected on the basis of the case study results.
The thesis reveals a significant positive impact of domain knowledge-driven variable derivation on classifier performance for options of all six tested steps. Furthermore, the findings indicate that the classification technique influences the impact of the variable derivation steps, and that bundling the steps has a significantly higher performance impact if the variables are derived using domain knowledge (compared with a non-knowledge application). Finally, the research shows that an empirical examination of the domain knowledge impact is very complex due to a high level of interaction between the selected research parameters (variable derivation step, classification technique, and performance criterion).
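The abstract names the McNemar test as the significance test for the pairwise classifier comparisons. The sketch below shows the continuity-corrected form of the statistic, which is a common variant, not necessarily the exact one used in the thesis; the disagreement counts in the example are made up.

```java
// Continuity-corrected McNemar test for comparing two classifiers on the same test set.
// b = cases the first classifier gets right and the second gets wrong; c = the reverse.
public class McNemarSketch {

    static double mcnemarStatistic(int b, int c) {
        if (b + c == 0) return 0.0;                       // no disagreements, nothing to test
        double diff = Math.abs(b - c) - 1.0;              // continuity correction
        return (diff * diff) / (b + c);                   // ~ chi-squared with 1 degree of freedom
    }

    public static void main(String[] args) {
        // Hypothetical disagreement counts between a model trained with derived
        // variables and one trained without them (not data from the thesis).
        int b = 62, c = 31;
        double chi2 = mcnemarStatistic(b, c);
        // 3.841 is the 0.05 critical value of the chi-squared distribution with 1 df.
        System.out.printf("chi2 = %.3f, significant at 5%% level: %b%n", chi2, chi2 > 3.841);
    }
}
```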