641 |
An Automated Conversion of Temporal Databases into XML with Fuzziness Option. Isikman, Omer Ozgun. 01 September 2010 (has links) (PDF)
The importance of incorporating time in databases is well recognized, and time-varying databases have been studied extensively. The main idea is to model changes to data from the moment it becomes available. Time information is mostly overlaid on traditional databases, and the additional time dimension makes it possible to query past data; this only becomes practical once the idea is adopted by commercial database management systems. Unfortunately, one disadvantage of temporal database management systems is that they have not been commercialized. XML (eXtensible Markup Language) is a de facto standard for data interchange, and it was therefore chosen as the target data model. The motivation for the work described in this thesis is two-fold: first, databases are transferred into XML while crisp values are changed into fuzzy linguistic variables describing fuzzy sets; second, bitemporal databases form one interesting type of temporal database. The purpose is thus to propose a fully automated system that converts any bitemporal database to its fuzzy XML schema definition. The implemented temporal database operators are independent of database content. Fuzzy elements may have different membership functions and a varying number of linguistic variables, and a scheme for determining membership function parameters is proposed.
Finally, fuzzy queries have also been implemented as part of the system.
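The abstract does not specify the membership functions used; a minimal sketch of converting a crisp value into fuzzy linguistic variables via triangular membership functions (the attribute name, the terms, and their parameters below are illustrative assumptions, not taken from the thesis) might look like:

```python
def triangular(x, a, b, c):
    """Triangular membership: rises from a to a peak at b, falls to 0 at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical linguistic variables for an "age" attribute.
terms = {
    "young":  (0, 18, 35),
    "middle": (30, 45, 60),
    "old":    (55, 75, 100),
}

def fuzzify(value):
    """Map a crisp value to the linguistic terms it belongs to, with degrees."""
    return {t: round(triangular(value, *p), 3)
            for t, p in terms.items() if triangular(value, *p) > 0}

print(fuzzify(40))  # {'middle': 0.667}
```

The resulting (term, degree) pairs could then be emitted as child elements or attributes of a fuzzy XML element for each converted column.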
|
642 |
Toward accurate and efficient outlier detection in high dimensional and large data sets. Nguyen, Minh Quoc. 22 April 2010 (has links)
An efficient method to compute local density-based outliers in high dimensional data is proposed. We show that this type of outlier is present in any subset of the dataset. This property is used to partition the data set into random subsets so that the outliers can be computed locally and then combined across subsets; local density-based outliers can therefore be computed efficiently. Another challenge in outlier detection in high dimensional data is that outliers are often suppressed when the majority of dimensions do not exhibit outlying behavior. The contribution of this work is a filtering method whereby outlier scores are computed in sub-dimensions: low sub-dimensional scores are filtered out, and the high scores are aggregated into the final score. This aggregation with filtering eliminates the effect of small deviations accumulating over many dimensions, so the outliers are identified correctly. In some cases, sets of outliers that form micro patterns are more interesting than individual outliers. These micro patterns are anomalous with respect to the dominant patterns in the dataset. In anomalous pattern detection there are two challenges. The first is that anomalous patterns are often masked by the dominant patterns when existing clustering techniques, such as clustering based on the k-nearest neighbor algorithm, are used. The contribution of this work is to introduce the adaptive nearest neighbor and the concept of the dual-neighbor to detect micro patterns more accurately. The second challenge is to compute the anomalous patterns quickly. Our contribution is to compute the patterns based on the correlation between attributes: the correlation implies that the data can be partitioned into groups based on each attribute in order to learn candidate patterns within the groups.
Thus, a feature-based method is developed that can compute these patterns efficiently.
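The filter-then-aggregate idea can be illustrated with a toy sketch (the threshold value, the score values, and aggregation by summation are assumptions for illustration, not the thesis's actual scoring scheme):

```python
def filtered_outlier_score(scores, threshold=2.0):
    """Aggregate per-sub-dimension outlier scores, keeping only high ones.

    Dropping low sub-dimensional scores prevents small deviations in many
    dimensions from accumulating into a spuriously high final score.
    """
    return sum(s for s in scores if s >= threshold)

# A point with one strong deviation outranks one with many tiny deviations.
strong = [0.1, 0.2, 5.0, 0.1]   # clear outlier in a single sub-dimension
diffuse = [1.1, 1.2, 1.0, 1.3]  # small deviations everywhere

print(filtered_outlier_score(strong))   # 5.0
print(filtered_outlier_score(diffuse))  # 0 -- every sub-score filtered out
```

Without the filter, a plain sum would rank the diffuse point (4.6) nearly as high as the strong one (5.4), which is exactly the effect the method eliminates.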
|
643 |
Formalisms on semi-structured and unstructured data schema computations. Lee, Yau-tat, Thomas. January 2009 (has links)
Thesis (Ph.D.), University of Hong Kong, 2010. Includes bibliographical references (p. 115-119). Also available in print.
|
644 |
Duomenų bazės kūrimas panaudojant duomenų pildymo formų šablonus / Technology on creation of a database, using patterns of data input. Urbonas, Tomas. 01 September 2011 (has links)
Databases are designed according to specific requirements, and in most cases this is done manually. Once a database is designed, software tools are developed to manage that particular database. Usually the result meets the user's needs, but only for a short time. Since requirements change constantly, the gap between the information that needs to be processed and the information that can actually be processed keeps growing, so it is difficult to keep the database aligned with the user's needs. Shifting needs significantly increase the cost of supporting an information system, which can become an unbearable burden for small businesses. It is therefore desirable that the system's users themselves can modify it to meet the changing data-processing requirements.
This thesis presents database creation using data entry templates designed by the user. As the user's templates change, the information system's data-processing user interface changes accordingly.
The purpose is to automate the creation of a database and its management system by developing an algorithm for this automation. Using a system based on this algorithm, a user with little knowledge of database development can create a database and a management system tailored to his needs, and can modify the database structure – adding or removing columns – at any time. The database structure is adjusted according to templates whose composition is stored in database tables... [to full text]
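The template-driven schema idea can be sketched in a few lines using SQLite for illustration (the table name, the field list, and the template format are hypothetical, not the thesis's actual design):

```python
import sqlite3

# Hypothetical template: the user's form fields define the table columns.
template = {"table": "clients", "fields": [("name", "TEXT"), ("phone", "TEXT")]}

conn = sqlite3.connect(":memory:")

def create_from_template(tpl):
    """Generate a table whose columns mirror the user's form template."""
    cols = ", ".join(f"{n} {t}" for n, t in tpl["fields"])
    conn.execute(f"CREATE TABLE {tpl['table']} (id INTEGER PRIMARY KEY, {cols})")

def add_column(tpl, name, sqltype):
    """The user edits the template; the database structure follows."""
    tpl["fields"].append((name, sqltype))
    conn.execute(f"ALTER TABLE {tpl['table']} ADD COLUMN {name} {sqltype}")

create_from_template(template)
add_column(template, "email", "TEXT")
cols = [row[1] for row in conn.execute("PRAGMA table_info(clients)")]
print(cols)  # ['id', 'name', 'phone', 'email']
```

The point of the sketch is only the direction of the dependency: the user-editable template is the source of truth, and schema changes are derived from it rather than hand-coded.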
|
645 |
Duomenų prieinamumo ir saugumo duomenų bazėse metodiniai nurodymai / Methodical instructions for data access and security in databases. Naujokas, Tomas. 05 November 2013 (has links)
The aim of this work was to create requirements-based methodological material for choosing and combining data security and visibility methods in databases: to present a requirements-based model of complex security and to fill the information gap on combined protection methods. The work examines and compares the complex security solutions of well-known security specialists, and system vulnerabilities are analysed and systematised. The goal was to reveal the most important vulnerabilities, explain how they work, and show how to defend against them properly. The practical part describes a complex security model, which is then broken down into detailed protection models. The model was applied to today's most popular combined systems: Microsoft Windows Server 2008 and the Microsoft SQL Server 2008 database management system. Using the methodology, one can analyse the security of an existing system and, based on activity models, correctly configure an existing or newly developed system.
|
646 |
Computational Verification of Published Human Mutations. Kamanu, Frederick Kinyua. January 2008 (has links)
The completion of the Human Genome Project, a remarkable feat by any measure, has provided over three billion bases of reference nucleotides for comparative studies. The next, and perhaps more challenging, step is to analyse sequence variation and relate this information to important phenotypes. Most human sequence variations are characterized by structural complexity and are hence associated with abnormal functional dynamics. This thesis covers the assembly of a computational platform for verifying these variations, based on accurate, published, experimental data.
|
647 |
The need for object-oriented systems to extend or replace the relational database model to solve performance problems. Gibson, Mark G. January 1992 (has links)
The relational model has dominated the database field because of its reduced application development time and non-procedural data manipulation features. It has significant problems, however, including weak integrity constraints. This paper discusses the need for object-oriented techniques to address these flaws. Three existing DBMSs are discussed: IRIS, ORION, and OZ. / Department of Computer Science
|
648 |
An MPEG-7 Video Database System for Content-Based Management and Retrieval. Celik, Cigdem. 01 October 2005 (has links) (PDF)
A video data model that allows efficient and effective representation and querying of the spatio-temporal properties of objects has been developed previously. The data model focuses on the semantic content of video streams: objects, events, and the activities performed by objects are its main interests. The model supports fuzzy spatial queries, including querying the spatial relationships between objects and querying the trajectories of objects. In this thesis, this work is used as the basis for the development of an XML-based video database system. The system is designed to be compliant with the MPEG-7 Multimedia Description Schemes in order to follow a universal standard, and it is implemented using a native XML database management system. Query entry facilities are enhanced by integrating a natural language processing (NLP) interface.
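As an illustration of the kind of fuzzy spatial relationship such models support, a directional relation between two objects can be given a membership degree based on the angle between them (the linear 90-degree falloff below is an assumed toy model, not the thesis's actual definition):

```python
import math

def degree_right_of(a, b):
    """Fuzzy degree to which object b is 'right of' object a.

    Toy model: membership is 1.0 when b lies due east of a and
    decreases linearly to 0.0 at +/- 90 degrees off that axis.
    """
    angle = math.degrees(math.atan2(b[1] - a[1], b[0] - a[0]))
    return max(0.0, 1.0 - abs(angle) / 90.0)

print(degree_right_of((0, 0), (10, 0)))   # 1.0 (due east)
print(degree_right_of((0, 0), (10, 10)))  # 0.5 (northeast)
print(degree_right_of((0, 0), (-5, 0)))   # 0.0 (west)
```

A fuzzy query such as "objects to the right of X with degree above 0.7" would then threshold these membership values rather than test a crisp predicate.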
|
649 |
Efficient computation of advanced skyline queries. Yuan, Yidong. Computer Science & Engineering, Faculty of Engineering, UNSW. January 2007 (has links)
Skyline has been proposed as an important operator for many applications, such as multi-criteria decision making, data mining and visualization, and user-preference queries. Due to its importance, skyline and its computation have received considerable attention from the database research community. All the existing techniques, however, focus on conventional databases and are not applicable to online computation environments such as data streams. In addition, existing studies consider only the efficiency of skyline computation, while the fundamental problem of the semantics of skylines remains open. In this thesis, we study three problems of skyline computation: (1) online skyline computation over data streams; (2) skyline cube computation and its analysis; and (3) the top-k most representative skyline. To tackle the problem of online skyline computation, we develop a novel framework which converts the more expensive multi-dimensional skyline computation into stabbing queries in 1-dimensional space. Based on this framework, a rigorous theoretical analysis of the time complexity of online skyline computation is provided. Efficient algorithms are then proposed to support ad hoc and continuous skyline queries over data streams. Inspired by the idea of the data cube, we propose the novel concept of the skyline cube, which consists of the skylines of all possible non-empty subsets of a given full space. We identify the unique sharing strategies for skyline cube computation and develop two efficient algorithms which compute the skyline cube in a bottom-up and a top-down manner, respectively. Finally, a theoretical framework to answer the question of skyline semantics and an analysis of multidimensional subspace skylines are presented. Motivated by the fact that the full skyline may be less informative because it generally consists of a large number of skyline points, we propose a novel skyline operator: the top-k most representative skyline. This operator selects the k skyline points such that the number of data points dominated by at least one of these k points is maximized. To compute the top-k most representative skyline, two efficient algorithms and their theoretical analysis are presented.
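The two operators described above can be sketched briefly (assuming smaller attribute values are preferred; the greedy per-point ranking in `top_k_representative` is a simplification of the joint coverage-maximization problem the thesis actually solves):

```python
def dominates(p, q):
    """p dominates q: no worse in every dimension, strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Keep every point not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def top_k_representative(points, k):
    """Rank skyline points by how many data points each dominates, take k."""
    sky = skyline(points)
    counts = {p: sum(dominates(p, q) for q in points) for p in sky}
    return sorted(sky, key=counts.get, reverse=True)[:k]

# Hotels as (price, distance-to-beach): the skyline is the trade-off frontier.
hotels = [(50, 8), (60, 5), (80, 2), (70, 6), (90, 1)]
print(skyline(hotels))  # [(50, 8), (60, 5), (80, 2), (90, 1)]
```

Here (70, 6) is excluded because (60, 5) is both cheaper and closer; the remaining points are mutually incomparable, which is precisely what makes full skylines large and motivates the representative variant.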
|