  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Ad Hoc Integration and Querying of Heterogeneous Online Distributed Databases

Chen, Liangyou 07 August 2004 (has links)
This dissertation provides an ad hoc integration methodology to manage and integrate heterogeneous online distributed databases on demand. The problem arises from an impending demand from scientific users to conveniently manage existing Web data, combined with the complexity involved in constructing a functional data federation system using existing data integration technologies. We close this gap with a database management framework accompanied by novel Web data specification languages, wrapper generation technologies, and distributed query processing techniques. A major achievement of this dissertation is the establishment of a sound relational data model for Web data. Under this model, the Web becomes a synthetic extension of traditional database systems. Consequently, a novice user of our system can cheaply integrate a large number of distributed Web sources with in-house databases for daily scientific data analysis purposes. The relational Web modeling leads to a practical ad hoc integration system - the Meteoroid system (a MEthodology for ad hoc inTEgration of Online distributed heteROgeneous Internet Data) - in the context of biological data interoperability. We identify a main difficulty for ad hoc integration: the lack of a fully automated wrapper generation and maintenance technique for general semi-structured data such as HTML, XML, and plain-text documents. We address this issue through a thorough study of the characteristics of online Web data and devise various automated wrapper techniques to facilitate robust data wrapping. With these techniques, form-based and table-based Web data can be treated like traditional relational databases, making a seamless interoperation environment for Web data and in-house databases possible. Another difficulty impeding ad hoc integration lies in query processing over heterogeneous distributed sources, where data conflicts are common and on-demand mediation of distributed sources is desirable.
The dynamicity and unpredictability of Web data further complicate the query processing task. We studied the limitations the Web environment poses for integration query processing and developed innovative techniques to expedite the early appearance of available results. Finally, we demonstrate a prototype system for ad hoc integration of heterogeneous biological data, in which visual Web-based interfaces guide novice users through the integration of heterogeneous data and a declarative environment supports ad hoc querying and management of distributed data sources.
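The claim above that table-based Web data can be treated like a relational database can be illustrated with a toy wrapper that exposes an HTML table as relational tuples. The sketch below is a hypothetical illustration using only the Python standard library, not the Meteoroid system's actual wrapper technology; the sample table and query are invented.

```python
from html.parser import HTMLParser

class TableWrapper(HTMLParser):
    """Expose an HTML <table> as a list of relational tuples."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._cell = [], None, None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data.strip())

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._row is not None:
            self._row.append(" ".join(c for c in self._cell if c))
            self._cell = None
        elif tag == "tr" and self._row:
            self.rows.append(tuple(self._row))
            self._row = None

page = """<table>
<tr><th>gene</th><th>organism</th></tr>
<tr><td>BRCA1</td><td>human</td></tr>
<tr><td>Tp53</td><td>mouse</td></tr>
</table>"""

w = TableWrapper()
w.feed(page)
header, *tuples = w.rows
# Once wrapped, the page can be queried like a relation:
result = [t for t in tuples if t[1] == "human"]
```

Once the page is wrapped, selections and joins against in-house tables become ordinary relational operations, which is the gist of the relational Web model described above.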
2

The Khmer Sampot: an evolving tradition

Perry, Liz, n/a January 1995 (has links)
The Khmer Sampot: An Evolving Tradition examines the history of the Khmer hip-wrapper, specifically the sampot, and its place within Khmer society. The thesis suggests that the continuation of the tradition of making and wearing the sampot is an indicator of what is important within Khmer society. Evidence of the sampot's early form comes from many sources, including Angkorian sculpture and inscriptions; notes made by the Chinese emissary Chou Ta-Kuan, who lived at Angkor in 1296 AD; traders in the region around the fifteenth century; later European explorers such as Henri Mouhot; early twentieth-century travellers, scholars and French administrators; later twentieth-century anthropologists' notes; Cambodian journals; interviews with Cambodian people; and visits to Cambodia. Using this evidence, the sampot's forms and functions within Khmer society from ancient times to the present day are examined and discussed. The varieties of sampot, the motifs, colours, types of cloth and methods of weaving are considered, as are the sampot's functions, i.e. as everyday dress, as ceremonial dress, and its economic function within Khmer society. The thesis notes that during the twentieth century alone there have been two events which could have caused the demise of traditional sampot weaving. The first was the flood of imported goods into Indochina during the early years of the twentieth century, which resulted in a lack of interest in local goods and a subsequent decline in the production of local goods such as cloth. The other was Pol Pot's rule of Cambodia during 1975-79, when the population wore a black uniform. In the case of the first event, it was the French who realised that encouraging the traditional skills to resurface was essential if these skills were not to be lost.
However, in the case of the second event, it appears to have been the Cambodian people themselves who, after the devastating events of the late 1970s, recommenced their tradition of making and wearing the sampot as a way of expressing their cultural identity.
3

Rozšiřitelný provider pro Windows PowerShell / Extensible Provider for Windows PowerShell

Závišek, Josef January 2011 (has links)
This thesis deals with the design and implementation of an extensible provider for Windows PowerShell. The provider allows registering adapters that provide access to various data stores. The thesis gives an introduction to PowerShell and outlines how new extensions are realized, then elaborates the architecture of the provider in detail. The next part is devoted to the design and implementation of an adapter for compressed files. For this purpose the SevenZip library is used, which had to be adapted for use from the C# language; the thesis therefore also includes a description of the wrapper that allows the library to be used from managed code.
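The extensible-provider pattern the abstract describes, a provider that dispatches to registered adapters for different data stores, is language-independent. The following is a minimal Python sketch of the idea with all class and method names invented for illustration; the thesis itself implements this in C# against the PowerShell provider API.

```python
from abc import ABC, abstractmethod

class StoreAdapter(ABC):
    """Interface an adapter must implement to plug into the provider."""
    @abstractmethod
    def list_items(self, path: str) -> list: ...

class Provider:
    """Dispatches paths like 'zip:docs' to the adapter registered for the scheme."""
    def __init__(self):
        self._adapters = {}

    def register(self, scheme: str, adapter: StoreAdapter):
        self._adapters[scheme] = adapter

    def get_child_items(self, path: str) -> list:
        scheme, _, rest = path.partition(":")
        return self._adapters[scheme].list_items(rest)

class DictAdapter(StoreAdapter):
    """Toy adapter backed by an in-memory mapping (stands in for e.g. a zip archive)."""
    def __init__(self, tree):
        self._tree = tree

    def list_items(self, path):
        return sorted(self._tree.get(path, []))

provider = Provider()
provider.register("zip", DictAdapter({"docs": ["a.txt", "b.txt"]}))
items = provider.get_child_items("zip:docs")
```

The point of the design is that new stores (zip archives, registries, remote services) are added by registering a new adapter, with no change to the provider itself.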
4

A Cadence layout wrapper for MATLAB

Tsirepli, Ismini January 2006 (has links)
In this thesis, the focus is on creating a wrapper between MATLAB and the Cadence Virtuoso design environment. The central idea is to use the wrapper and write the code for an entire analog layout as scripts in MATLAB. Basically, we will implement a set of necessary commands for performing the most fundamental tasks in layout generation from within MATLAB.
6

Extrakce dat z popisu zboží / Data Extraction from Product Descriptions

Sláma, Vojtěch January 2008 (has links)
This work concentrates on the design and implementation of automated support for data extraction from product descriptions, to be used for e-shop purposes. The work introduces current approaches to information extraction from HTML documents, focusing chiefly on wrappers and methods for their induction; the visual approach to information extraction is also mentioned. System requirements and basic principles are described in the design part of the work. Next, a path-tracing algorithm over the document object model is described in detail. The last section of the work evaluates the results of experiments performed with the implemented system.
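Path-based wrapper induction of the kind mentioned above can be illustrated in miniature: record the tag path from the document root to one annotated example node, then extract every node reachable by the same path. The sketch below is a hypothetical illustration over XML using Python's standard library, not the thesis's actual algorithm; the document and tags are invented.

```python
import xml.etree.ElementTree as ET

def path_to(root, target):
    """Return the tag path from the root to an annotated example node."""
    def walk(node, trail):
        if node is target:
            return trail + [node.tag]
        for child in node:
            found = walk(child, trail + [node.tag])
            if found:
                return found
        return None
    return walk(root, [])

def extract(root, path):
    """Generalize the example: collect the text of every node on the same path."""
    nodes = [root] if root.tag == path[0] else []
    for tag in path[1:]:
        nodes = [c for n in nodes for c in n if c.tag == tag]
    return [n.text for n in nodes]

doc = ET.fromstring(
    "<shop><item><name>pen</name><price>2</price></item>"
    "<item><name>ink</name><price>5</price></item></shop>"
)
example = doc[0][1]           # the user annotates one price node
path = path_to(doc, example)  # traced path: shop -> item -> price
prices = extract(doc, path)   # all nodes matching that path
```

A single annotated example thus induces a rule that extracts the corresponding field from every record on the page, which is the essence of path-based wrapper induction.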
7

Софтверски систем за преузимање библиографских записа / Softverski sistem za preuzimanje bibliografskih zapisa / System for retrieval of bibliographic records

Boberić Danijela 02 July 2010 (has links)
A software system for the retrieval of bibliographic records using defined standards has been modeled and implemented. The system is based on a service-oriented architecture and the mediator/wrapper pattern. It is implemented in the Java programming language, and the model is presented in UML 2.0 notation. Within the system, services representing the server sides of the Z39.50 and SRU protocols have been developed, along with a mediator-based software component that integrates these retrieval services with an existing library system. The system was verified by integrating it into the BISIS library software system, version 4.
It is also shown that a query formed in the Z39.50 query language can be transformed into a query defined by the SRU query language, and a transformation of the SRU query language into the Lucene query language is given. Finally, an extension of the SRU standard is proposed so that the standard can also be used for communication between client and server when bibliographic records need to be saved to a remote database.
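The abstract's claim that a Z39.50 query can be transformed into an SRU (CQL) query can be illustrated on a tiny fragment. The sketch below handles only a toy subset of Z39.50 prefix notation (PQF) with Bib-1 use attributes 4 (title) and 1003 (author); the mapping onto the `dc.title`/`dc.creator` CQL indexes is an assumption chosen for illustration and is not the thesis's actual transformation.

```python
# Map toy Bib-1 "use" attributes onto assumed CQL indexes.
USE_ATTR = {"1=4": "dc.title", "1=1003": "dc.creator"}

def pqf_to_cql(tokens):
    """Recursively translate a small PQF token list into a CQL string."""
    tok = tokens.pop(0)
    if tok in ("@and", "@or"):
        left = pqf_to_cql(tokens)
        right = pqf_to_cql(tokens)
        return f"({left} {tok[1:]} {right})"
    if tok == "@attr":
        index = USE_ATTR[tokens.pop(0)]   # e.g. "1=4" -> dc.title
        term = tokens.pop(0)
        return f'{index}="{term}"'
    return f'cql.serverChoice="{tok}"'    # bare term, no attributes

query = "@and @attr 1=4 library @attr 1=1003 Smith".split()
cql = pqf_to_cql(query)
```

A real transformation must of course cover the full Type-1 query structure (relation, truncation, and structure attributes, proximity, and so on); the point here is only that the mapping is mechanical once the attribute correspondence is fixed.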
8

The role of classifiers in feature selection: number vs nature

Chrysostomou, Kyriacos January 2008 (has links)
Wrapper feature selection approaches are widely used to select a small subset of relevant features from a dataset. However, wrappers suffer from the fact that they use only a single classifier when selecting the features. The problem with using a single classifier is that each classifier is of a different nature and has its own biases, so each classifier will select different feature subsets. To address this problem, this thesis investigates the effects of using different classifiers for wrapper feature selection; more specifically, it investigates the effects of using different numbers of classifiers and classifiers of different natures. This aim is achieved by proposing a new data mining method called Wrapper-based Decision Trees (WDT). The WDT method can combine multiple classifiers from four different families, including Bayesian Network, Decision Tree, Nearest Neighbour and Support Vector Machine, to select relevant features and visualise the relationships among the selected features using decision trees. Specifically, the WDT method is applied to investigate three research questions: (1) the effects of the number of classifiers on feature selection results; (2) the effects of the nature of classifiers on feature selection results; and (3) which of the two (i.e., number or nature of classifiers) has more of an effect on feature selection results. Two types of user preference datasets derived from Human-Computer Interaction (HCI) are used with WDT to help answer these three research questions. The investigation revealed that both the number and the nature of classifiers greatly affect feature selection results. In terms of number, few classifiers selected many relevant features whereas many classifiers selected few relevant features; in addition, using three classifiers resulted in highly accurate feature subsets.
In terms of nature, the Decision Tree, Bayesian Network and Nearest Neighbour classifiers caused significant differences in both the number of features selected and the accuracy levels of the features. A comparison of the results revealed that the number of classifiers has more of an effect on feature selection than their nature. The thesis makes contributions to three communities: data mining, feature selection, and HCI. For the data mining community, it proposes the WDT method, which integrates multiple classifiers for feature selection with decision trees to effectively select and visualise the most relevant features within a dataset. For the feature selection community, the results show that the number and nature of classifiers can truly affect the feature selection process, and the suggestions based on these results provide useful insight about classifiers when performing feature selection. For the HCI community, the thesis shows the usefulness of feature selection for identifying a small number of highly relevant features for determining the preferences of different users.
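The wrapper approach itself, a greedy search over feature subsets scored by a classifier, can be sketched without any library. The example below uses forward selection scored by leave-one-out accuracy of a hand-rolled 1-NN classifier on an invented toy dataset; it illustrates a plain single-classifier wrapper, not the multi-classifier WDT method proposed in the thesis.

```python
def knn_loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of 1-NN restricted to the given feature indices."""
    hits = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((X[j][f] - xi[f]) ** 2 for f in feats),
        )
        hits += y[nearest] == y[i]
    return hits / len(X)

def wrapper_select(X, y):
    """Greedy forward selection: add the feature that helps most, stop when none do."""
    selected, score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        f, s = max(
            ((f, knn_loo_accuracy(X, y, selected + [f])) for f in remaining),
            key=lambda fs: fs[1],
        )
        if s <= score:
            break
        selected.append(f)
        remaining.remove(f)
        score = s
    return selected, score

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [1.0, 9.0], [0.9, 2.0]]
y = [0, 0, 1, 1]
selected, score = wrapper_select(X, y)
```

Swapping in a classifier of a different nature (or a vote over several, as WDT does) changes the score function and hence which subset survives, which is exactly the bias the thesis studies.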
9

Utilización de Support Vector Machines No Lineal y Selección de Atributos para Credit Scoring / Use of Non-linear Support Vector Machines and Attribute Selection for Credit Scoring

Maldonado Alarcón, Sebastián Alejandro January 2007 (has links)
No description available.
10

Induction interactive d'extracteurs n-aires pour les documents semi-structurés / Interactive Induction of n-ary Extractors for Semi-structured Documents

Marty, Patrick 04 December 2007 (has links) (PDF)
The thesis defended in this dissertation is that it is possible to design learning algorithms for n-ary extraction programs over semi-structured documents (a non-trivial class of tree transformations) in a supervised manner and with little user intervention. Semi-structured documents have a tree structure, yet few supervised extractor-induction systems take advantage of it; most of them treat documents as a sequence mixing tags and content [51, 42, 40, 78, 65]. More recently, induction algorithms have appeared that fully exploit the tree structure of semi-structured documents [43, 48, 81, 12, 39, 56, 36]. This thesis follows that line of work and argues that exploiting the structure of semi-structured documents makes it possible to induce expressive and high-performing extractors. Induction is carried out with supervised classification algorithms. This choice is motivated both by the success of classification-based extraction approaches and, above all, by the intention to use existing, well-known learning algorithms. Although the attribute-value encoding of training examples takes the tree nature of semi-structured documents into account, it is generic and embeds little background knowledge; any new knowledge is, however, easy to integrate, and our data representation is adaptive. In our approach, n-ary extraction is performed incrementally within a loop of increasing tuple size. This extraction process makes no assumption about how the data are laid out in the documents, and no post-processing is performed: our algorithm extracts the components and combines them into n-tuples at the same time.
Note that an extractor produced by PaF, our system, can be used as-is, as a black box, taking HTML or XML documents as input and producing the set of extracted n-tuples as output. Moreover, PaF is implemented in an interactive framework that allows induction from a small number of interactions. The user provides a few annotations that bootstrap the learning of a hypothesis extractor; an interaction loop then begins in which the user corrects the errors of the current hypothesis and relaunches learning until a correct hypothesis is obtained. PaF learns high-performing n-ary extractors from few examples. Experimental results show that PaF matches the performance of the best n-ary systems, and its extraction process remains applicable and effective even when the organization of the data in the semi-structured documents is complex. The experimental evaluation also shows that PaF's interactive framework reduces the user's annotation effort while preserving the quality of the induced extractors.
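The two-stage idea described above, classifying nodes into components and then combining them into n-tuples without post-processing, can be illustrated in miniature. In the sketch below the "classifier" is a hard-coded stand-in for a learned one, and the document, tags and labels are all invented; this is an illustration of the general technique, not the PaF system.

```python
import xml.etree.ElementTree as ET

def classify(node, parent_tag):
    """Stand-in for a learned node classifier using (parent tag, tag) features."""
    if parent_tag == "book" and node.tag == "t":
        return "title"
    if parent_tag == "book" and node.tag == "a":
        return "author"
    return None

def extract_pairs(root):
    """Unary extraction of components, then combination into 2-tuples per record."""
    tuples = []
    for record in root:                 # each child subtree is one candidate record
        fields = {}
        for child in record:
            label = classify(child, record.tag)
            if label:
                fields[label] = child.text
        if "title" in fields and "author" in fields:
            tuples.append((fields["title"], fields["author"]))
    return tuples

doc = ET.fromstring(
    "<lib><book><t>Dune</t><a>Herbert</a></book>"
    "<book><t>Ubik</t><a>Dick</a></book></lib>"
)
pairs = extract_pairs(doc)
```

Growing the tuple arity inside a loop, as the abstract describes, would repeat the combination step for 3-tuples, 4-tuples and so on, reusing the components already classified.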
