21 |
Identifying Security Requirements using Meta-Data and Dependency Heuristics. Mahakala, Kavya Reddy, January 2018
No description available.
|
22 |
Integration of Heterogeneous Databases: Discovery of Meta-Information and Maintenance of Schema-Restructuring Views. Koeller, Andreas, 15 April 2002
In today's networked world, information is widely distributed across many independent databases in heterogeneous formats. Integrating such information is a difficult task and has been addressed by several projects. However, previous integration solutions, such as the EVE-Project, have several shortcomings. Database contents and structure change frequently, and users often have incomplete information about the data content and structure of the databases they use. When information from several such insufficiently described sources is to be extracted and integrated, two problems have to be solved: How can we discover the structure and contents of, and the interrelationships among, unknown databases, and how can we provide durable integration views over several such databases? In this dissertation, we have developed solutions for those key problems in information integration. The first part of the dissertation addresses the fact that knowledge about the interrelationships between databases is essential for any attempt at solving the information integration problem. We present an algorithm called FIND2, based on the clique-finding problem in graphs and k-uniform hypergraphs, to discover redundancy relationships between two relations. Furthermore, the algorithm is enhanced by heuristics that significantly reduce the search space when necessary. Extensive experimental studies of the algorithm, both with and without heuristics, illustrate its effectiveness on a variety of real-world data sets. The second part of the dissertation addresses the durable view problem and presents the first algorithm for incremental view maintenance in schema-restructuring views. Such views are essential for the integration of heterogeneous databases. They are typically defined in schema-restructuring query languages like SchemaSQL, which can transform schema into data and vice versa, making traditional view maintenance based on differential queries impossible. Based on an existing algebra for SchemaSQL, we present an update propagation algorithm that propagates updates along the query algebra tree and prove its correctness. We also propose optimizations on our algorithm and present experimental results showing its benefits over view recomputation.
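As a rough illustration of the clique-based idea behind FIND2 (a simplified sketch, not the dissertation's implementation, with invented table and column names), one can treat valid unary inclusion dependencies between two relations as graph nodes, valid binary ones as edges, and read candidate higher-arity dependencies off the maximal cliques:

```python
from itertools import combinations
import networkx as nx  # used only for maximal-clique enumeration

def holds(r, s, r_cols, s_cols):
    """True if the projection of r onto r_cols is contained in the projection of s onto s_cols."""
    left = {tuple(row[c] for c in r_cols) for row in r}
    right = {tuple(row[c] for c in s_cols) for row in s}
    return left <= right

def candidate_inds(r, s):
    """Simplified FIND2-style search for inclusion dependencies between relations r and s."""
    r_cols, s_cols = list(r[0]), list(s[0])
    # Nodes: valid unary inclusion dependencies.
    nodes = [(a, b) for a in r_cols for b in s_cols if holds(r, s, [a], [b])]
    g = nx.Graph()
    g.add_nodes_from(nodes)
    # Edges: pairs of unary dependencies that are also valid as a binary dependency.
    for (a1, b1), (a2, b2) in combinations(nodes, 2):
        if a1 != a2 and b1 != b2 and holds(r, s, [a1, a2], [b1, b2]):
            g.add_edge((a1, b1), (a2, b2))
    # Maximal cliques are candidates for higher-arity dependencies; re-validate each one.
    results = []
    for clique in nx.find_cliques(g):
        lhs, rhs = [a for a, _ in clique], [b for _, b in clique]
        if holds(r, s, lhs, rhs):
            results.append((lhs, rhs))
    return results

# Toy relations (invented): R(city, zip) should turn out to be included in S(town, postal).
R = [{"city": "Boston", "zip": "02134"}, {"city": "Worcester", "zip": "01609"}]
S = [{"town": "Boston", "postal": "02134", "country": "US"},
     {"town": "Worcester", "postal": "01609", "country": "US"}]
print(candidate_inds(R, S))
```

The actual FIND2 additionally works on k-uniform hypergraphs and applies heuristics to prune the exponential candidate space; the sketch keeps only the plain graph case.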
|
23 |
Raffinement de la localisation d’images provenant de sites participatifs pour la mise à jour de SIG urbain / Refining participative website’s images localization for urban GIS updates. Semaan, Bernard, 14 December 2018
Cities are active areas: every day, new constructions take place, buildings are demolished, or commercial premises change signs. The managers of a city's geographic information systems aim to keep their digital models of the city as up to date as possible. These models may consist of 2D maps but also of 3D models, which may come from a reconstruction based on images; the images may have been taken from the sky as well as from the ground. Participatory mapping, as enabled by the "OpenStreetMap.org" platform, emerged to make geographic information available to everyone and to let the platform's users keep the 2D maps up to date. In order to improve the update process, and in the same spirit as participatory approaches, we propose using photo-sharing platforms such as "Flickr", "Twitter", etc. The images uploaded to these platforms carry an imprecise localization and no information about the orientation of the photograph. We therefore propose a system that helps find a better localization and recovers orientation information for the photograph. The system uses the visual information of the image as well as its semantic information. To do this, we present an automated processing chain composed of three layers: the data extraction and preprocessing layer, the feature extraction and processing layer, and the decision-making layer. We then present the results of this complete system, which we call "Data Gathering system for image Pose Estimation" (DGPE). In this thesis we also present a method we have called "Segments Based Building Detection" (SBBD) for the detection of simple buildings. We have also tested this method under various shooting conditions (occlusions, weather variations, etc.). We compare this detection method with another state-of-the-art method using several image databases. / Cities are active spots on the globe. They are in constant change. New building constructions, demolitions and business changes may apply on a daily basis. City managers aim to keep their digital model of the city as updated as possible. The model may consist of 2D maps but may also be a 3D reconstruction or a street imagery sequence. In order to share the geographical information and keep 2D maps updated, collaborative cartography was born. The "OpenStreetMap.org" platform is one of the best-known platforms in this field. In order to create an active collaborative database of street imagery we suggest using 2D images available on image sharing platforms like "Flickr", "Twitter", etc. Images downloaded from such platforms feature a rough localization and no orientation information. We propose a system that helps find a better localization of the images and provides information about the camera orientation they were shot with. The system uses both visual and semantic information existing in a single image. To do that, we present a fully automatic processing chain composed of three main layers: the data retrieval and preprocessing layer, the features extraction layer, and the decision-making layer. We then present the whole system's results, combining both semantic and visual information processing results. We call our system Data Gathering system for image Pose Estimation (DGPE).
We also present a new automatic method for the detection of buildings of simple architecture, which we have developed and used in our system. This method is based on segments detected in the image and is called Segments Based Building Detection (SBBD). We test the method under various weather conditions and occlusion problems. Finally, we compare our building detection results with another state-of-the-art method using several image databases.
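The following is a loose sketch of a segment-based building cue in the spirit of SBBD, not the thesis's actual method: line segments from a probabilistic Hough transform are split into near-horizontal and near-vertical groups, whose co-occurrence can then serve as a facade indicator. The thresholds and the file path are placeholders.

```python
import math
import cv2
import numpy as np

def facade_segments(image_path, angle_tol_deg=15):
    """Detect line segments and group them by orientation as a crude building-facade cue."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise FileNotFoundError(image_path)
    edges = cv2.Canny(img, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=60,
                            minLineLength=40, maxLineGap=5)
    horizontal, vertical = [], []
    for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
        angle = abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))  # 0..180 degrees
        if angle < angle_tol_deg or angle > 180 - angle_tol_deg:
            horizontal.append((x1, y1, x2, y2))
        elif abs(angle - 90) < angle_tol_deg:
            vertical.append((x1, y1, x2, y2))
    return horizontal, vertical

# Usage (the image path is a placeholder):
# h, v = facade_segments("street_photo.jpg")
# print(f"{len(h)} horizontal and {len(v)} vertical segments")
```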
|
24 |
Increasing Design Productivity for FPGAs Through IP Reuse and Meta-Data Encapsulation. Arnesen, Adam T., 17 March 2011
As Moore's law continues to progress, it is becoming increasingly difficult for hardware designers to fully utilize the increasing number of transistors available in semiconductor devices, including FPGAs. This design productivity gap must be addressed to allow designs to take full advantage of the increased logic density that results from rising transistor density. The reuse of previously developed and verified intellectual property (IP) is one approach that has been claimed to narrow the design productivity gap. Reuse, however, has proved difficult to realize in practice because of the complexity of IP and the reluctance of designers to reuse IP that they do not understand. This thesis proposes to narrow the design productivity gap for FPGAs by simplifying the reuse problem through encapsulating IP with extra machine-readable information, or meta-data. This meta-data simplifies reuse by providing a language-independent format for composing complex systems, providing a parameter representation system, defining high-level data types for FPGA IP, and allowing arbitrary IP to be described as actors in the homogeneous synchronous dataflow model of computation. This work implements meta-data in XML and presents two XML schemas that enable reuse. A new XML schema known as CHREC XML is presented, as well as extensions that enable IP-XACT to be used to describe FPGA dataflow IP. Two tools developed in this work are also presented that leverage meta-data to simplify reuse of arbitrary IP. These tools simplify structural composition of IP, allow designers to manipulate parameters, check and validate high-level data types, and automatically synthesize control circuitry for dataflow designs. Productivity improvements are also demonstrated by reusing IP to quickly compose software radio receivers.
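To make the idea of meta-data encapsulation concrete, here is a small sketch that wraps an IP block's parameters and ports in XML using Python's standard library. The element and attribute names are invented for illustration; they follow neither the actual CHREC XML schema nor IP-XACT.

```python
import xml.etree.ElementTree as ET

def describe_ip(name, parameters, ports):
    """Build a toy machine-readable meta-data record for a reusable IP block."""
    ip = ET.Element("ipBlock", name=name)
    params = ET.SubElement(ip, "parameters")
    for pname, default in parameters.items():
        ET.SubElement(params, "parameter", name=pname, default=str(default))
    port_list = ET.SubElement(ip, "ports")
    for pname, (direction, dtype) in ports.items():
        ET.SubElement(port_list, "port", name=pname, direction=direction, type=dtype)
    return ET.tostring(ip, encoding="unicode")

# A hypothetical FIR filter core described as a dataflow actor with typed ports.
print(describe_ip(
    "fir_filter",
    parameters={"TAPS": 16, "DATA_WIDTH": 18},
    ports={"din": ("in", "sfix18"), "dout": ("out", "sfix18")},
))
```

Tooling can compose such records structurally and check port types before any HDL is touched, which is the productivity argument the thesis makes.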
|
25 |
Design and implementation of a workflow for quality improvement of the metadata of scientific publications. Wolff, Stefan, 07 November 2023
In this paper, a detailed workflow for analyzing and improving the quality of metadata of scientific publications is presented and tested.
The workflow was developed based on approaches from the literature. Frequently occurring types of errors from the literature were compiled and mapped to the data-quality dimensions most relevant for publication data – completeness, correctness, and consistency – and made measurable. Based on the identified data errors, a process for improving data quality was developed. This process includes parsing hidden data, correcting incorrectly formatted attribute values, enriching with external data, carrying out deduplication, and filtering erroneous records.
The effectiveness of the workflow was confirmed in an exemplary application to publication data from Open Researcher and Contributor ID (ORCID), with 56% of the identified data errors corrected. The workflow will be applied to publication data from other source systems in the future to further increase its performance.
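As a hedged illustration of two of the steps named above (correcting incorrectly formatted attribute values and deduplication), the following sketch normalizes DOIs and merges records that share one, keeping the most complete record; the field names and sample records are invented, not taken from the ORCID data used in the paper.

```python
import re
from collections import defaultdict

def normalize_doi(doi):
    """Strip URL prefixes and lowercase so that syntactic variants of a DOI compare equal."""
    if not doi:
        return None
    return re.sub(r"^https?://(dx\.)?doi\.org/", "", doi.strip()).lower() or None

def deduplicate(records):
    """Group records by normalized DOI (or title) and keep the most complete record per group."""
    groups = defaultdict(list)
    for rec in records:
        key = normalize_doi(rec.get("doi")) or ("title", rec.get("title", "").casefold())
        groups[key].append(rec)
    completeness = lambda r: sum(1 for v in r.values() if v)
    return [max(group, key=completeness) for group in groups.values()]

records = [
    {"doi": "https://doi.org/10.1000/XYZ123", "title": "A Study", "year": None},
    {"doi": "10.1000/xyz123", "title": "A Study", "year": 2021},
    {"doi": None, "title": "Another Paper", "year": 2020},
]
for rec in deduplicate(records):
    print(rec)
```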
|
26 |
中醫醫藥典籍中之Metadata的初探─以「本草備要」、「醫方集解」為例 / A Preliminary Study on Metadata in Chinese Medicines Literatures – on Examples of “Ben Cao Bei Yao” and “Yi Fang Ji Jie”. 吳俊德, Unknown Date
The direction of this study is to explore the metadata needed to build a large-scale Traditional Chinese Medicine (TCM) data warehouse, and, through this preliminary study, to understand and describe the analysis methods required. The study regards the Zachman Framework as a suitable data-warehouse development method and therefore derives the concepts needed in this professional domain from the 5W1H perspectives; these concepts can then be passed through an analytic procedure to establish the metadata.
Owing to time constraints, the study uses “Ben Cao Bei Yao” and “Yi Fang Ji Jie” as example documents for the analysis, in order to reduce the problems caused by the many competing schools of Chinese medicine and by the inherent imprecision of the Chinese language; at the same time, the study strives to keep the overall architecture applicable to other TCM literatures.
To achieve this goal, the study first reviews the development to date of Chinese herbal medicine databases, data warehouses, and electronic hypertext documents. It then decides to treat each individual TCM classic as a data mart and to place the classification trees and descriptive Metadata under the concept of a catalog; this approach facilitates the integration of other classics and their metadata into a large data warehouse.
The study first derives fundamental Chinese herbal medicine concepts and terms from important TCM classics, and then confirms the Metadata in the example classics through statistical analysis of both the preservation and the application perspectives. For the implementation, the study describes and defines the Metadata in BNF and builds a prototype with XML tools for testing. In the process, the study finds that the unit of analysis of metadata extracted from a data-warehouse viewpoint is smaller than that obtained in traditional library preservation. Moreover, most of the Metadata involved in the extraction process is functional; the study also applies some linguistic analysis so that the textual structure of the classics is preserved at the same time. / The objective of this research work is to acquire and design Metadata for the construction of a data warehouse of Traditional Chinese Medicine (TCM) literatures in the context of knowledge management. In order to solve the problem of preservation and utilization of TCM literatures, this work aims to designate the Metadata from the viewpoint of knowledge engineering and data warehousing.
In this work, the characteristics of TCM with respect to Metadata lead to the 5W1H principle; this work argues for its advantages in deriving more functional descriptions while keeping the syntactic structure of the originals at the same time. To cope with the constraints of time, this work chooses “Ben Cao Bei Yao” and “Yi Fang Ji Jie” as the target of analysis.
In constructing a prototype, the tacit knowledge in the example TCM literatures is converted explicitly, through an analytic process, into organizational knowledge that can be easily preserved and processed by machines. A statistical process is employed to derive and verify the Metadata in the context of the example TCM literatures, and the components of the Metadata are then implemented with XML tools to build the prototype.
Last but not least, this work presents its findings as follows:
1. The unit of analysis for deriving Metadata for a data warehouse is usually of finer granularity than what is addressed in traditional library management.
2. Although the Metadata derived in this work with a data warehouse approach presents more functional elements, the linguistic structure of the example literatures can still be maintained with some careful linguistic analyses in the last step.
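A toy version of the statistical confirmation step described above, assuming a set of candidate metadata element names and a handful of passages; the terms and passages are invented and are not drawn from “Ben Cao Bei Yao” or “Yi Fang Ji Jie”:

```python
from collections import Counter

def confirm_candidates(passages, candidates, min_docs=2):
    """Keep candidate metadata terms that occur in at least min_docs passages."""
    doc_freq = Counter()
    for text in passages:
        for term in candidates:
            if term in text:
                doc_freq[term] += 1
    return {term: n for term, n in doc_freq.items() if n >= min_docs}

# Invented mini-corpus and candidate element names.
passages = [
    "herb A: property warm, flavor sweet, indication cough",
    "herb B: property cold, flavor bitter, indication fever",
    "formula C: indication cough, composed of herb A and herb B",
]
print(confirm_candidates(passages, ["property", "flavor", "indication", "dosage"]))
```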
|
27 |
A framework for semantic web implementation based on context-oriented controlled automatic annotation. Hatem, Muna Salman, January 2009
The Semantic Web is the vision of the future Web. Its aim is to enable machines to process Web documents in a way that makes it possible for the computer software to "understand" the meaning of the document contents. Each document on the Semantic Web is to be enriched with meta-data that express the semantics of its contents. Many infrastructures, technologies and standards have been developed and have proven their theoretical use for the Semantic Web, yet very few applications have been created. Most of the current Semantic Web applications were developed for research purposes. This project investigates the major factors restricting the widespread adoption of Semantic Web applications. We identify the two most important requirements for a successful implementation as the automatic production of the semantically annotated document, and the creation and maintenance of a semantics-based knowledge base.
This research proposes a framework for Semantic Web implementation based on context-oriented controlled automatic Annotation; for short, we called the framework the Semantic Web Implementation Framework (SWIF) and the system that implements this framework the Semantic Web Implementation System (SWIS). The proposed architecture provides for a Semantic Web implementation of stand-alone websites that automatically annotates Web pages before they are uploaded to the Intranet or Internet, and maintains persistent storage of Resource Description Framework (RDF) data for both the domain memory, denoted by Control Knowledge, and the meta-data of the Web site's pages. We believe that the presented implementation of the major parts of SWIS introduces a system competitive with current state-of-the-art Annotation tools and knowledge management systems; this is because it handles input documents in the
context in which they are created, in addition to the automatic learning and verification of knowledge using only the available computerized corporate databases. In this work, we introduce the concept of Control Knowledge (CK) that represents the application's domain memory and use it to verify the extracted knowledge. Learning is based on the number of occurrences of the same piece of information in different documents. We introduce the concept of Verifiability in the context of Annotation by comparing the extracted text's meaning with the information in the CK and the use of the proposed database table Verifiability_Tab. We use the linguistic concept Thematic Role in investigating and identifying the correct meaning of words in text documents; this helps correct relation extraction. The verb lexicon used contains the argument structure of each verb together with the thematic structure of the arguments. We also introduce a new method to chunk conjoined statements and identify the missing subject of the produced clauses. We use the semantic class of verbs that relates a list of verbs to a single property in the ontology, which helps in disambiguating the verb in the input text to enable better information extraction and Annotation. Consequently, we propose the following definition for the annotated document, or what is sometimes called the "Intelligent Document": "The Intelligent Document is the document that clearly expresses its syntax and semantics for human use and software automation".
This work introduces a promising improvement to the quality of the automatically generated annotated document and the quality of the automatically extracted information in the knowledge base. Our approach in the area of using Semantic Web
technology opens new opportunities for diverse areas of applications. E-Learning applications can be greatly improved and become more effective.
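The occurrence-based verification described above can be sketched with rdflib: extracted facts become RDF triples, and a triple is promoted into the Control Knowledge graph only once enough distinct documents support it. The namespace, facts, and threshold below are invented for illustration and do not reproduce the SWIS implementation.

```python
from collections import Counter
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/ck/")  # invented namespace

def promote_to_control_knowledge(extractions, min_support=2):
    """extractions: (doc_id, subject, predicate, object) tuples produced by annotation.
    A fact is promoted once it has been seen in at least min_support distinct documents."""
    distinct = {(doc_id, s, p, o) for doc_id, s, p, o in extractions}
    support = Counter((s, p, o) for _, s, p, o in distinct)
    ck = Graph()
    for (s, p, o), n in support.items():
        if n >= min_support:
            ck.add((EX[s], EX[p], Literal(o)))
    return ck

extractions = [
    ("doc1", "ACME", "hasHeadquarters", "London"),
    ("doc2", "ACME", "hasHeadquarters", "London"),
    ("doc3", "ACME", "hasCEO", "J. Smith"),  # only one supporting document, not promoted
]
print(promote_to_control_knowledge(extractions).serialize(format="turtle"))
```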
|
28 |
A framework for semantic web implementation based on context-oriented controlled automatic annotation. Hatem, Muna Salman, January 2009
The Semantic Web is the vision of the future Web. Its aim is to enable machines to process Web documents in a way that makes it possible for the computer software to "understand" the meaning of the document contents. Each document on the Semantic Web is to be enriched with meta-data that express the semantics of its contents. Many infrastructures, technologies and standards have been developed and have proven their theoretical use for the Semantic Web, yet very few applications have been created. Most of the current Semantic Web applications were developed for research purposes. This project investigates the major factors restricting the widespread adoption of Semantic Web applications. We identify the two most important requirements for a successful implementation as the automatic production of the semantically annotated document, and the creation and maintenance of a semantics-based knowledge base. This research proposes a framework for Semantic Web implementation based on context-oriented controlled automatic Annotation; for short, we called the framework the Semantic Web Implementation Framework (SWIF) and the system that implements this framework the Semantic Web Implementation System (SWIS). The proposed architecture provides for a Semantic Web implementation of stand-alone websites that automatically annotates Web pages before they are uploaded to the Intranet or Internet, and maintains persistent storage of Resource Description Framework (RDF) data for both the domain memory, denoted by Control Knowledge, and the meta-data of the Web site's pages. We believe that the presented implementation of the major parts of SWIS introduces a system competitive with current state-of-the-art Annotation tools and knowledge management systems; this is because it handles input documents in the context in which they are created, in addition to the automatic learning and verification of knowledge using only the available computerized corporate databases. In this work, we introduce the concept of Control Knowledge (CK) that represents the application's domain memory and use it to verify the extracted knowledge. Learning is based on the number of occurrences of the same piece of information in different documents. We introduce the concept of Verifiability in the context of Annotation by comparing the extracted text's meaning with the information in the CK and the use of the proposed database table Verifiability_Tab. We use the linguistic concept Thematic Role in investigating and identifying the correct meaning of words in text documents; this helps correct relation extraction. The verb lexicon used contains the argument structure of each verb together with the thematic structure of the arguments. We also introduce a new method to chunk conjoined statements and identify the missing subject of the produced clauses. We use the semantic class of verbs that relates a list of verbs to a single property in the ontology, which helps in disambiguating the verb in the input text to enable better information extraction and Annotation. Consequently, we propose the following definition for the annotated document, or what is sometimes called the 'Intelligent Document': 'The Intelligent Document is the document that clearly expresses its syntax and semantics for human use and software automation'. This work introduces a promising improvement to the quality of the automatically generated annotated document and the quality of the automatically extracted information in the knowledge base.
Our approach in the area of using Semantic Web technology opens new opportunities for diverse areas of applications. E-Learning applications can be greatly improved and become more effective.
|
29 |
Proveniensprincipen : Vara eller icke vara - det är frågan i en digitaliserad informationsförvaltning. Nilsson, Marita, January 2018
This research work raises the issues that have been debated around the principle of provenance and the transformation the principle has undergone since the arrival of digitization. The intention of the study was to show what meaning the principle carries today in modern information management and in the information handling that takes place there. The purpose was also to examine how information management organizations work proactively to guarantee provenance in all of their information handling, and to illuminate how provenance is understood in relation to the choice of information-handling methods. The investigation was qualitative and was carried out at ten municipal archives, where each municipality's archivist was interviewed in depth. Plans for information handling were also studied. The study identifies the simplifications that digitization has brought to securing provenance, where automated and well-developed metadata creates genuine provenance that can demonstrate the information's connection to the process and context in which it has existed. The essay also discusses the difficulties that arise when digitized information is organized in entirely different ways than before, and the consequences this has for how we should relate to and understand provenance. The results show that the information management organizations can vouch for external provenance as far as the archival material is concerned, but not for the whole body of records. The study further establishes that internal provenance, as a reflection of the organization's activities, must be understood in terms of the whole body of records and its logical order, rather than the visible order of the archival material. The investigation also notes the importance of proactivity in clarifying the processual context of the information, as well as of early application of metadata and of system development that retains metadata through all processes. Finally, the essay emphasizes that this is not done to the extent that is necessary. / This essay describes the debate about the principle of provenance and its multiple forms, and the transformations of these forms due to the coming of electronic information. The thesis intended to explain the definitions of the principle in a modern information management organization and to explore how such organizations operate proactively to assure provenance. The qualitative investigation was carried out at ten municipal final archives, where each municipality archivist was interviewed. The study expounds in what way digitization has simplified the methods used to ensure provenance, where automated metadata shows the relationships of the information to function and process. The essay also debates the difficulties that appear when digital information is organized in different ways than analogue information, and how this fact requires a new interpretation of the principle of provenance. The researcher concludes that the investigated archives ensure respect des fonds when it concerns the content of the archives, but not when it comes to the whole content of the information management. The result of the study also shows that respect for the original order, as a reflection of the organization, has to be understood throughout the whole content of the management and its logical order, rather than the visible content that the archives embrace. Furthermore, the thesis observes the importance of proactivity regarding the clarification of the relationships between the information and the processes that produce and use it. This could be achieved with early application of metadata and development of systems that keep metadata through all processes.
The conclusion of the essay is that this is not pursued to the extent that is required.
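A minimal sketch of what "systems that keep metadata through all processes" can look like in practice: every record carries an event log that each processing step appends to, so the processual context travels with the information. The record content, agents, and process names are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Record:
    content: str
    events: list = field(default_factory=list)  # provenance event chain

    def log_event(self, action, agent, process):
        """Append an event describing which business process touched the record, by whom, and when."""
        self.events.append({
            "action": action,
            "agent": agent,
            "process": process,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })

# The record keeps its processual context as it moves from registration to final archiving.
permit = Record(content="Building permit 2018:117")
permit.log_event("created", "case officer", "building-permit handling")
permit.log_event("approved", "committee", "building-permit handling")
permit.log_event("archived", "e-archive", "final archiving")
for e in permit.events:
    print(e["timestamp"], e["process"], e["action"], "by", e["agent"])
```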
|
30 |
Utbildningsmaterial ur mjukvarudokumentation / Educational material from software documentation. Högman Ordning, Herman, January 2019
End-user training for IT systems in the workplace is an expensive and time-consuming affair. Although a great deal of information about a system is produced during its development, in the form of requirements documents and other documentation, that information is rarely used for training purposes. The company Multisoft wished to investigate how the documentation it produces during the development of tailor-made business systems for various companies, called Softadmin® systems, can be used for training purposes. The purpose of this thesis was to identify the learning needs of end users of business systems developed by Multisoft. Based on these learning needs, it was investigated how the documentation produced during development could be used for training purposes. A qualitative study with narrative semi-structured interviews was carried out with end users and project participants at six companies that had implemented a Softadmin® system within the last two years. The project participants had been involved in the development on the customer company's side; the end users had not. Ten interviews were conducted and a thematic analysis was performed on the interview transcripts. The resulting themes were interpreted from a cognitivist view of learning. The results of the interview analysis indicate that end users want to be able to learn by trying things out in the system, and that they want to receive information visually, not only through text. Training material for a Softadmin® system should not require prior knowledge of the system in order to be accessible. Furthermore, the results indicate that end users experience the systems as having complex business logic, where it is difficult to understand how the system's different processes affect one another. The transition from an old system to the new one can create learning difficulties for end users, and a lack of structure while learning the system was identified as a problem. A proposed structure for a training material has been produced. This training material is intended to use information from the documentation produced during the development of Softadmin® systems. At present, this use of the documentation would have to be done manually with some adaptation; suggestions for how it can be automated are presented. Functional requirements are presented for a system that produces and maintains the information required for the proposed training material. When the Softadmin® system that the training material concerns is updated, this system makes it possible to update the training material as well, and it also makes it possible to produce training material adapted to a specific professional role. / End-user training of IT systems at the workplace is an expensive and time-consuming ordeal. Despite a lot of information about the systems being produced during development, in the form of requirement documents and other documentation, the information is seldom used for educational purposes. Multisoft wished to explore how the documentation produced during the development of their tailor-made business systems, named Softadmin® systems, can be used for educational purposes. The purpose of this master's thesis was to identify what learning needs end users have in regard to business systems developed by the company Multisoft. Based on these learning needs, it was investigated how the documentation produced during development could be used for educational purposes.
A qualitative study with narrative semi-structured interviews was conducted with end users and project participants at six different companies that had implemented a Softadmin® system at their workplace within the last two years. The project participants had been involved from the customer company's side during the development, whereas the end users had not been involved. Ten interviews were conducted and a thematic analysis was performed on the interview transcripts. The resulting themes were then interpreted from a cognitive perspective on learning. The results indicated that end users want to be able to learn by trying to use the system themselves. End users want to learn by getting information visually and not only via text. A training material for a Softadmin® system should not require prior knowledge about the system to be accessible to the learner. Furthermore, the results indicate that end users feel the systems have a complex business logic where it is difficult to understand how the different processes in the system affect each other. The transition from an old system to a new system can be problematic for the end users' learning. A lack of structure in the end users' learning of the system was identified as an issue.
|