Global ETD Search

271	Generating Data-Extraction Ontologies By Example Zhou, Yuanqiu 22 November 2005 (has links) (PDF) Ontology-based data-extraction is a resilient web data-extraction approach. A major limitation of this approach is that ontology experts must manually develop and maintain data-extraction ontologies. The limitation prevents ordinary users who have little knowledge of conceptual models from making use of this resilient approach. In this thesis we have designed and implemented a general framework, OntoByE, to generate data-extraction ontologies semi-automatically through a small set of examples collected by users. With the assistance of a limited amount of prior knowledge, experimental evidence shows that OntoByE is capable of interacting with users to generate data-extraction ontologies for domains of interest to them. Ontology Web data data extraction Computer Sciences
272	Comparative Microarray Data Mining Mao, Shihong 27 December 2007 (has links) No description available. Computer Science data mining microarray data comparative
273	An interprocedural framework for data redistributions in distributed memory machines Krishnamurthy, Sudha January 1996 (has links) No description available. Interprocedural Data Redistributions Data Parallelism Functional Parallelism
274	Data Mining over Hidden Data Sources Liu, Tantan 24 August 2012 (has links) No description available. Computer Science Hidden data sources Data mining
275	Data mining in real-world traditional Chinese medicine clinical data warehouse Zhou, X., Liu, B., Zhang, X., Xie, Q., Zhang, R., Wang, Y., Peng, Yonghong January 2014 (has links) No / The real-world clinical setting is the major arena of traditional Chinese medicine (TCM), which has accumulated long-term practical clinical experience and developed established theoretical knowledge and clinical solutions suitable for personalized treatment. Clinical phenotypes, which are diverse and dynamically changeable in real-world clinical settings, have been the most important features captured by TCM for diagnosis and treatment. Together with clinical prescriptions that combine multiple herbal ingredients, TCM clinical activities embody immensely valuable, high-dimensional data for knowledge distillation and hypothesis generation. In China, with the curation of large-scale real-world clinical data from regular clinical activities, transforming the data into clinically insightful knowledge has increasingly become a hot topic in the TCM field. This chapter introduces the application of data warehouse techniques and data mining approaches for utilizing real-world TCM clinical data, which come mainly from electronic medical records. The main framework of clinical data mining applications in the TCM field is also introduced, with emphasis on related work in this field. The key points and issues for improving research quality are discussed, and future directions are proposed.
276	Clustering of nonstationary data streams: a survey of fuzzy partitional methods Abdullatif, Amr R.A., Masulli, F., Rovetta, S. 20 January 2020 (has links) Yes / Data streams have arisen as a relevant research topic during the past decade. They are real‐time, incremental in nature, temporally ordered, massive, contain outliers, and the objects in a data stream may evolve over time (concept drift). Clustering is often one of the earliest and most important steps in the streaming data analysis workflow. A comprehensive literature is available about stream data clustering; however, less attention is devoted to the fuzzy clustering approach, even though the nonstationary nature of many data streams makes it especially appealing. This survey discusses relevant data stream clustering algorithms focusing mainly on fuzzy methods, including their treatment of outliers and concept drift and shift. / Ministero dell'Istruzione, dell'Università e della Ricerca. Data streams Fuzzy clustering Nonstationary data Survey
277	Data Sharing and Retrieval of Manufacturing Processes Seth, Avi 28 March 2023 (has links) With the Industrial Internet, businesses can pool their resources to acquire large amounts of data that can then be used in machine learning tasks. Despite the potential to speed up training and deployment and improve decision-making through data sharing, rising privacy concerns are slowing the spread of such technologies. As businesses are naturally protective of their data, this poses a barrier to interoperability. While previous research has focused on privacy-preserving methods, existing works typically consider data that are averaged or randomly sampled by all contributors rather than selecting data that are best suited for a specific downstream learning task. In response to the dearth of efficient data-sharing methods for diverse machine learning tasks in the Industrial Internet, this work presents an end-to-end working demonstration of a search engine prototype built on PriED, a task-driven data-sharing approach that enhances the performance of supervised learning by judiciously fusing shared and local participant data. / Master of Science / My work focuses on PriED, a data-sharing framework that enhances machine learning performance while also preserving user data privacy. In particular, I have built a working demonstration of a search engine that leverages the PriED framework and allows users to collaborate with their data without compromising their data privacy. data sharing privacy collaboration attention data distillation
278	A Framework for Hadoop Based Digital Libraries of Tweets Bock, Matthew 17 July 2017 (has links) The Digital Library Research Laboratory (DLRL) has collected over 1.5 billion tweets for the Integrated Digital Event Archiving and Library (IDEAL) and Global Event Trend Archive Research (GETAR) projects. Researchers across varying disciplines have an interest in leveraging DLRL's collections of tweets for their own analyses. However, due to the steep learning curve involved with the required tools (Spark, Scala, HBase, etc.), simply converting the Twitter data into a workable format can be a cumbersome task in itself. This prompted the effort to build a framework that will help in developing code to analyze the Twitter data, run on arbitrary tweet collections, and enable developers to leverage projects designed with this general use in mind. The intent of this thesis work is to create an extensible framework of tools and data structures to represent Twitter data at a higher level and eliminate the need to work with raw text, so as to make the development of new analytics tools faster, easier, and more efficient. To represent this data, several data structures were designed to operate on top of the Hadoop and Spark libraries of tools. The first set of data structures is an abstract representation of a tweet at a basic level, as well as several concrete implementations which represent varying levels of detail to correspond with common sources of tweet data. The second major data structure is a collection structure designed to represent collections of tweet data structures and provide ways to filter, clean, and process the collections. All of these data structures went through an iterative design process based on the needs of the developers. The effectiveness of this effort was demonstrated in four distinct case studies. 
In the first case study, the framework was used to build a new tool that selects Twitter data from DLRL's archive of tweets, cleans those tweets, and performs sentiment analysis within the topics of a collection's topic model. The second case study applies the provided tools for the purpose of sociolinguistic studies. The third case study explores large datasets to accumulate all possible analyses on the datasets. The fourth case study builds metadata by expanding the shortened URLs contained in the tweets and storing them as metadata about the collections. The framework proved to be useful and cut development time for all four of the case studies. / Master of Science / The Digital Library Research Laboratory (DLRL) has collected over 1.5 billion tweets for the Integrated Digital Event Archiving and Library (IDEAL) and Global Event Trend Archive Research (GETAR) projects. Researchers across varying disciplines have an interest in leveraging DLRL’s collections of tweets for their own analyses. However, due to the steep learning curve involved with the required tools, simply converting the Twitter data into a workable format can be a cumbersome task in itself. This prompted the effort to build a programming framework that will help in developing code to analyze the Twitter data, run on arbitrary tweet collections, and enable developers to leverage projects designed with this general use in mind. The intent of this thesis work is to create an extensible framework of tools and data structures to represent Twitter data at a higher level and eliminate the need to work with raw text, so as to make the development of new analytics tools faster, easier, and more efficient. The effectiveness of this effort was demonstrated in four distinct case studies. 
In the first case study, the framework was used to build a new tool that selects Twitter data from DLRL’s archive of tweets, cleans those tweets, and performs sentiment analysis within the topics of a collection’s topic model. The second case study applies the provided tools for the purpose of sociolinguistic studies. The third case study explores large datasets to accumulate all possible analyses on the datasets. The fourth case study builds metadata by expanding the shortened URLs contained in the tweets and storing them as metadata about the collections. The framework proved to be useful and cut development time for all four of the case studies. big data digital libraries data structures
279	Big data, data mining, and machine learning: value creation for business leaders and practitioners Dean, J. January 2014 (has links) No / Big data is big business. But having the data and the computational power to process it isn't nearly enough to produce meaningful results. Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners is a complete resource for technology and marketing executives looking to cut through the hype and produce real results that hit the bottom line. Providing an engaging, thorough overview of the current state of big data analytics and the growing trend toward high performance computing architectures, the book is a detail-driven look into how big data analytics can be leveraged to foster positive change and drive efficiency. With continued exponential growth in data and ever more competitive markets, businesses must adapt quickly to gain every competitive advantage available. Big data analytics can serve as the linchpin for initiatives that drive business, but only if the underlying technology and analysis is fully understood and appreciated by engaged stakeholders. Big data Analytics Data mining Machine learning
280	What does Big Data has in-store for organisations: An Executive Management Perspective Hussain, Zahid I., Asad, M., Alketbi, R. January 2017 (has links) No / With a cornucopia of literature on Big Data and Data Analytics, the term has recently become a buzzword. The literature is full of hymns of praise for big data and its potential applications. However, some of the latest published material exposes the challenges involved in implementing a Big Data (BD) approach, where the uncertainty surrounding its applications is rendering it ineffective. The paper looks at the mind-sets and perspectives of executives and their plans for using Big Data for decision making. Our data collection involved interviewing senior executives from a number of world-class organisations in order to determine their understanding of big data, its limitations and applications. The information gathered is used to analyse how well executives understand big data and how ready organisations are to use it effectively for decision making. The aim is to provide a realistic outlook on the usefulness of this technology and help organisations to make suitable and realistic decisions on investing in it. Professionals and academics are becoming increasingly interested in the field of big data (BD) and data analytics. Companies invest heavily in acquiring data and analysing it. More recently the focus has switched towards data available through the internet, which appears to have brought about new data collection opportunities. As the smartphone market developed further, data sources extended to include those from mobile and sensor networks. Consequently, organisations started using the data and analysing it. Thus, the field of business intelligence emerged, which deals with gathering data and analysing it to gain insights and use them to make decisions (Chen, et al., 2012). BD is seen to have immense potential to provide powerful information to businesses. 
Accenture (2015) claims that organisations are extremely satisfied with their BD projects concerned with enhancing their customer reach. Davenport (2006) has presented applications in which companies are using the power of data analytics to consistently predict behaviours and develop applications that enable them to unearth important yet difficult-to-see customer preferences, and evolve rapidly to generate revenues. Big data Data analytics Decision making
